What is Kubernetes? “Kubernetes is a portable, extensible open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation.” (https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/, 2018)
Collecting K8s Metrics ➔ Service Discovery ➔ 'prometheus.io/scrape'
Is your cluster working? ➔ Pod startup time: 99% of pods and their containers (with pre-pulled images) start within 5s. ➔ API-responsiveness: 99% of all API calls return in less than 1s
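The two targets above are percentile objectives, so they can be checked mechanically once latency samples are available. The following is a minimal sketch, assuming a nearest-rank percentile and hand-written sample values; the function names `percentile` and `meets_slo` and all numbers are hypothetical, not part of the talk.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def meets_slo(samples, pct, threshold_seconds):
    """True when the pct-th percentile of the samples is within the threshold."""
    return percentile(samples, pct) <= threshold_seconds


# Hypothetical measurements, in seconds.
pod_startup_seconds = [1.2, 0.8, 2.5, 3.1, 4.9, 1.7, 2.2, 0.9, 1.1, 3.8]
api_latency_seconds = [0.05, 0.12, 0.30, 0.08, 1.40, 0.22, 0.09, 0.11, 0.07, 0.15]

print("pod startup within 5s at p99:", meets_slo(pod_startup_seconds, 99, 5.0))
print("API calls within 1s at p99:", meets_slo(api_latency_seconds, 99, 1.0))
```

With only ten samples the 99th percentile is simply the slowest sample, so a single slow API call is enough to miss the target; a real check would use far more data.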
Kubernetes Metadata
➔ Annotations: machine-readable metadata consumed by tooling and system extensions
"annotations": { "kubernetes.io/key/1": "value1", "kubernetes.io/key/2": "value2" }
➔ Labels: human-readable metadata to facilitate the organization and management of API resources
"labels": { "key1": "value1", "key2": "value2" }
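To make the split concrete, here is a small sketch of how each kind of metadata tends to be used: labels for selecting groups of objects, annotations for hints that tooling reads. It works on plain dictionaries shaped like Kubernetes objects rather than on a live cluster. The pod names and values are hypothetical, and `prometheus.io/scrape` is treated as a convention from Prometheus example configurations, not a built-in Kubernetes field.

```python
def matches_selector(labels, selector):
    """Equality-based selector: every selector key must be present with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def scrape_enabled(annotations):
    """Assumed convention: tooling opts a pod in when this annotation is the string 'true'."""
    return annotations.get("prometheus.io/scrape") == "true"


# Hypothetical pods, reduced to the metadata fields used here.
pods = [
    {"metadata": {"name": "web-1",
                  "labels": {"app": "web", "env": "prod"},
                  "annotations": {"prometheus.io/scrape": "true"}}},
    {"metadata": {"name": "web-2",
                  "labels": {"app": "web", "env": "dev"},
                  "annotations": {}}},
    {"metadata": {"name": "db-1",
                  "labels": {"app": "db", "env": "prod"},
                  "annotations": {"prometheus.io/scrape": "true"}}},
]

selected = [p["metadata"]["name"] for p in pods
            if matches_selector(p["metadata"]["labels"], {"app": "web"})]
scraped = [p["metadata"]["name"] for p in pods
           if scrape_enabled(p["metadata"]["annotations"])]

print(selected)  # ['web-1', 'web-2']
print(scraped)   # ['web-1', 'db-1']
```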
Lots of Metadata! Any large organization will end up with inordinate amounts of metadata from its Kubernetes clusters… Problems?
Implicit Tags $host.cpu.system
Implicit Tags get messy $region.$zone.$network.$app.$host.cpu.system
Implicit Tags get messy and differ across orgs
$region.$zone.$app.$host.cpu.system
$region.$zone.$app.$host.cpu-seconds.system
$region.$zone.$app.$host.cpu.system.seconds
$region.$zone.$network.$env.$app.$host.cpu.system
$regionID.$region.$zone.$network.$app.$host.cpu.system
Kubernetes Tags Explicitly, So Should You
container_cpu_system_seconds_total{$region.$zone.$app.$host}
container_cpu_system_seconds_total{$region.$zone.$app.$host}
container_cpu_system_seconds_total{$region.$zone.$app.$host}
container_cpu_system_seconds_total{$region.$zone.$network.$env.$app.$host}
container_cpu_system_seconds_total{$regionID.$region.$zone.$network.$app.$host}
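The difference between the two styles can be sketched in a few lines. An implicit, dotted path can only be decoded by someone who already knows the per-organization field order, while an explicit form carries its own key names. The helper names `parse_implicit` and `format_explicit`, the schema list, and the sample values are all hypothetical.

```python
def parse_implicit(path, schema):
    """Split a dotted metric path into (metric_name, labels) using an org-specific field order."""
    parts = path.split(".")
    if len(parts) <= len(schema):
        raise ValueError("path has no metric name after the tag fields")
    labels = dict(zip(schema, parts[:len(schema)]))
    metric = ".".join(parts[len(schema):])
    return metric, labels


def format_explicit(name, labels):
    """Render a metric name with explicit key="value" labels, sorted for stable output."""
    body = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{name}{{{body}}}"


# Hypothetical schema: this field order is exactly the knowledge the dotted path hides.
schema = ["region", "zone", "app", "host"]
metric, labels = parse_implicit("us-east.a.web.host1.cpu.system", schema)

print(metric)   # cpu.system
print(labels)   # {'region': 'us-east', 'zone': 'a', 'app': 'web', 'host': 'host1'}
print(format_explicit("container_cpu_system_seconds_total", labels))
# container_cpu_system_seconds_total{app="web",host="host1",region="us-east",zone="a"}
```

Note that the dotted form also breaks as soon as a value itself contains a dot (a fully qualified hostname, for example), which the explicit form tolerates.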
The Curse of Dimensionality O(2^d)
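One way to read the O(2^d) bound: if each of d tags can take at least two values, the number of distinct tag combinations (and therefore potential time series per metric) is at least 2^d, and in general it is the product of the per-tag cardinalities. The sketch below assumes that reading; the tag names and counts are hypothetical.

```python
from math import prod


def max_series(label_cardinalities):
    """Upper bound on distinct series for one metric: product of per-label value counts."""
    return prod(label_cardinalities.values())


# Ten two-valued labels already give 2**10 combinations.
binary_labels = {f"flag{i}": 2 for i in range(10)}
print(max_series(binary_labels))  # 1024

# Hypothetical cluster: 3 regions, 3 zones each, 50 apps, 200 hosts.
cluster_labels = {"region": 3, "zone": 3, "app": 50, "host": 200}
print(max_series(cluster_labels))  # 90000
```

In practice many combinations never occur (a host lives in one zone), so this is a ceiling rather than an estimate.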
Desire to maintain consistent metric tags
Containers are ephemeral and that’s ok
API Overview
Workloads
➔ Container v1 core
➔ CronJob v1beta1 batch
➔ DaemonSet v1 apps
➔ Deployment v1 apps
➔ Job v1 batch
➔ Pod v1 core
➔ ReplicaSet v1 apps
➔ ReplicationController v1 core
➔ StatefulSet v1 apps
Discovery & Load Balancing
➔ Endpoints v1 core
➔ Ingress v1beta1 extensions
➔ Service v1 core
Cluster
➔ Namespace v1 core
➔ Node v1 core
➔ etc...
Custom Resource Definitions
➔ etc...
Live Demo
API Overview
Kubernetes has a well-defined API with very specific conventions
➔ Follows a traditional REST pattern
➔ All Kubernetes REST objects contain identically structured metadata fields
➔ This allows us to use the API as a data source across any number of standard or user-defined Kubernetes resources
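Because every object carries the same `metadata` block, one function can summarize any resource kind, including custom ones. The sketch below works on plain dictionaries shaped like API responses rather than calling a cluster. The `describe` helper, the `Widget` custom resource, and all names and values are hypothetical.

```python
def describe(resource):
    """Pull the shared metadata fields out of any Kubernetes-style object."""
    meta = resource.get("metadata", {})
    return {
        "kind": resource.get("kind"),
        "name": meta.get("name"),
        "namespace": meta.get("namespace"),
        "labels": meta.get("labels", {}),
        "annotations": meta.get("annotations", {}),
    }


# A built-in kind and a hypothetical custom resource, both with the same metadata layout.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web", "namespace": "prod", "labels": {"app": "web"}},
}
widget = {
    "apiVersion": "example.com/v1",
    "kind": "Widget",
    "metadata": {"name": "w1", "namespace": "prod", "labels": {"app": "web"}},
}

for resource in (deployment, widget):
    info = describe(resource)
    print(info["kind"], info["namespace"], info["name"], info["labels"])
```

The same loop handles both objects without knowing anything about their specs, which is what makes the metadata usable as a common set of query dimensions.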
So What? Being able to query on a few extra dimensions is not that special
Monitoring Kubernetes should be Turn-Key and Free
➔ A standard set of defined metrics that are tool and database agnostic
➔ Tools to auto-generate visualizations and alerts for Kubernetes based on best practices
➔ A fractured landscape of tools and practices that differ across companies and teams within companies
Thank You / QA

What Does Kubernetes Look Like?: Performance Monitoring & Visualization with Grafana
