Kubernetes Cluster Monitoring: A Guide to Prometheus and Grafana Integration

Unlocking Powerful Insights and Visualization for Enhanced Kubernetes Cluster Monitoring and Management

In this article, we will learn how to monitor a Kubernetes cluster using Prometheus and Grafana.

Prometheus: An Overview

What is Prometheus?

Prometheus is an open-source monitoring and alerting toolkit that is well suited to monitoring dynamic systems like Kubernetes. Its architecture is based on a pull model: Prometheus scrapes metrics over HTTP from configured targets at specified intervals.

Features of Prometheus:

  1. Multi-dimensional data model: Time series are identified by metric names and key-value pairs.

  2. Flexible Query Language: Prometheus's query language, PromQL, allows users to aggregate time series data in real-time.

  3. Independent from distributed storage: Each Prometheus server is standalone, not depending on network storage or other remote services.

  4. Discover targets dynamically: Especially useful in dynamic cloud environments, Prometheus can discover service instances to be monitored.

  5. Multiple modes of graphing and dashboarding support.

Before deploying Prometheus and Grafana, let's look at how Prometheus monitors Kubernetes at a high level.

Kubernetes is a container orchestration platform that requires deep and real-time insights into cluster health, resource utilization, and application performance. Prometheus, with its multi-dimensional data model and flexible query language, is designed to collect metrics from dynamic environments like Kubernetes.

How Prometheus Discovers Kubernetes Services

  • Service Discovery: At the heart of Prometheus's Kubernetes integration is its ability to discover services as they come up or go down dynamically. Kubernetes service discovery is achieved by querying the Kubernetes API to find new services and endpoints.

  • Relabeling: Once targets are discovered, Prometheus uses "relabeling" to refine and filter these targets before scraping.
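As a rough illustration of discovery plus relabeling, the sketch below writes a minimal scrape configuration to a file. The job name `kubernetes-pods`, the file name, and the `prometheus.io/scrape` annotation convention are assumptions chosen for illustration. With an operator-based stack such as the one deployed later in this article, scraping is normally configured through ServiceMonitor objects rather than a hand-written file like this one.

```shell
# Write a minimal, illustrative Prometheus scrape config (file name is an assumption).
cat > scrape-config.yaml <<'EOF'
scrape_configs:
  - job_name: kubernetes-pods          # hypothetical job name
    kubernetes_sd_configs:
      - role: pod                      # discover pods by querying the Kubernetes API
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Copy discovery metadata into ordinary labels on the scraped series
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
EOF

# List the relabeling rules the file defines
grep -n 'source_labels' scrape-config.yaml
```

The `keep` rule is the filtering step: any discovered pod without the annotation is dropped before it is ever scraped. The two remaining rules use the default `replace` action to attach `namespace` and `pod` labels.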

Metrics Collection

  • Node Metrics: These metrics give insights into the performance and health of Kubernetes nodes. The node_exporter is commonly used to expose these metrics.

  • Kube-state-metrics: This is an essential service that provides metrics derived from the internal state of Kubernetes resources. For example, it can provide metrics on the number of replicas a deployment is expected to have versus its current number.

  • cAdvisor: Integrated into the Kubelet agent, cAdvisor (Container Advisor) provides detailed information about running container resource usage and performance characteristics.

  • Custom Application Metrics: Developers can instrument their applications to expose custom Prometheus metrics, providing deeper insights into application behavior.
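For the custom-metrics case, an instrumented application serves plain text over HTTP (conventionally at `/metrics`) in the Prometheus exposition format. The sketch below writes a hand-made sample of that format to a file and prints the sample lines. The metric name `myapp_http_requests_total` and its values are invented for illustration.

```shell
# A hand-written sample of the Prometheus text exposition format.
cat > sample-metrics.txt <<'EOF'
# HELP myapp_http_requests_total Total HTTP requests handled (hypothetical metric).
# TYPE myapp_http_requests_total counter
myapp_http_requests_total{method="GET",code="200"} 1027
myapp_http_requests_total{method="POST",code="500"} 3
EOF

# Print only the sample lines (lines starting with '#' are HELP/TYPE metadata)
grep -v '^#' sample-metrics.txt
```

Each sample line is a metric name, a set of key-value labels, and a value, which is the multi-dimensional data model described earlier.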

Storing and Querying Metrics

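  • Time-Series Database (TSDB): Prometheus stores metrics in its integrated TSDB, which is optimized for time-series data. Every unique combination of label values creates a separate series, so very high label cardinality increases memory and storage use.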

  • PromQL: Prometheus Query Language (PromQL) is a flexible query language that enables complex queries and aggregations. It can fetch data based on time and other metric labels, providing deep insights into the system's state.
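To make this concrete, here is a small sketch that sends two PromQL queries to the Prometheus HTTP API with curl. It assumes Prometheus is reachable at `http://localhost:9090` (the `PROM_URL` variable is just a convenience introduced here), and that cAdvisor and kube-state-metrics metrics are being scraped.

```shell
# Assumption: Prometheus is reachable at PROM_URL, for example after
#   kubectl -n monitoring port-forward svc/prometheus-k8s 9090
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Per-namespace CPU usage in cores, averaged over the last 5 minutes (cAdvisor metric)
CPU_QUERY='sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))'

# Deployments whose desired replica count differs from the available count (kube-state-metrics)
REPLICA_QUERY='kube_deployment_spec_replicas != kube_deployment_status_replicas_available'

for query in "$CPU_QUERY" "$REPLICA_QUERY"; do
  echo "query: $query"
  # -G sends the URL-encoded query as a GET parameter to the instant-query endpoint
  curl -sG --max-time 5 "$PROM_URL/api/v1/query" --data-urlencode "query=$query" \
    || echo "could not reach Prometheus at $PROM_URL"
  echo
done
```

The first query shows aggregation over labels (`sum by`) and over time (`rate(...[5m])`). The second compares two series with matching labels and returns only the deployments where they differ.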

Alerts

Prometheus has a robust alerting mechanism. When pre-defined thresholds or conditions are met, alerts are fired. These alerts can then be routed to external systems using the Alertmanager.
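As a sketch of what such a condition looks like, the snippet below writes a single alerting rule to a file and validates it when `promtool` is available. The group name, alert name, file name, and the 10-minute threshold are all illustrative assumptions. In an operator-managed setup, rules are usually supplied as PrometheusRule objects rather than plain files.

```shell
# Write an illustrative alerting rule file (names and thresholds are assumptions).
cat > example-alert-rules.yaml <<'EOF'
groups:
  - name: example-deployment-alerts     # hypothetical group name
    rules:
      - alert: DeploymentReplicasMismatch
        # Condition: desired and available replica counts differ...
        expr: kube_deployment_spec_replicas != kube_deployment_status_replicas_available
        for: 10m                          # ...continuously for 10 minutes
        labels:
          severity: warning               # a label Alertmanager can route on
        annotations:
          summary: "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has unavailable replicas"
EOF

# Validate the file if promtool (shipped with Prometheus) is installed locally
if command -v promtool >/dev/null 2>&1; then
  promtool check rules example-alert-rules.yaml
else
  echo "promtool not found; skipping validation"
fi
```

The `for` clause keeps short-lived blips, such as a rolling update, from firing the alert immediately.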

Visualization with Grafana

While Prometheus provides its own basic UI for visualization, Grafana is often paired with Prometheus to create rich, interactive dashboards. Grafana natively supports Prometheus as a data source and provides various pre-built dashboards for Kubernetes monitoring.
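For reference, a Grafana data source can be declared in a provisioning file instead of being added by hand in the UI. The sketch below writes one such file. The file name and the in-cluster URL are assumptions (the URL matches the service name used by the stack deployed later in this article, which already ships an equivalent data source, so this is only to show the shape).

```shell
# Write an illustrative Grafana data source provisioning file.
cat > grafana-datasource.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                                   # Grafana's backend queries Prometheus
    url: http://prometheus-k8s.monitoring.svc:9090  # assumed in-cluster service address
    isDefault: true
EOF

cat grafana-datasource.yaml
```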

Kube Prometheus Stack Deployment

I stumbled upon a fantastic GitHub repository named Kube Prometheus. It bundles all the essential Prometheus and Grafana components, providing a streamlined solution for deploying and monitoring Kubernetes clusters.

You can find the repo here: https://github.com/prometheus-operator/kube-prometheus

Let's start deploying the Kube Prometheus stack. I assume you already have a Kubernetes cluster deployed and kubectl configured to talk to it.

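As a first step, let's clone the repository and change into it, since the commands that follow use paths relative to the repository root. (HTTPS is used here so that no GitHub SSH key is required.)

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus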

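Next, deploy the stack using the following commands. The first creates the monitoring namespace and the CustomResourceDefinitions, the second waits until those definitions are established, and the third deploys the remaining components.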

kubectl apply --server-side -f manifests/setup
kubectl wait \
    --for condition=Established \
    --all CustomResourceDefinition \
    --namespace=monitoring
kubectl apply -f manifests/

To delete the stack, use the following command:

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

That's it! Wait a couple of minutes, then check that the pods in the monitoring namespace are in the Ready state.
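One way to do that check is sketched below. The 300-second timeout is an arbitrary choice, and the guard around kubectl is only there so the snippet fails gracefully on a machine without it.

```shell
NAMESPACE="monitoring"

if command -v kubectl >/dev/null 2>&1; then
  # List the pods, then block until every pod reports Ready (or the timeout expires)
  kubectl -n "$NAMESPACE" get pods \
    || echo "could not list pods in namespace $NAMESPACE"
  kubectl -n "$NAMESPACE" wait --for=condition=Ready pods --all --timeout=300s \
    || echo "not all pods became Ready within the timeout"
else
  echo "kubectl not found on PATH"
fi
```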

You can access the Grafana dashboard using the kubectl port-forward command:

kubectl -n monitoring port-forward svc/grafana 8080:3000

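With the port-forward running, open http://localhost:8080 in a browser to reach the Grafana login page. A fresh Grafana install normally accepts the default credentials admin/admin and then prompts for a new password. If the credentials have been customized, look at the Grafana secrets in the monitoring namespace: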

$ kubectl get secrets -n monitoring
NAME                  TYPE     DATA   AGE
alertmanager-main     Opaque   1      10m
grafana-config        Opaque   1      10m
grafana-datasources   Opaque   1      10m
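Secret values are stored base64-encoded. A minimal sketch for printing the decoded contents of one of the secrets listed above follows; `grafana-datasources` is used as the example, and the go-template iterates over whatever keys the secret contains rather than assuming a key name.

```shell
NAMESPACE="monitoring"
SECRET="grafana-datasources"

if command -v kubectl >/dev/null 2>&1; then
  # Print each key in the secret followed by its base64-decoded value
  kubectl -n "$NAMESPACE" get secret "$SECRET" \
    -o go-template='{{range $key, $value := .data}}{{$key}}:{{"\n"}}{{$value | base64decode}}{{"\n"}}{{end}}' \
    || echo "could not read secret $SECRET in namespace $NAMESPACE"
else
  echo "kubectl not found on PATH"
fi
```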
