Prometheus monitoring example

Prometheus Metric Types

 


Let’s start at the beginning. Prometheus collects four types of metrics as part of its exposition format.

  1. Counters
  2. Gauges
  3. Histograms
  4. Summaries

Prometheus collects these metrics by scraping HTTP endpoints that expose metrics, known as a pull model. Those endpoints can be exposed natively by the monitored component or by an exporter, many of which are built by the community. Prometheus also provides client libraries, available for a wide range of programming languages, that you can use to instrument your code.

As well as scraping metrics in its own exposition format, Prometheus can scrape metrics exposed in the OpenMetrics format. Metrics are exposed over HTTP in a text-based format (more widely used) or in a more robust and efficient protocol buffer format. The text format has the advantage of being human-readable, so you can open it in a browser or retrieve exposed metrics using a tool like curl.

Prometheus’ metric model supports only four metric types, and the distinction between them exists only in the client libraries. The exposition format represents every metric type using one or more samples of a single underlying data type. This data type consists of a metric name, a set of labels, and a float value. The timestamp is added by the agent or monitoring backend (Prometheus, for example) when it scrapes the metrics.
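To make the underlying data type concrete, here is a minimal Python sketch that parses one line of the text exposition format into a metric name, labels, a float value, and an optional timestamp. The names `Sample` and `parse_sample` are made up for this illustration, and the parser is deliberately simplified: it does not unescape label values, so treat it as a sketch rather than a complete implementation of the format.

```python
import re
from typing import Dict, NamedTuple, Optional


class Sample(NamedTuple):
    """One exposed sample: name, labels, float value, optional timestamp."""
    name: str
    labels: Dict[str, str]
    value: float
    timestamp_ms: Optional[int]


# Shape of a sample line: metric_name{label="value",...} value [timestamp]
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str) -> Optional[Sample]:
    """Parse one line; return None for blank lines and # comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = _SAMPLE_RE.match(line)
    if match is None:
        raise ValueError(f"not a valid sample line: {line!r}")
    # Simplification: escaped characters in label values are kept as-is.
    labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
    ts = match.group("ts")
    return Sample(
        name=match.group("name"),
        labels=labels,
        value=float(match.group("value")),  # float() also accepts +Inf and NaN
        timestamp_ms=int(ts) if ts is not None else None,
    )
```

For example, `parse_sample('http_requests_total{method="post",code="200"} 1027')` yields the name `http_requests_total`, two labels, the value `1027.0`, and no timestamp, because the scraper normally adds the timestamp itself.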

 

  • Prometheus data types

PromQL, short for Prometheus Query Language, is the primary way to query metrics within Prometheus. You can display the result of an expression as a graph or export it using the HTTP API. PromQL works with three main data types: scalars, range vectors, and instant vectors. It also uses strings, but only as literals.

 

Before you proceed, you may find the following posts helpful:

  1. Observability vs Monitoring

 





Key Prometheus Metric Types Discussion Points:


  • Prometheus uses a pull approach to scrape metrics.

  • Prometheus Exporters & Client libraries.

  • Prometheus Runtime Metrics.

  • Prometheus Infrastructure Metrics.

  • Prometheus Application Metrics.

  • Prometheus CI/CD Pipeline Metrics.

  • Prometheus Time to First Byte Metrics.

 

  • A key point: Video on Prometheus Metric Types

In this video tutorial, we go through the basics of monitoring systems, particularly the role of Prometheus and its pull approach, along with the different metrics that Prometheus can scrape. We can scrape several kinds of metrics, such as Docker container metrics. However, Prometheus supports only four metric types: the Counter, Gauge, Histogram, and Summary.

 

 

Back to basics: Prometheus Metric Types

With Prometheus, we want to collect as many useful metrics as possible. These need to be stored so that we can follow trends, understand what has happened historically, and better predict issues. A Prometheus monitoring solution for microservices observability therefore has several parts: we must collect the metrics (known as scraping), store them, and then analyze them. For distributed systems observability, we also need to consider storage security, compliance, and regulatory concerns. Monitoring the correct metrics is key, because metrics let you see how the system performs. Metrics are raw measurements of resource usage: they tell you how many resources are being used and help you plan upgrades.

To be clear, there are two kinds of “types” in Prometheus: the types of the metrics themselves and the data types of PromQL expressions.

Prometheus has four metric types

  1. Counters
  2. Gauges
  3. Histograms 
  4. Summaries 

PromQL expressions, in turn, evaluate to one of four data types:

  1. Scalars (simple floating-point values)
  2. Range vectors
  3. Instant vectors
  4. Strings (currently used only as literals, so often not counted)

Different types of metrics can be scraped. We will go through these now.

 

  • Prometheus Pull Approach

Monitoring a Kubernetes cluster with the pull model is easy because of service discovery and shared network access within the cluster. However, monitoring a dynamic fleet of virtual machines, AWS Fargate containers, or Lambda functions with Prometheus is hard. How come? Identifying the metrics endpoints to scrape is difficult, and network security policies may restrict access to those endpoints. Prometheus Agent Mode was released at the end of 2021 to solve some of these problems. In this mode, Prometheus collects metrics and sends them to a monitoring backend via the remote write protocol.

 

  • Prometheus Service Discovery

Prometheus discovers the targets to scrape through service discovery. These can be applications you have instrumented yourself or third-party applications that you scrape via an exporter. The scraped data is stored, and you can query it with PromQL in dashboards or use it to send alerts to the Alertmanager, which converts them into pages, emails, and other notifications. Metrics do not magically spring forth from applications; someone has to add the instrumentation that produces them.

 

Starting with Prometheus Metric Types

A metric is a unit of measurement for evaluating a component, and it is measured consistently over time. Common examples include CPU utilization, memory utilization, and interface utilization. These are numbers about how your resources are performing. On the metrics side, we have runtime, infrastructure, and application metrics, which cover Prometheus exporters, response codes, and time-to-serve data. We also have CI/CD pipeline metrics such as build time and failures. Let’s discuss these in more detail.

 

  • Counters

Counter metrics are used for measurements that only increase. Since they are cumulative, their value can only go up. The one exception is a restart of the instrumented process, which resets the counter to zero. On its own, a counter’s value is not very helpful. Instead, a counter value is typically used to compute the delta or rate of change between two timestamps.
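As a rough illustration of how a rate of change is derived from a counter, the Python sketch below computes the increase and per-second rate over a list of `(timestamp_seconds, value)` samples and treats any drop in value as a reset to zero. The function names are made up for this example. It is a simplification and is not identical to PromQL's `rate()`, which also extrapolates to the edges of the query window.

```python
from typing import List, Tuple

CounterSample = Tuple[float, float]  # (timestamp in seconds, counter value)


def counter_increase(samples: List[CounterSample]) -> float:
    """Total increase across samples (oldest first), tolerating resets."""
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # The counter went down, so assume a restart reset it to zero:
            # everything counted since the reset is new.
            increase += current
    return increase


def counter_rate(samples: List[CounterSample]) -> float:
    """Average per-second rate of increase between first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must be ordered by increasing timestamp")
    return counter_increase(samples) / elapsed
```

With samples `[(0, 100), (15, 130), (30, 10), (45, 40)]`, the drop from 130 to 10 is treated as a reset, so the increase is 30 + 10 + 30 = 70 over 45 seconds.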

  • Gauges

Gauge metrics are used for measurements that can go up or down. This metric type is more familiar, since the current value is meaningful without additional processing. Temperature, CPU usage, memory usage, and queue size are all examples of gauges.

  • Histograms

A histogram represents a distribution of measurements. For example, request durations or response sizes are often measured with histograms. A histogram divides the entire range of measurements into intervals, called buckets, and counts how many measurements fall into each bucket.
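The bucket counting can be sketched in a few lines of Python. The `Histogram` class below is a hypothetical stand-in written for this post, not the class from any client library. It follows the Prometheus convention that buckets are cumulative (each bucket counts every observation less than or equal to its upper bound, with a final `+Inf` bucket) and that a running sum and count are kept alongside.

```python
import bisect
import math
from typing import Dict, Sequence


class Histogram:
    """Counts observations into buckets defined by their upper bounds."""

    def __init__(self, upper_bounds: Sequence[float]) -> None:
        # A final +Inf bucket catches everything above the last bound.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self._counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        # bisect_left returns the first bucket whose upper bound is >= value,
        # so a value equal to a bound lands in that bound's bucket.
        index = bisect.bisect_left(self.upper_bounds, value)
        self._counts[index] += 1
        self.sum += value
        self.count += 1

    def cumulative_buckets(self) -> Dict[float, int]:
        """Map each upper bound to the number of observations <= that bound."""
        running = 0
        result: Dict[float, int] = {}
        for bound, bucket_count in zip(self.upper_bounds, self._counts):
            running += bucket_count
            result[bound] = running
        return result
```

For request durations of 0.05, 0.1, 0.3, 0.7, and 2.5 seconds with bounds of 0.1, 0.5, and 1.0, the cumulative buckets come out as 2, 3, 4, and 5 (the last being `+Inf`).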

Diagram: Prometheus Metric Types. Source: Timescale.

 

Highlighting Prometheus Monitoring

Auto Scaling Observability

Previously, Heapster was the monitoring solution that came out of the box with Kubernetes. Prometheus is now the de facto standard monitoring system for Kubernetes clusters, and it brings many benefits. Firstly, Prometheus monitoring scales, thanks to its pull approach and its federation options. The challenge with a push approach is that if we run microservices at scale and every service pushes its metrics to a metrics server, the monitoring traffic can flood the network.

Also, with a push-based approach, you may need to scale up instead of out, which could be costly. We can have many different systems that we want to monitor. The content of the metrics will differ between systems and components, but Prometheus collects them all in the same format. This provides a welcome layer of unification across the different systems in the network.

 

Diagram: Prometheus Monitoring. Source: Opcito.

 

Prometheus Monitoring: Exporters and Client Libraries

With Prometheus monitoring, you can get metrics from the systems you want to monitor using pre-built exporters and client libraries. Prometheus works very well with Docker and Kubernetes, but it can also work outside the container networking world, monitoring non-cloud-native applications through exporters. As a result, you can monitor your entire stack with a wide range of exporters and client libraries.

For cloud-native applications, we install a client library in the code and use it to gather custom application metrics and runtime metrics. By instrumenting the application this way, we can expose the custom metrics that matter most to us.

 

    • Prometheus Metric type: Runtime metrics

Runtime metrics are statistics collected by the operating system and application host. These include CPU usage, memory load, and web server requests. For a Java app, for example, this could be the CPU and memory usage of Tomcat and the JVM.

 

    • Prometheus Metric type: Infrastructure metrics

For infrastructure metrics, we examine CPU utilization, latency, bandwidth, memory, and temperature. These metrics should be collected over a long period and cover the infrastructure, such as networking equipment, hypervisors, and host-based systems.

 

    • Prometheus Metric type: Application metrics

Then we have application metrics: custom statistics that are relevant only to the application and not to the infrastructure. This may include the number of API calls made during a particular period. Web-based applications make this easy, because every response carries a status code that provides information. These metrics are easy to measure, and the response codes are available immediately. For example, an HTTP status code of 200 is good, and a code of 400 or more indicates an issue.
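As a small illustration of turning response codes into an application metric, the Python sketch below groups status codes by class and computes the share of responses at 400 or above. The function names are invented for this example, and in a real service these counts would normally be recorded as labelled counters through a client library rather than computed from a list.

```python
from collections import Counter
from typing import Iterable


def status_class(code: int) -> str:
    """Map an HTTP status code to its class, for example 404 to '4xx'."""
    return f"{code // 100}xx"


def count_by_class(codes: Iterable[int]) -> Counter:
    """Count responses per status class."""
    return Counter(status_class(code) for code in codes)


def error_ratio(codes: Iterable[int]) -> float:
    """Fraction of responses with a status code of 400 or more."""
    observed = list(codes)
    if not observed:
        return 0.0
    errors = sum(1 for code in observed if code >= 400)
    return errors / len(observed)
```

For the responses `[200, 200, 201, 404, 500]`, this gives three `2xx`, one `4xx`, and one `5xx`, and an error ratio of 0.4.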

 

    • Prometheus Metric type: Time to first byte

Another important metric is how long a web server takes to respond, measured as the time to first byte (TTFB). TTFB is the time between the browser requesting a page and receiving the first byte of information from the server, so it shows how long your application takes to start sending data. If this metric is higher than usual, you may need caching, faster storage, or a better CPU.

Let us take the example of a content delivery network (CDN): what is a good time to first byte? On average, anything with a TTFB under 100 ms is fantastic. Anything between 200 and 500 ms is standard, and anything between 500 ms and 1 second is less than ideal. Anything greater than 1 second should likely be investigated further.
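These TTFB bands can be written down as a simple Python check. The function name and the returned labels are invented for this sketch. Two things are assumptions on my part: the text does not say how to rate values from 100 ms up to 200 ms, so the code reports them as "unclassified", and the handling of values that sit exactly on a boundary is my own choice.

```python
def classify_ttfb(ttfb_ms: float) -> str:
    """Rate a time to first byte (in milliseconds) using the bands above."""
    if ttfb_ms < 0:
        raise ValueError("TTFB cannot be negative")
    if ttfb_ms < 100:
        return "fantastic"
    if ttfb_ms < 200:
        # Assumption: the 100-200 ms band is not covered by the text.
        return "unclassified"
    if ttfb_ms <= 500:
        return "standard"
    if ttfb_ms <= 1000:
        return "less than ideal"
    return "investigate"
```

A check like this could back an alert threshold, for example flagging any endpoint whose typical TTFB lands in the "investigate" band.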

 

    • Prometheus Metric type: CI/CD pipeline metrics

For CI/CD pipeline metrics, we want to measure how long the static code analysis takes and how many errors occur while the pipeline runs. We also want to measure build time and build failures: how long it took to build the application, how long the tests took to complete, and how often builds fail.

 

    • Prometheus Metric type: Docker metrics

Docker metrics come from the Docker platform. These may include container health checks, the number of online and offline nodes in a cluster, and the number of containers in each state, such as stopped, paused, and running. Docker provides these built-in metrics to give additional visibility into the running containers. When running containers in production, monitoring their runtime metrics, such as CPU and memory usage, is essential.

 


 

  • A key point: Docker metrics

Metrics from the Docker platform are essential because Docker stops and starts your application containers for you. You should not look at one metric type without the other. If you look only at the application metrics, you see just half of the puzzle and may miss the problem. For example, if one of your applications is performing poorly and the Docker platform is constantly spinning up new containers, you would not see that in the application metrics alone.

Your application and runtime metrics may seem to be within the standard thresholds. However, combining them with the Docker platform metrics brings in the container stats, which reveal the spike in container creation.

 

Exposing application metrics to Prometheus

Application metrics give you additional information. Unlike the runtime metrics you get for free, application metrics require you to explicitly record what you care about. This is where the Prometheus client libraries come in. All the major languages have a Prometheus client library, which provides the metrics endpoint. The client library makes application metrics available to Prometheus, giving you a very high level of visibility into your application.

With Prometheus client libraries, you can see what is happening inside the application. Together, Prometheus exporters and Prometheus client libraries allow Prometheus to monitor your entire stack.
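To show roughly what a client library does behind the scenes, here is a hand-rolled Python sketch using only the standard library. It is not the official client library: `CounterRegistry`, `MetricsHandler`, `serve`, and the port number are all hypothetical names and values chosen for this example. It also skips things a real library handles, such as `# HELP` and `# TYPE` lines and escaping of label values. In practice you would use the official client library for your language.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple

LabelSet = Tuple[Tuple[str, str], ...]


class CounterRegistry:
    """Keeps labelled counters in memory and renders them as text."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: Dict[Tuple[str, LabelSet], float] = {}

    def inc(self, name: str, amount: float = 1.0, **labels: str) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        key = (name, tuple(sorted(labels.items())))
        with self._lock:
            self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        """Render all counters, one 'name{labels} value' line per series."""
        with self._lock:
            items = sorted(self._values.items())
        lines = []
        for (name, labels), value in items:
            # Simplification: label values are not escaped here.
            label_text = ",".join(f'{key}="{val}"' for key, val in labels)
            selector = f"{{{label_text}}}" if label_text else ""
            lines.append(f"{name}{selector} {value}\n")
        return "".join(lines)


REGISTRY = CounterRegistry()


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the registry at /metrics so that Prometheus can scrape it."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = REGISTRY.render().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Blocks forever, serving metrics on the given (arbitrary) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Application code would call something like `REGISTRY.inc("api_calls_total", path="/orders", status="200")` each time it handles a request, and Prometheus would then be configured to scrape the `/metrics` path on that port.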

 

Exposing Docker Metrics to Prometheus

First, Docker Default Networking 101. The Docker Engine interacts with all the clients, and part of its role is to collect and export metrics. For example, when you build a Docker image, the Engine records a metric. We need this insight into the Docker platform, so we expose Docker metrics to Prometheus. The Docker Engine has a built-in mechanism to export metrics in Prometheus format. These Docker metrics cover the Engine, containers, and images.

 

Docker metric types: Three types

Docker metrics fall into three areas: the Docker Engine, builds, and containers.

    1. Engine metrics give you information on the host, such as the CPU count, the O/S version, and the build of the Docker Engine.
    2. Build metrics provide information such as the number of builds triggered, canceled, and failed.
    3. Container metrics show the number of containers stopped and paused, as well as the number of health checks that fired and failed.

 

Wrap up: Prometheus monitoring.

For Prometheus monitoring, we have Prometheus exporters that can get metrics from, let’s say, a Linux server, and applications that support Prometheus directly through a client library. Both have an HTTP endpoint that returns metrics in the standard Prometheus format. Once the HTTP endpoint is up and running on the application (legacy or cloud-native), Prometheus will scrape (collect) the metrics from targets that are configured statically or discovered dynamically.

Exporters add metrics to systems that don’t have Prometheus support. Prometheus client libraries add Prometheus support inside the application. These client libraries can provide out-of-the-box runtime metrics as well as custom metrics relevant to the application.

 

Matt Conran: The Visual Age