Understanding Metrics in OpenTelemetry (OTel) API


You may have heard about the OpenTelemetry (OTel) Metrics API. This article explains what a metric is, describes the metric instruments and their functions, introduces the meter provider, and walks through a practical example of using metric instruments.

When we talk about OTel or OpenTelemetry, we are referring to a collection of tools, SDKs, and APIs used to instrument software and to generate, collect, and export telemetry data (such as metrics, traces, and logs), so that the performance of that software at runtime can be analyzed and understood.

Given all this, you may wonder what the OpenTelemetry Metrics API is and why you should care. The Metrics API captures measurements about the execution of a computer program. It is designed to support explicit processing of those raw measurements, generally to produce continuous summaries that give developers the visibility they need.

Before we get into the details of the OpenTelemetry Metrics API, let’s look at it from a developer’s point of view. Most developers know the capabilities of metrics in some way. Some are familiar with metrics as a monitoring tool, using alerts to indicate when a measurement such as process memory utilization or error rate violates a predetermined threshold. Others are familiar with event streaming strategies, where metrics are aggregated and recorded by tracing or logging systems. With this in mind, let’s look at the instruments within the OpenTelemetry Metrics API.

One of the special features of the Metrics API is that it distinguishes between metric instruments at the semantic level rather than by the eventual type of value they export. Here, “semantic” refers to the meaning we give to metric events as they occur at runtime. With this understanding, we can look at the metric instruments in OpenTelemetry and their functions.

OpenTelemetry Metric Instruments and their functions

The OpenTelemetry Metrics API provides six metric instruments. Each instrument is created by calling a function on the Meter API, which is the user-facing entry point to the SDK. Each instrument supports a single function, named to reflect the instrument’s semantics.

Metric instruments can be synchronous or asynchronous. Synchronous instruments are called inside a request, so they have an associated distributed context. Counter and UpDownCounter are the two synchronous additive instruments, and both support an Add() function. ValueRecorder is the synchronous non-additive instrument. It supports a Record() function for capturing metric event data.
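To make the difference between Add() and Record() concrete, here is a minimal sketch of the three synchronous instruments. It is a toy model written for this article, not the OpenTelemetry SDK: the class layout, the lowercase method names, and the "requests.active" instrument name are all assumptions made for illustration.

```python
# Toy model of the three synchronous instruments; this is NOT the OpenTelemetry SDK.

class Counter:
    """Additive and monotonic: add() accepts only non-negative values."""

    def __init__(self, name):
        self.name = name
        self.total = 0

    def add(self, value):
        if value < 0:
            raise ValueError("Counter is monotonic; value must be >= 0")
        self.total += value


class UpDownCounter:
    """Additive but not monotonic: add() accepts positive and negative values."""

    def __init__(self, name):
        self.name = name
        self.total = 0

    def add(self, value):
        self.total += value


class ValueRecorder:
    """Non-additive: record() keeps each measurement so it can be summarized later."""

    def __init__(self, name):
        self.name = name
        self.values = []

    def record(self, value):
        self.values.append(value)


request_bytes = Counter("request.bytes")
request_bytes.add(512)
request_bytes.add(1024)

active_requests = UpDownCounter("requests.active")  # hypothetical instrument name
active_requests.add(1)
active_requests.add(1)
active_requests.add(-1)

request_latency = ValueRecorder("request.latency")
request_latency.record(0.120)
request_latency.record(0.080)

print(request_bytes.total)      # sum of all added byte counts
print(active_requests.total)    # requests currently in flight
print(request_latency.values)   # individual latency measurements
```

The additive instruments only need to keep a running sum, while the non-additive ValueRecorder keeps individual values because a sum of latencies alone is rarely what you want to look at.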

Asynchronous instruments, by contrast, report their values through a callback that runs once per collection interval, and they have no associated distributed context. There are two asynchronous additive instruments, SumObserver and UpDownSumObserver, and one non-additive instrument, ValueObserver. All three support an Observe() function, indicating that they capture one value per measurement interval.
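The callback model can be sketched in a few lines. Again, this is a toy model and not the OpenTelemetry SDK: the Observer and ObserverResult classes, the collect() function, and the instrument names are hypothetical, and a real SDK would drive the collection interval itself.

```python
# Toy model of asynchronous instruments; this is NOT the OpenTelemetry SDK.

class ObserverResult:
    """Collects what a callback reports: one value per label set."""

    def __init__(self):
        self.observations = {}

    def observe(self, value, labels):
        # A later observe() for the same label set replaces the earlier one.
        key = tuple(sorted(labels.items()))
        self.observations[key] = value


class Observer:
    """An asynchronous instrument is defined by its name and its callback."""

    def __init__(self, name, callback):
        self.name = name
        self.callback = callback


def collect(observers):
    """Invoke every callback once; stands in for one collection interval."""
    snapshot = {}
    for observer in observers:
        result = ObserverResult()
        observer.callback(result)
        snapshot[observer.name] = result.observations
    return snapshot


# Hypothetical application state that the callbacks read.
state = {"cpu_seconds": 12.5, "queue_length": 3}


def observe_cpu(result):
    # SumObserver-style: reports a monotonic running total.
    result.observe(state["cpu_seconds"], {"host": "host.name"})


def observe_queue(result):
    # ValueObserver-style: reports a current, non-additive value.
    result.observe(state["queue_length"], {"host": "host.name"})


observers = [
    Observer("process.cpu.seconds", observe_cpu),
    Observer("queue.length", observe_queue),
]

first = collect(observers)       # first collection interval
state["cpu_seconds"] = 14.0
state["queue_length"] = 1
second = collect(observers)      # second collection interval
print(first)
print(second)
```

Note that the application never calls the instrument directly. It only registers a callback, and the collector decides when to ask for a value.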

Metric events captured through any instrument will consist of:

  • value (signed integer or floating-point number)
  • resources associated with the SDK at startup
  • distributed context (for synchronous events only)
  • timestamp (implicit)
  • instrument definition (name, kind, description, unit of measure)
  • label set (keys and values)
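As an illustration, the fields listed above could be modeled as a simple record type. This is only a sketch: the class and field names are assumptions chosen to mirror the list, the "trace_id" and "service.name" keys are hypothetical, and a real SDK does not necessarily materialize events this way.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class InstrumentDefinition:
    name: str
    kind: str          # e.g. "Counter" or "ValueObserver"
    description: str
    unit: str


@dataclass
class MetricEvent:
    value: Union[int, float]                 # signed integer or floating-point number
    instrument: InstrumentDefinition         # name, kind, description, unit of measure
    labels: Dict[str, str]                   # label set (keys and values)
    resources: Dict[str, str]                # associated with the SDK at startup
    context: Optional[Dict[str, str]] = None # distributed context, synchronous events only
    timestamp: float = field(default_factory=time.time)  # implicit: set when the event is created


definition = InstrumentDefinition(
    name="request.bytes",
    kind="Counter",
    description="Bytes received per request",
    unit="By",
)

event = MetricEvent(
    value=512,
    instrument=definition,
    labels={"path": "/api/getFoo/{id}", "host": "host.name"},
    resources={"service.name": "my_application"},
    context={"trace_id": "hypothetical-trace-id"},
)
print(event.instrument.name, event.value, sorted(event.labels))
```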

Here is a quick summary of the six metric instruments and their properties:

Name               Synchronous  Additive  Monotonic  Function
Counter            Yes          Yes       Yes        Add()
UpDownCounter      Yes          Yes       No         Add()
ValueRecorder      Yes          No        No         Record()
SumObserver        No           Yes       Yes        Observe()
UpDownSumObserver  No           Yes       No         Observe()
ValueObserver      No           No        No         Observe()

Meter Provider

When you initialize and configure the OpenTelemetry Metrics SDK, you obtain a concrete MeterProvider implementation. Once it is configured, the application chooses how to access it: through a global instance, or through dependency injection for greater control over the configuration. Check out the Metrics API specification for more details about implementing a MeterProvider.
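The two access patterns, a global instance versus dependency injection, can be sketched as follows. This is a toy model, not the OpenTelemetry SDK: the MeterProvider and Meter classes here, the get/set helper functions, and the CheckoutService class are all hypothetical names used only to show the shape of each option.

```python
# Toy model of a meter provider; this is NOT the OpenTelemetry SDK.

class Meter:
    def __init__(self, name):
        self.name = name


class MeterProvider:
    """Hands out one Meter per name."""

    def __init__(self):
        self._meters = {}

    def get_meter(self, name):
        if name not in self._meters:
            self._meters[name] = Meter(name)
        return self._meters[name]


# Option 1: a process-wide global instance.
_global_provider = MeterProvider()


def get_global_meter_provider():
    return _global_provider


def set_global_meter_provider(provider):
    global _global_provider
    _global_provider = provider


# Option 2: dependency injection. The component receives its provider explicitly.
class CheckoutService:  # hypothetical application component
    def __init__(self, meter_provider):
        self.meter = meter_provider.get_meter("checkout")


global_meter = get_global_meter_provider().get_meter("my_application")

injected_provider = MeterProvider()
service = CheckoutService(injected_provider)
print(global_meter.name, service.meter.name)
```

The global instance is convenient because any module can reach it, while injection makes it easy to hand a component a differently configured provider, for example in tests.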

Metric and Distributed Context

Synchronous measurements are implicitly associated with the distributed context at runtime, which may include a span context and correlation values. OpenTelemetry supports the correlation context as a way for labels to propagate from one process to another in a distributed computation. The (WIP) Views API can be used to select specific correlation keys whose values are applied as labels.
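The idea of selecting correlation values and applying them as labels can be illustrated with Python's standard contextvars module. This sketch does not use or reproduce the Views API, which the article describes as work in progress. The labels_with_correlations() helper and the "tenant" and "session" keys are hypothetical.

```python
import contextvars

# Stand-in for the correlation values carried by the distributed context.
correlations = contextvars.ContextVar("correlations", default={})


def labels_with_correlations(labels, selected_keys):
    """Return a copy of labels extended with the selected correlation values."""
    current = correlations.get()
    merged = dict(labels)
    for key in selected_keys:
        if key in current:
            merged[key] = current[key]
    return merged


# Pretend these values arrived with the incoming request from an upstream process.
correlations.set({"tenant": "acme", "session": "abc123"})

# Only "tenant" is selected, so "session" does not become a label.
labels = labels_with_correlations(
    {"path": "/api/getFoo/{id}"},
    selected_keys=["tenant"],
)
print(labels)
```

Selecting keys explicitly matters because every extra label multiplies the number of distinct series that the export pipeline has to aggregate.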

Implementing metric instruments: example

The OpenTelemetry API is available in several languages, so the exact mechanism for creating each metric instrument varies between implementations. The general specification may therefore not match the calls you write in a given language, and consulting the documentation for that specific language becomes very important. Let’s begin the implementation process.

The first step is to create an instrument and give it a descriptive name. It is also a good idea to provide label keys to optimize the metric export pipeline, and to initialize a LabelSet whose keys and values align with the attributes on your metric events.

// initialize instruments statically or in an initializer: a Counter and a ValueRecorder
meter = global.Meter("my_application")
requestBytes = meter.NewIntCounter("request.bytes", WithUnit(unit.Bytes))
requestLatency = meter.NewFloatValueRecorder("request.latency", WithUnit(unit.Second))

// then, in a request handler, define the labels that apply to the request
labels = {"path": "/api/getFoo/{id}", "host": "host.name"}

Again, the overall implementation strategy remains the same, even though the details change slightly with the language used. Once the instruments are named and created, recording metric events becomes pretty straightforward.

requestBytes.Add(req.bytes, labels)
requestLatency.Record(req.latency, labels)
…
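To see the whole flow run end to end, here is a self-contained sketch that mirrors the pseudocode above. It is a toy model, not the OpenTelemetry SDK: the IntCounter and FloatValueRecorder classes, the label_key() helper, and the handle_request() function are assumptions made for illustration. Each instrument aggregates per label set, which is one simple way to turn raw measurements into continuous summaries.

```python
from collections import defaultdict


def label_key(labels):
    """Turn a label dict into a hashable, order-independent key."""
    return tuple(sorted(labels.items()))


class IntCounter:
    """Toy counter: keeps one running sum per label set."""

    def __init__(self, name):
        self.name = name
        self.sums = defaultdict(int)

    def add(self, value, labels):
        if value < 0:
            raise ValueError("Counter is monotonic; value must be >= 0")
        self.sums[label_key(labels)] += value


class FloatValueRecorder:
    """Toy value recorder: keeps count, sum, min, and max per label set."""

    def __init__(self, name):
        self.name = name
        self.summaries = {}

    def record(self, value, labels):
        summary = self.summaries.setdefault(
            label_key(labels),
            {"count": 0, "sum": 0.0, "min": value, "max": value},
        )
        summary["count"] += 1
        summary["sum"] += value
        summary["min"] = min(summary["min"], value)
        summary["max"] = max(summary["max"], value)


# Initialize the instruments once, outside the request handler.
request_bytes = IntCounter("request.bytes")
request_latency = FloatValueRecorder("request.latency")


def handle_request(num_bytes, latency_seconds):
    # Define the labels that apply to this request, then record the events.
    labels = {"path": "/api/getFoo/{id}", "host": "host.name"}
    request_bytes.add(num_bytes, labels)
    request_latency.record(latency_seconds, labels)


handle_request(200, 0.25)
handle_request(300, 0.5)

key = label_key({"path": "/api/getFoo/{id}", "host": "host.name"})
print(request_bytes.sums[key])
print(request_latency.summaries[key])
```

Because both requests share the same label set, they are folded into a single summary. A request with a different path or host would start a separate one.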