Introduction to APM 1

In this series of posts I look into how we can implement an APM solution for .NET applications. Application Performance Management (APM) helps to monitor and diagnose application performance. There are numerous libraries and tools that address this problem; this series focuses on implementing APM with OpenTelemetry for .NET applications (.NET Framework 4.6.1 and above, .NET Core and .NET 5).

The series covers the following topics, starting with the first one in this post:

  • W3C correlation Id specification

  • Creating and recording spans with ActivitySource

  • Using OpenTelemetry and Jaeger

W3C correlation Id specification

The correlation Id is defined in the W3C Trace Context specification. In this post I explain the structure of the headers it defines and introduce how the ASP.NET Core implementation uses them.

The W3C specification for trace context defines two headers: traceparent and tracestate.

Tracestate Header

The tracestate header allows passing vendor-specific information and complements the traceparent header. According to the specification, vendors may pass custom state in this header: each vendor may add its own key-value pair to the current value. An example value of the tracestate header:

tracestate: congo=t61rcWkgMzE

Because the entries are vendor-specific, a given service may not be able to parse the tracestate, or may be able to parse it only partially. The current ASP.NET Core implementation does not add custom logic to the tracestate. On each incoming request it reads the header value as an opaque string and sets it on the Activity spanning the current HTTP request. Likewise, outgoing HTTP requests attach the tracestate value with the help of the diagnostics delegating handler.
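To make the key-value structure concrete, here is a minimal sketch that splits a tracestate value into its entries. ParseTracestate is a hypothetical helper written for this post, not an ASP.NET Core API, and the two-vendor value is only an illustrative input. The sketch also skips the validation rules the specification defines for keys and values.

```csharp
using System;
using System.Collections.Generic;

// Illustrative value with two vendor entries, separated by a comma.
var entries = ParseTracestate("rojo=00f067aa0ba902b7,congo=t61rcWkgMzE");
foreach (var entry in entries)
{
    Console.WriteLine($"{entry.Key} -> {entry.Value}");
}

// Hypothetical helper: splits a tracestate value into ordered key-value pairs.
// Malformed or empty list members are skipped instead of failing the whole header.
static List<KeyValuePair<string, string>> ParseTracestate(string header)
{
    var result = new List<KeyValuePair<string, string>>();
    foreach (var member in header.Split(','))
    {
        var trimmed = member.Trim();
        // Split on the first '=' only, so values containing '=' stay intact.
        var separator = trimmed.IndexOf('=');
        if (separator <= 0)
        {
            continue;
        }
        result.Add(new KeyValuePair<string, string>(
            trimmed.Substring(0, separator),
            trimmed.Substring(separator + 1)));
    }
    return result;
}
```

A service that does not own any of the keys can still forward the header untouched, which is what treating the value as an opaque string amounts to.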

Traceparent Header

Of the two headers, traceparent can be considered the primary one, while tracestate complements it with vendor-specific information. The traceparent value consists of four parts:

  • version

  • traceId

  • parentId

  • flags

An example value of the traceparent header:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01

The version is a single byte; at the time of writing this post its value is 00. The specification states that future versions will only make additive changes to traceparent, and it describes how a traceparent value should be parsed in a forward-compatible manner.

The traceId is a 16-byte array identifying the whole trace. Typically a user request has one unique traceId, which is shared by all subsequent spans of that request. A new traceId is generated for operations that arrive without one or when tracing is restarted. The traceId is serialized as 32HEXDIGLC, that is, a 32-character string consisting of digits and lowercase hex characters [a-f]. The traceId may be parsed as a Guid in C#.

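The parentId is an 8-byte array, serialized as 16 lowercase hex characters. In most tracing systems this is known as a spanId. In .NET these Ids are available on the current Activity both as a composite Id (the Id property) and as separate Ids (the TraceId, SpanId and ParentSpanId properties).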

The trace flags form the fourth section of the traceparent value. With these flags the caller may pass suggestions about the current trace context to the callee. The value shall be parsed as a bit field. Currently a single flag is defined, the sampled flag; the other bits are reserved. When set, the sampled flag means that the caller may have recorded the trace data. As an example, a service may decide to sample a trace context only when the caller has denoted that the given trace is recorded. OpenTelemetry ships with four sampler implementations. One of them, the ParentBasedSampler type, uses this flag: it takes a sample if the parent Activity or any linked Activity is recorded.
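The four parts described above can be pulled apart with a few lines of code. The following is a minimal sketch, not the parser ASP.NET Core uses: TryParseTraceparent and IsLowerHex are hypothetical helpers, the sketch accepts version 00 only (so it ignores the forward-compatibility rules), and it assumes a C# 9 or later project where top-level statements are available.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

const string example = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01";

if (TryParseTraceparent(example, out var parsedTraceId, out var parsedParentId, out var parsedFlags))
{
    Console.WriteLine($"traceId:  {parsedTraceId.ToHexString()}");
    Console.WriteLine($"parentId: {parsedParentId.ToHexString()}");
    // The sampled flag is the lowest bit of the flags byte.
    bool sampled = (parsedFlags & ActivityTraceFlags.Recorded) != 0;
    Console.WriteLine($"sampled:  {sampled}");
}

// Hypothetical helper: true when the value has the given length and
// contains only digits and lowercase a-f.
static bool IsLowerHex(string value, int length)
{
    if (value.Length != length)
    {
        return false;
    }
    foreach (var c in value)
    {
        bool isHex = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
        if (!isHex)
        {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: parses a version 00 traceparent value.
static bool TryParseTraceparent(
    string value,
    out ActivityTraceId traceId,
    out ActivitySpanId parentId,
    out ActivityTraceFlags flags)
{
    traceId = default;
    parentId = default;
    flags = ActivityTraceFlags.None;

    if (string.IsNullOrEmpty(value))
    {
        return false;
    }

    // Version 00 has exactly four dash-separated fields.
    var parts = value.Split('-');
    if (parts.Length != 4 || parts[0] != "00")
    {
        return false;
    }
    if (!IsLowerHex(parts[1], 32) || !IsLowerHex(parts[2], 16) || !IsLowerHex(parts[3], 2))
    {
        return false;
    }
    // The specification treats all-zero trace and parent Ids as invalid.
    if (parts[1].Trim('0').Length == 0 || parts[2].Trim('0').Length == 0)
    {
        return false;
    }

    traceId = ActivityTraceId.CreateFromString(parts[1].AsSpan());
    parentId = ActivitySpanId.CreateFromString(parts[2].AsSpan());
    flags = (ActivityTraceFlags)byte.Parse(
        parts[3], NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
    return true;
}
```

The sketch uses the ActivityTraceId and ActivitySpanId types rather than Guid or raw byte arrays, because these are the types the Activity class itself exposes.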

Using W3C TraceContext with ASP.NET Core and OpenTelemetry

To use the W3C Id format in a .NET application the DefaultIdFormat static property may be set during application startup:

System.Diagnostics.Activity.DefaultIdFormat = System.Diagnostics.ActivityIdFormat.W3C;

With the above setting ASP.NET Core tries to parse incoming request headers and create a new Activity with the parsed values. An Activity is a type crafted specifically for capturing tracing data about an operation. It has a static Current property and a (non-static) Parent property. When a new Activity is created, the parent-child relationship is maintained by the Activity itself. An Activity exposes and maintains properties for all aspects of a trace context: TraceId, TraceStateString, Recorded, SpanId, and so on. An Activity also measures a given operation: when it is started the current timestamp is recorded, and when it is stopped a Duration is calculated. Using an Activity we may measure custom operations in our source code and integrate them directly with the diagnostics provided by the BCL.

By default ASP.NET Core creates an Activity for every incoming request. When traceparent and tracestate headers are provided, the Activity is created with a null Parent, because the parent lives in another process, but the TraceId and the caller's span Id (exposed as ParentSpanId) are captured from the incoming request. An Activity and its child activities preserve the operation context throughout the request, while each new Activity gets its own span Id. Activities use AsyncLocal<T> to propagate the current context and the hierarchy of parent-child activities across threads.

Outgoing HTTP requests also create a new Activity, which at this point should be a child in the existing hierarchy. The outgoing W3C traceparent and tracestate headers are populated with the Ids of this Activity.
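The behavior described above can be reproduced in a small console program. This is a sketch under assumptions, not ASP.NET Core's actual request pipeline: the operation names are made up, the traceparent value is the example from earlier in this post, and the incoming request is simulated by calling SetParentId by hand.

```csharp
using System;
using System.Diagnostics;

Activity.DefaultIdFormat = ActivityIdFormat.W3C;

// Simulates an incoming request that carries a traceparent header.
var incoming = new Activity("IncomingRequest"); // hypothetical operation name
incoming.SetParentId("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01");
incoming.Start();

// There is no in-process parent object, yet the trace context is preserved.
Console.WriteLine($"Parent is null: {incoming.Parent is null}");
Console.WriteLine($"TraceId:        {incoming.TraceId.ToHexString()}");
Console.WriteLine($"ParentSpanId:   {incoming.ParentSpanId.ToHexString()}");
Console.WriteLine($"SpanId (new):   {incoming.SpanId.ToHexString()}");
Console.WriteLine($"Composite Id:   {incoming.Id}");

// A custom operation started while 'incoming' is Activity.Current becomes its child.
var child = new Activity("CustomOperation").Start(); // hypothetical operation name
Console.WriteLine($"Child of incoming:  {ReferenceEquals(child.Parent, incoming)}");
Console.WriteLine($"Same TraceId:       {child.TraceId.Equals(incoming.TraceId)}");
Console.WriteLine($"Parent span linked: {child.ParentSpanId.Equals(incoming.SpanId)}");

// Stopping calculates the Duration and restores the parent as Activity.Current.
child.Stop();
Console.WriteLine($"Child duration:     {child.Duration}");
Console.WriteLine($"Current restored:   {ReferenceEquals(Activity.Current, incoming)}");

incoming.Stop();
```

On .NET Framework the Activity type comes from the System.Diagnostics.DiagnosticSource NuGet package, so the sketch assumes that package is referenced there.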

There are two major ways to manually create an Activity:

  • New up an instance using the constructor of the Activity class, and pass a name for the current operation.

  • Use the ActivitySource type to create a new Activity.
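The two approaches listed above can be sketched side by side. The source name "Demo.Orders" and the operation names are hypothetical, and the listener below is a bare-bones stand-in for what a tracing SDK would register. ActivitySource and ActivityListener require version 5.0 or later of System.Diagnostics.DiagnosticSource.

```csharp
using System;
using System.Diagnostics;

// 1. Constructor: always yields an Activity, whether or not anyone is listening.
var manual = new Activity("ManualOperation").Start();
Console.WriteLine($"Manual activity created: {manual.OperationName}");
manual.Stop();

// 2. ActivitySource: creation depends on whether a listener is interested.
var source = new ActivitySource("Demo.Orders"); // hypothetical source name

// Without a listener, StartActivity returns null and no Activity is allocated.
var unobserved = source.StartActivity("Unobserved");
Console.WriteLine($"Without listener, activity is null: {unobserved is null}");

// A minimal listener that samples everything from the hypothetical source.
var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Orders",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

// With the listener registered, the same call now returns a started Activity.
var observed = source.StartActivity("Observed");
Console.WriteLine($"With listener, activity is null: {observed is null}");
observed?.Stop();
```

Because StartActivity may return null, callers of an ActivitySource typically use the null-conditional operator (as with observed?.Stop() here) when touching the result.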

Conclusion

In this post I looked into the W3C Trace Context standard and described the four parts of the traceparent header value. I also introduced the Activity type to show how .NET leverages it to trace operations. In the next post I will dive into creating and recording Activities with either of the above methods, as the way an Activity is created affects how it may be recorded and how OpenTelemetry records it. This is a key decision to make during the implementation of APM.