Introduction to APM 1
08/15/2021
6 minutes
In this series of posts I will look into how we can implement an APM solution for .NET applications. Application Performance Management (APM) helps to monitor and diagnose application performance. There are numerous libraries and tools available to solve this problem. This series focuses on implementing APM for .NET applications (.NET Framework 4.6.1 and above, .NET Core and .NET 5) with OpenTelemetry.
The series covers the following topics, starting with the first one in this post:
W3C correlation Id specification
Creating and recording spans with ActivitySource
Using OpenTelemetry and Jaeger
W3C correlation Id specification
The CorrelationId is defined in the Trace Context W3C specification. In this post I explain the structure of the defined headers and introduce how the ASP.NET Core implementation uses these headers.
The W3C specification for trace context defines two headers: traceparent and tracestate.
Tracestate Header
The tracestate header carries vendor-specific information and complements the traceparent header. According to the specification, vendors may pass custom state in this header: the value is a comma-separated list of key-value pairs, and each vendor may add its own entry to the list. An example value of the tracestate header:
tracestate: congo=t61rcWkgMzE
Because the entries are vendor-defined, a given system may be unable to parse the tracestate value, or may only be able to parse part of it. The current ASP.NET Core implementation does not add custom logic for tracestate. On each incoming request it reads the header value as an opaque string and sets it on the Activity spanning the current HTTP request. Correspondingly, outgoing HTTP requests attach the tracestate value with the help of the diagnostics delegating handler.
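To illustrate the list structure of the header, here is a minimal sketch that splits a tracestate value into its key-value entries. The ParseTraceState helper is hypothetical and is not part of ASP.NET Core, which treats the value as an opaque string as described above. The two-entry sample value is the one used as an example in the W3C specification.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a framework API): splits a tracestate value
// into key-value entries. Entries that do not look like key=value are
// skipped, since a system may not understand every vendor's entry.
static List<KeyValuePair<string, string>> ParseTraceState(string header)
{
    var entries = new List<KeyValuePair<string, string>>();
    if (string.IsNullOrWhiteSpace(header)) return entries;

    foreach (var member in header.Split(','))
    {
        var trimmed = member.Trim();
        var separator = trimmed.IndexOf('=');

        // Require a non-empty key and a non-empty value.
        if (separator <= 0 || separator == trimmed.Length - 1) continue;

        entries.Add(new KeyValuePair<string, string>(
            trimmed.Substring(0, separator),
            trimmed.Substring(separator + 1)));
    }

    return entries;
}

var parsed = ParseTraceState("rojo=00f067aa0ba902b7,congo=t61rcWkgMzE");
foreach (var entry in parsed)
{
    Console.WriteLine($"{entry.Key} -> {entry.Value}");
}
```

A service that only propagates the header does not need this kind of parsing. It only becomes relevant when a vendor wants to read or update its own entry.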
Traceparent Header
Of the two headers, traceparent can be considered the primary one, while tracestate complements it with vendor-specific information. The traceparent value consists of four parts:
version
traceId
parentId
flags
An example value of the traceparent header:
traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
The version is a single byte, serialized as two hex digits. At the time of writing this post, the value is 00. The specification states that future versions will only make additive changes to traceparent, and it specifies how a traceparent value should be parsed in a forward-compatible manner.
The traceId is a 16-byte array that identifies the whole trace. Typically a user request has one unique traceId, which is shared by all subsequent spans of that request. New traceIds are generated for operations without a traceId or when tracing is restarted. The traceId is serialized as 32HEXDIGLC, meaning a 32-character string of lowercase hexadecimal digits (0-9 and a-f). The traceId may be parsed as a Guid in C#.
The parentId is an 8-byte array, serialized as 16 lowercase hex characters. In most tracing systems this is known as a spanId. In .NET these Ids are available both as a composite Id and as separate Ids on the current Activity.
The trace flags are the fourth part of the traceparent value. With these flags the caller may pass suggestions about the current trace context to the callee. The value shall be parsed as a bit field. Currently a single flag is defined, while the other bits are reserved. The sampled flag, when set, means that the caller may have recorded the trace data. As an example, a service may decide to sample a trace context only when the call site has indicated that the given trace is recorded. OpenTelemetry ships with four sampler implementations. One of them, the ParentBasedSampler type, uses this flag: it takes a sample if the parent Activity or any linked Activity is recorded.
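The four parts described above can be pulled apart with a few lines of code. The following is a minimal sketch that assumes a version 00 value. The ParseTraceParent and IsLowerHex helpers are hypothetical names introduced for illustration, and the forward-compatible handling of newer versions that the specification describes is deliberately left out.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: parses a version 00 traceparent value.
// Returns null when the value does not match the expected layout.
static (string Version, string TraceId, string ParentId, byte Flags)? ParseTraceParent(string header)
{
    if (header == null) return null;

    // Layout: version(2) - traceId(32) - parentId(16) - flags(2)
    var parts = header.Split('-');
    if (parts.Length != 4) return null;
    if (parts[0] != "00") return null; // this sketch only understands version 00

    if (!IsLowerHex(parts[1], 32) || !IsLowerHex(parts[2], 16) || !IsLowerHex(parts[3], 2))
    {
        return null;
    }

    // The specification treats all-zero traceId and parentId values as invalid.
    if (parts[1].Trim('0').Length == 0 || parts[2].Trim('0').Length == 0) return null;

    var flags = byte.Parse(parts[3], NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    return (parts[0], parts[1], parts[2], flags);
}

// Checks for the given length and for digits plus lowercase a-f only.
static bool IsLowerHex(string text, int expectedLength)
{
    if (text.Length != expectedLength) return false;
    foreach (var c in text)
    {
        var isHex = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
        if (!isHex) return false;
    }
    return true;
}

var result = ParseTraceParent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01");
if (result.HasValue)
{
    // Bit 0 of the flags field is the sampled flag.
    var sampled = (result.Value.Flags & 0x01) != 0;
    Console.WriteLine($"traceId={result.Value.TraceId} parentId={result.Value.ParentId} sampled={sampled}");
}
```

In application code there is rarely a need to parse the header by hand, because ASP.NET Core does it for incoming requests. On .NET 5 and later, ActivityContext.TryParse should offer comparable parsing out of the box.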
Using W3C TraceContext with ASP.NET Core and OpenTelemetry
To use the W3C Id format in a .NET application the DefaultIdFormat static property may be set during application startup:
System.Diagnostics.Activity.DefaultIdFormat = System.Diagnostics.ActivityIdFormat.W3C;
With the above setting ASP.NET Core tries to parse incoming request headers and creates a new Activity with the parsed values. An Activity is a type designed for the purpose of capturing metrics. It has a static Current property and a (non-static) Parent property. When a new Activity is started, the parent-child relationship is maintained by the Activity itself. An Activity exposes and maintains properties for all aspects of a trace context, such as TraceId, TraceStateString, Recorded and SpanId. An Activity also measures a given operation: when it is started the current timestamp is recorded, and when it is stopped a Duration is calculated. Using an Activity we may measure custom operations in our source code and integrate them directly with the metrics provided by the BCL.
By default ASP.NET Core creates an Activity for every incoming request. When traceparent and tracestate headers are provided, the Activity is created with a null Parent, but its TraceId and ParentSpanId are taken from the incoming request, and it receives a new SpanId of its own. An Activity and its child activities preserve the operation context throughout a request, while each of them has its own span Id. Activities use AsyncLocal<T> to propagate the current context and the hierarchy of parent-child activities across threads.
Outgoing HTTP requests also create a new Activity, which at this point should be a child in the given hierarchy. The outgoing W3C traceparent and tracestate headers are populated with the Ids of this Activity.
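The behavior described above can be observed in a small console program. This is a sketch that simulates what the hosting layer does, not the actual ASP.NET Core code: the operation names are illustrative, and the header values are the examples used earlier in this post.

```csharp
using System;
using System.Diagnostics;

Activity.DefaultIdFormat = ActivityIdFormat.W3C;

// Simulate the Activity created for an incoming request that carries
// the example traceparent and tracestate headers.
var incoming = new Activity("IncomingRequest"); // illustrative operation name
incoming.SetParentId("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01");
incoming.TraceStateString = "congo=t61rcWkgMzE";
incoming.Start();

// There is no in-process parent object, yet the trace context is captured.
Console.WriteLine($"Parent is null: {incoming.Parent == null}");
Console.WriteLine($"TraceId:        {incoming.TraceId.ToHexString()}");
Console.WriteLine($"ParentSpanId:   {incoming.ParentSpanId.ToHexString()}");
Console.WriteLine($"SpanId:         {incoming.SpanId.ToHexString()}"); // newly generated

// A nested operation picks up Activity.Current as its parent when started.
var child = new Activity("NestedOperation").Start();
Console.WriteLine($"Child parent is incoming: {ReferenceEquals(child.Parent, incoming)}");
Console.WriteLine($"Child shares TraceId:     {child.TraceId.ToHexString() == incoming.TraceId.ToHexString()}");

child.Stop();    // Activity.Current flows back to the parent
incoming.Stop(); // Duration is calculated on stop
Console.WriteLine($"Duration: {incoming.Duration}");
```

The child keeps the TraceId of the request while getting its own SpanId, which is what allows a tracing backend to rebuild the hierarchy later.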
There are two major ways to manually create an Activity:
New up an instance using the constructor of the Activity class, passing a name for the current operation.
Use the ActivitySource type to create a new Activity.
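The two options can be sketched side by side as follows. The source name Demo.Source and the operation names are hypothetical. The sketch assumes the ActivitySource API is available, which on .NET Framework and .NET Core 3.x means referencing a 5.0 or later version of the System.Diagnostics.DiagnosticSource package. Note that an ActivitySource only creates an Activity when a listener is interested in it.

```csharp
using System;
using System.Diagnostics;

// Option 1: use the constructor, then start and stop explicitly.
var manual = new Activity("ManualOperation");
manual.Start();
manual.Stop();

// Option 2: use an ActivitySource. Without a listener, StartActivity returns null.
var source = new ActivitySource("Demo.Source"); // hypothetical source name
var beforeListener = source.StartActivity("Ignored");
Console.WriteLine($"Created without listener: {beforeListener != null}");

// A listener opts in to the source and decides how much data to collect.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Source",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
};
ActivitySource.AddActivityListener(listener);

var fromSource = source.StartActivity("SourceOperation");
Console.WriteLine($"Created with listener: {fromSource != null}");
fromSource?.Stop();
```

In a real application the listener is usually registered by a library such as OpenTelemetry rather than by hand.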
Conclusion
In this post I looked into the W3C Trace Context standard and described the four parts of the traceparent header value. I also introduced the Activity type to show how .NET leverages it to trace operations. In the next post I will dive into creating and recording Activities with either of the above methods, as the way an Activity is created affects how it may be recorded, and how OpenTelemetry records it. This is a key decision to make during the implementation of APM.