Introduction to APM

In this series of posts I look at how to implement an APM solution for .NET applications. Application Performance Management (APM) helps to monitor and diagnose application performance. Numerous libraries and tools address this problem. This series focuses on implementing APM for .NET applications (.NET Framework 4.6.1 and above, .NET Core and .NET 5) with OpenTelemetry.

The series covers the following topics. The current post looks into using OpenTelemetry and Jaeger.

  • W3C correlation Id specification

  • Creating and recording spans with ActivitySource

  • Using OpenTelemetry and Jaeger

OpenTelemetry

OpenTelemetry is a collection of tools and APIs to instrument, collect, and export telemetry data about a software's performance and behavior.

This post gives a quick overview of how we can use the OpenTelemetry package to collect diagnostics from ActivitySources and push them to Jaeger in a .NET application.

First, add a NuGet reference to the OpenTelemetry package:

<PackageReference Include="OpenTelemetry" Version="1.0.1" />

Second, configure a tracer during application startup. Create a new TracerProvider field named tracer and set its value as follows:

tracer = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(ResourceBuilder.CreateEmpty().AddService("ConsoleApp"))
    .SetSampler(new ParentBasedSampler(new AlwaysOnSampler()))
    .AddSource("ConsoleApp2.*")
    .Build();

Sdk.CreateTracerProviderBuilder() creates a default tracer provider builder, which the following calls customize. SetResourceBuilder adds the current application to the traces as a service named ConsoleApp. This is the name that represents the application in trace monitoring systems.

SetSampler sets a sampler. A sampler decides which activities or traces should be sampled. The W3C Correlation Id carries a flag denoting whether the span was marked as recorded by upstream services. The code above uses a mixed strategy: when an activity has a parent, it follows the parent's recorded flag; root activities without a parent go to an always-on sampler, meaning all of them are sampled. Another approach is to sample activities based on a predefined ratio, for example recording 30% of them.
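The ratio-based alternative could look like the sketch below. It assumes the TraceIdRatioBasedSampler type from the OpenTelemetry package; the CreateSampler helper and the 0.3 ratio are hypothetical and only serve as an illustration.

```csharp
using System;
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public static class Program
{
    // Hypothetical helper: follows the parent's recorded flag when a parent exists,
    // and samples the given ratio of root traces otherwise.
    public static Sampler CreateSampler(double ratio)
    {
        return new ParentBasedSampler(new TraceIdRatioBasedSampler(ratio));
    }

    public static void Main()
    {
        using (TracerProvider tracer = Sdk.CreateTracerProviderBuilder()
            .SetResourceBuilder(ResourceBuilder.CreateEmpty().AddService("ConsoleApp"))
            .SetSampler(CreateSampler(0.3)) // record roughly 30% of root traces
            .AddSource("ConsoleApp2.*")
            .Build())
        {
            Console.WriteLine("Tracer configured with a ratio-based sampler.");
        }
    }
}
```

Keeping the ParentBasedSampler wrapper means a trace that an upstream service already decided to record stays complete, while the ratio only applies to traces that start in this application.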

AddSource() selects which ActivitySources the application listens to. It accepts names with wildcards. In the example above, every activity source whose name starts with "ConsoleApp2." is recorded. A good naming strategy for ActivitySources makes it easier to filter activities. For example, a hierarchical pattern similar to class namespaces makes it possible to record activities in certain modules only. Personally, I find the FullName of a class a good choice for an ActivitySource name, as it gives an easy way to filter sources. Third-party libraries can also be added as sources using their naming pattern.

ASP.NET Core also creates activities through an ActivitySource, which OpenTelemetry can listen to as well. Those activities are created by a source named Microsoft.AspNetCore.Hosting. Finding the right sources to listen to gives developers a head start, so they can focus on monitoring the domain-related operations.
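As a minimal sketch of this naming strategy, the class below names its ActivitySource after its own FullName. The ConsoleApp2.Orders namespace, the OrderProcessor class and the tag are hypothetical and only illustrate the convention.

```csharp
using System;
using System.Diagnostics;

namespace ConsoleApp2.Orders
{
    // Hypothetical class, used only to illustrate the naming convention.
    public class OrderProcessor
    {
        // typeof(OrderProcessor).FullName is "ConsoleApp2.Orders.OrderProcessor",
        // which the "ConsoleApp2.*" filter matches.
        private static readonly ActivitySource Source =
            new ActivitySource(typeof(OrderProcessor).FullName);

        public static string SourceName => Source.Name;

        public void Process()
        {
            // StartActivity returns null when nothing listens to this source,
            // hence the null-conditional call below.
            using (Activity activity = Source.StartActivity("ProcessOrder"))
            {
                activity?.SetTag("order.id", 42);
            }
        }
    }
}
```

With this convention a narrower filter such as AddSource("ConsoleApp2.Orders.*") would record the activities of the orders module only.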

Finally, dispose the tracer on application shutdown to flush all recorded spans to the collector.
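One possible way to tie the startup and shutdown steps together in a console application is sketched below. The source name ConsoleApp2.Program and the activity name are assumptions for the sake of the example; a hosted application would rather dispose the provider from its own shutdown hook.

```csharp
using System;
using System.Diagnostics;
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public static class Program
{
    // Hypothetical source name, chosen so that it matches the "ConsoleApp2.*" filter.
    private static readonly ActivitySource Source = new ActivitySource("ConsoleApp2.Program");

    private static TracerProvider tracer;

    public static void Main()
    {
        tracer = Sdk.CreateTracerProviderBuilder()
            .SetResourceBuilder(ResourceBuilder.CreateEmpty().AddService("ConsoleApp"))
            .SetSampler(new ParentBasedSampler(new AlwaysOnSampler()))
            .AddSource("ConsoleApp2.*")
            .Build();

        try
        {
            using (Activity activity = Source.StartActivity("Main"))
            {
                Console.WriteLine("Doing some work...");
            }
        }
        finally
        {
            // Disposing the provider shuts it down and flushes pending spans
            // to the registered exporters.
            tracer.Dispose();
        }
    }
}
```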

Jaeger

To correlate, visualize and analyze all the captured spans we need a tool. Many tools are available for the job, such as Jaeger, Zipkin and Application Insights. Most of them support the W3C correlation Id standard and OpenTelemetry. In this post I will show how Jaeger can be set up as an exporter.

One can even create a custom exporter to export data to log files or to any custom downstream system.
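A custom exporter could be sketched as follows, assuming the BaseExporter<Activity> base class and the SimpleActivityExportProcessor type from the OpenTelemetry package. The ConsoleLineExporter class, the RunDemo helper and the source name are hypothetical; a real exporter would write to a log file or forward the data to the downstream system.

```csharp
using System;
using System.Diagnostics;
using OpenTelemetry;
using OpenTelemetry.Trace;

// Hypothetical exporter that writes one line per finished activity to the console.
public class ConsoleLineExporter : BaseExporter<Activity>
{
    public int ExportedCount { get; private set; }

    public override ExportResult Export(in Batch<Activity> batch)
    {
        foreach (Activity activity in batch)
        {
            Console.WriteLine(
                $"{activity.TraceId} {activity.DisplayName} {activity.Duration.TotalMilliseconds} ms");
            ExportedCount++;
        }

        return ExportResult.Success;
    }
}

public static class Program
{
    // Records a single activity and returns how many activities were exported.
    public static int RunDemo()
    {
        var exporter = new ConsoleLineExporter();

        using (var source = new ActivitySource("ConsoleApp2.Demo"))
        using (Sdk.CreateTracerProviderBuilder()
            .AddSource("ConsoleApp2.*")
            .AddProcessor(new SimpleActivityExportProcessor(exporter))
            .Build())
        {
            // The activity is handed to the exporter when it ends.
            using (source.StartActivity("DemoOperation"))
            {
            }
        }

        return exporter.ExportedCount;
    }

    public static void Main()
    {
        Console.WriteLine($"Exported activities: {RunDemo()}");
    }
}
```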

First add a new NuGet package to the solution:

<PackageReference Include="OpenTelemetry.Exporter.Jaeger" Version="1.0.1" />

To enable the Jaeger exporter, update the tracer provider builder:

.AddSource("ConsoleApp2.*")
.AddJaegerExporter(options =>
{
  options.ExportProcessorType = ExportProcessorType.Simple;
  options.AgentHost = "my-jaeger-instance.northeurope.azurecontainer.io";
})
.Build();

This sample sets up a Simple export processor, which exports each span as soon as it ends. In a production application a Batch export processor is usually more efficient. With the batch processor we can fine-tune how often spans are exported and how large the exported batches are.
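A batch configuration might look like the sketch below. It assumes that JaegerExporterOptions exposes a BatchExportProcessorOptions property in the package version used here; the ConfigureJaeger helper, the localhost address and the numeric values are illustrative rather than recommended settings.

```csharp
using System;
using System.Diagnostics;
using OpenTelemetry;
using OpenTelemetry.Exporter;
using OpenTelemetry.Trace;

public static class Program
{
    // Hypothetical helper that switches the Jaeger exporter to batch processing.
    public static void ConfigureJaeger(JaegerExporterOptions options)
    {
        options.ExportProcessorType = ExportProcessorType.Batch;
        options.BatchExportProcessorOptions = new BatchExportProcessorOptions<Activity>
        {
            ScheduledDelayMilliseconds = 2000, // how often a batch is exported
            MaxExportBatchSize = 256,          // upper bound on spans per batch
            MaxQueueSize = 2048,               // spans buffered before new ones are dropped
        };
        options.AgentHost = "localhost"; // assumed address of the Jaeger agent
        options.AgentPort = 6831;
    }

    public static void Main()
    {
        using (TracerProvider tracer = Sdk.CreateTracerProviderBuilder()
            .AddSource("ConsoleApp2.*")
            .AddJaegerExporter(ConfigureJaeger)
            .Build())
        {
            Console.WriteLine("Tracer configured with a batching Jaeger exporter.");
        }
    }
}
```

A longer delay and larger batches reduce the number of network calls, at the cost of spans appearing later in Jaeger and more spans being lost if the process crashes before a flush.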

We can also point AgentHost and AgentPort to a custom address where the Jaeger service is running. In this example I run Jaeger in Azure Container Instances for testing purposes, hence I updated the AgentHost property. Jaeger listens on several ports; the default agent port is 6831, and the UI can be accessed on port 16686. With a local installation, http://localhost:16686 serves the UI for analyzing the data.

To host Jaeger as a container instance, one can create a new instance from the jaegertracing/all-in-one image. For production use cases, however, the endpoints should be secured, and the in-memory storage should be replaced with a persistent storage solution.

Conclusion

In the previous posts I looked into the W3C Correlation Id standard and into creating and listening to activities. In this post I introduced the OpenTelemetry SDK to filter, record and export activities as spans. This should give a good overview of how to get started with APM for .NET applications.