Using Local LLM with .NET

In this post I explore how one can run a LLM locally to create a chat assistant with .NET 8. I use two libraries:

There is an excellent blog post on Demystifying Retrieval Augmented Generation with .NET by Stephen Toub describing how to use the SemanticKernel library to create a chat agent.

LLamaSharp is a library that can run local LLaMA/GPT model easily and fast in C#. It uses llama.cpp under the hood. I use version 10.0.0 along with semantic kernel version 1.3.0.

Find out more


Range Kata

I have recently come across the Range kata. In this post I dive into a variation of this kata using integer numbers. One implementation is A Range kata implementation in C# by Mark Seemann focuses on property testing and comparing Haskell, F# and C# implementations.

The referenced post has chosen using Church encoding as the implementation which results in a more complicated code in C#. In this post I will focus on building on the recently added features of .NET 8: IBinaryInteger<T>.

My test cases cover the samples described by the kata. Although I don't find these test cases extensive, they give a good enough starting point. That said, do not expect the code below to handle edge-cases that has no test case detailed in the kata.

This implementation utilizes the following relatively new C# features:

Find out more


Source Generated JSON Serialization using fast path in ASP.NET Core

In this post I explore how ASP.NET Core works together with the source generated JSON serializer. In .NET 8 the streaming JSON serialization can be enabled to use serialization-optimization mode with ASP.NET Core. However, using the default settings does not enable this fast-path. In this post I will explore how to enable the fast-path serialization by exploring the inner workings of JsonTypeInfo.cs. For this post I use .NET 8 with the corresponding ASP.NET Core release.

The .NET 8 runtime comes with three built-in JSON serializer modes:

  • Reflection

  • Source generation (Metadata-based mode)

  • Source generation (Serialization-optimization mode)

Reflection mode is the default one. This mode is used when someone invokes JsonSerializer.(De)SerializeAsync without additional arguments passed. Source Generation (Metadata-based mode) as the name suggests generates source code at compile time that contains the metadata that the Reflection would need to discover during its first invocation. This mode helps to enable JSON serialization in environments where reflection otherwise would not be possible, for example in case of AOT published applications. Source generation (Serialization-optimization mode) also generates code at compile time, but besides the metadata it also uses a generated fast-path method when serializing objects.

Find out more


Exploring DATAS

.NET 8 introduced a new garbage collection feature: Dynamic Adaptation To Application Sizes or DATAS. The inner workings of it are detailed in this post.

This feature tunes the GC to adjust the number of heaps and execute compacting garbage collections more often to have a smaller overall heap size. The result should be a heap size that more closely resembles the actual amount of memory that the application uses.

This feature is great for memory constrained environments or with uneven loads, where memory may be reclaimed between the loads.

In this post I will enable DATAS on my test server application and test client side application to observe the differences in the memory utilization.

Find out more