Allocation free algorithms

With latest C# we have a handful of new constructs to create more performant applications.

C# Challenge: allocation free algorithms

When talking 'allocation free' we mean heap allocations, while on the stack memory is still going to be allocated similarly as before.

Why is this important?

Allocations are relatively inexpensive in .Net Core/Framework. Of course, there is some overhead associated with, but eventually a pointer is incremented within Gen0, when we talk about reference types.

When memory is being deallocated, the Garbage Collector is there to help the developers. But it is not for free, a full GC can take up a lot of resources, and even the application's execution might be frozen for a short period of time. This is usually negligible in most application, but for some low level algorithms not running in nanosecond time frame can hurt performance. Typical example here is a router forwarding millions of packages in each second.

Checksum

In this post I will take a look at one allocation free algorithm written in C# : Calculate the checksum of FIX messages.

The Financial Information eXchange protocol is an electronic communications protocol for real-time exchange of information related to the securities transactions and markets.

We will validate the value of 'tag 10' which is the last tag of the message. It is composed of three characters by summing the ASCII value of all characters in the rest of the message, and performing modulo on the result.

We are given a byte[] or a Span<T>, and we return a bool indicating if the checksum is correct.

public bool IsValid(ReadOnlySpan data)
{
  var endingTagPos = msgContext.ChecksumTagStartIndex;
  if(endingTagPos < 0 || (endingTagPos + KnownFixTags.Checksum.Length + ChecksumLength + 1) != data.Length)
  {
    return false;
  }

  int sum = 1;
  for(int i = 0; i < endingTagPos; i++)
  {
    sum += data[i];
  }
  int expectedChecksum = sum % Modulus;

  Span<byte> expectedDigits = stackalloc byte[ChecksumLength];
  _converter.Convert(number: expectedChecksum, into: expectedDigits, count: ChecksumLength);
  var receivedChecksum = data.Slice(endingTagPos + KnownFixTags.Checksum.Length, ChecksumLength);
  return receivedChecksum.SequenceEqual(expectedDigits);
}

In the first part of this method, we find the start of the checksum tag. ChecksumTag stores a pre-calculated value of the tag name in bytes. Once LastIndexOf returns, we validate if the tag exists (-1 returned if not found) and we validate that this is the last tag of the message.

The second part sums up all bytes in the rest of the application: from the beginning until tag 10's index. Then we calculate the modulo operation for the expected result. So far we had no issues on avoiding heap allocations.

The last part is the most tricky part, we need to compare the results. There are two options to do this:

  • We try to parse the message's checksum into an integer for comparison

  • We calculate the byte representation of the integer and compare this with the message

The first is easier to achieve but it is the slower solution. In the second it is very tempting to call ToString("000") on the integer, but this would allocate a string, which we want to avoid.

Instead, we allocate a byte array (a Span of bytes) on the stack. We can do this safely as

  • this is not a recursive method

  • we have a fixed length of bytes allocated every time

so we can minimize the possibility of stackoverflow.

The convert method is a helper class which uses pre-calculated ASCII byte representations of digits. It takes each digit of the expected checksum integer and sets the digits on expectedDigits Span.

Finally, we compare the byte sequence of the received message with the calculated one.