If it is not string

Introduction

In this post I look into the internals of the is not pattern, which was introduced in C# 9. This pattern allows code to be more expressive. Assume a method has an input parameter of type object that needs validation. The method may only work if the object is a string that is exactly two characters long. For example, we would like to validate that the string is a two-letter country code.

C# 9 allows is not patterns. Using this pattern, we can express the above example with the following code:

using System;
object o1 = "UK";
if (o1 is not string countryCode || countryCode.Length != 2)
    Console.WriteLine("Invalid country code");
else
    Console.WriteLine(countryCode);

Note that the actual country code is not being validated; we only check whether the string conforms to the expected format.
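The introduction talks about a method with an object input parameter, while the sample above uses top-level statements. As a minimal sketch, the same check could be wrapped in a method like the one below. The method name DescribeCountryCode and the sample inputs are hypothetical and only serve as illustration.

```csharp
using System;

Console.WriteLine(DescribeCountryCode("UK"));   // prints "UK"
Console.WriteLine(DescribeCountryCode("USA"));  // prints "Invalid country code"
Console.WriteLine(DescribeCountryCode(42));     // prints "Invalid country code"

// Hypothetical helper: returns the country code when the input is a
// two character long string, otherwise an error message.
static string DescribeCountryCode(object input)
{
    // Guard clause: rejects anything that is not a string of length two.
    if (input is not string countryCode || countryCode.Length != 2)
        return "Invalid country code";

    // countryCode is definitely assigned here, because the guard above
    // returned for every input that is not a string.
    return countryCode;
}
```

Written as a guard clause, the pattern variable stays usable after the if statement, which is one of the places where this style tends to pay off.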

IL Code

I used ILSpy to look at the Release build of the above sample. ILSpy is a great tool, as it also shows the C# code decompiled from the IL as comments next to the instructions.

// string text = "UK" as string;
IL_0000: ldstr "UK"
IL_0005: isinst [System.Runtime]System.String
IL_000a: stloc.0
// if (text == null || text.Length != 2)
IL_000b: ldloc.0
IL_000c: brfalse.s IL_0017

IL_000e: ldloc.0
IL_000f: callvirt instance int32 [System.Runtime]System.String::get_Length()
IL_0014: ldc.i4.2
IL_0015: beq.s IL_0022

// Console.WriteLine("Invalid country code");
IL_0017: ldstr "Invalid country code"
IL_001c: call void [System.Console]System.Console::WriteLine(string)
// }
IL_0021: ret

// Console.WriteLine(text);
IL_0022: ldloc.0
IL_0023: call void [System.Console]System.Console::WriteLine(string)
// (no C# code)
IL_0028: ret

Based on the IL, it is visible that the is not pattern still uses the isinst IL instruction to test the type of the object on top of the evaluation stack. In this case the string "UK" is loaded onto the evaluation stack by the first instruction. The rest of the code does what a developer would write with the as operator:

using System;
object o1 = "UK";
var countryCode = o1 as string;
if (countryCode == null || countryCode.Length != 2)
  Console.WriteLine("Invalid country code");
else
  Console.WriteLine(countryCode);

After the test against the string type, if the result is null, or it is a string but its length is not two, the code writes an error message to the console. Otherwise it writes out the country code. It is up to the reader to decide which style to prefer. With the is not pattern I like that an explicit null check is avoided, and that it is closer to how you would express the intention in natural language. On the other hand, I like less that the countryCode variable is declared within the pattern inside the if statement, which makes it more difficult to read (or we just need to train our eyes for this).
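The equivalence suggested by the IL can be spot-checked by running both styles over a handful of inputs, including null and a non-string object. This is only a sketch: the helper names IsInvalidWithIsNot and IsInvalidWithAs and the input list are made up for this illustration.

```csharp
using System;

// Sample inputs: a valid code, a too long string, an empty string,
// a non-string object and null.
object[] inputs = { "UK", "USA", "", 42, null };

foreach (object input in inputs)
{
    bool viaIsNot = IsInvalidWithIsNot(input);
    bool viaAs = IsInvalidWithAs(input);
    string label = input?.ToString() ?? "(null)";

    // Both styles are expected to agree for every input.
    if (viaIsNot != viaAs)
        throw new InvalidOperationException($"Mismatch for {label}");

    Console.WriteLine($"{label}: invalid = {viaIsNot}");
}

// The is not pattern: true for null, for non-strings and for wrong lengths.
static bool IsInvalidWithIsNot(object input) =>
    input is not string countryCode || countryCode.Length != 2;

// The as operator with an explicit null check.
static bool IsInvalidWithAs(object input)
{
    var countryCode = input as string;
    return countryCode == null || countryCode.Length != 2;
}
```

Only "UK" should be reported as valid; every other input, including null, is rejected by both variants without any additional null handling.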

Performance

Performance-wise, one would expect both variants to run equally fast, and the measurements seem to confirm this. Using BenchmarkDotNet to measure the two implementations, I do not see a significant difference.

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i5-1035G4 CPU 1.10GHz, 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.101
  [Host]     : .NET Core 5.0.1 (CoreCLR 5.0.120.57516, CoreFX 5.0.120.57516), X64 RyuJIT
  DefaultJob : .NET Core 5.0.1 (CoreCLR 5.0.120.57516, CoreFX 5.0.120.57516), X64 RyuJIT


|    Method |  o1 |      Mean |     Error |    StdDev |
|---------- |---- |----------:|----------:|----------:|
| IsPattern |  UK | 0.8238 ns | 0.0327 ns | 0.0305 ns |
| AsPattern |  UK | 0.8718 ns | 0.0367 ns | 0.0343 ns |
| IsPattern | USA | 0.5688 ns | 0.0399 ns | 0.0444 ns |
| AsPattern | USA | 0.5442 ns | 0.0416 ns | 0.0446 ns |

One interesting point is that the input with 3 characters runs faster than the 2 character version. Based on further measurements, the difference is due to the jump operation taken when the else branch of the if statement is hit.
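The benchmark source itself is not listed in this post. A sketch of a BenchmarkDotNet class that would produce a table of the same shape could look like the one below. The class name CountryCodeBenchmarks and the method bodies are assumptions; only the method names IsPattern and AsPattern and the o1 parameter with the values UK and USA are taken from the results table. It requires the BenchmarkDotNet NuGet package and a Release build.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<CountryCodeBenchmarks>();

// Hypothetical benchmark class; the bodies are assumed, not the original ones.
public class CountryCodeBenchmarks
{
    // Each benchmark method runs once per parameter value.
    [Params("UK", "USA")]
    public object o1;

    // Returns true when the input is NOT a two character long string.
    // Returning the result keeps the check from being optimized away.
    [Benchmark]
    public bool IsPattern() =>
        o1 is not string countryCode || countryCode.Length != 2;

    // The same check with the as operator and an explicit null check.
    [Benchmark]
    public bool AsPattern()
    {
        var countryCode = o1 as string;
        return countryCode == null || countryCode.Length != 2;
    }
}
```

Unlike the console samples, these methods return a bool instead of writing to the console, so that console I/O does not dominate the measurement. The timings from such a sketch are therefore not guaranteed to match the numbers above.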