Lookups with Switch Expressions
03/30/2024
7 minutes
In a recent PR I have optimized parsing known header values, such CacheControlHeaderValue
or MediaTypeHeaderValue
.
The original code used a Dictionary<>
to look up a Func<string, T?>
parsing method, with a string input and a generic type T
return type. In this blog post I will summarize the pros and cons of different alternatives that were considered as part of the optimization.
FrozenDictionary
This dictionary contains known Func<string, T?>
parsers. These known parsers do not change at runtime. Using a FrozenDictionary would bring performance benefits for the lookup operations. However, freezing the dictionary itself takes time at runtime, because when freezing the new dictionary builds an efficient lookup. This is usually worth the trade-off as freezing only needs to happen once during the lifetime of the application. However, in this case all the contents of this dictionary are known at compile time.
One suggestion could be to avoid the runtime costs by building the lookup data structure at compile time.
Switch
Using switch expression could move building the lookup data structure to compile time. C# compiler generates a decision tree to find the right parsing method based on the generic type parameter. Using switch expressions with Type Pattern Matching could an excellent tool. However, as only the return type of the method is generic, the C# 12 language does not allow pattern matching on this generic type parameter.
This means using the switch expression over the generic type parameter is not a feasible solution. Instead, the code could look up the name of the type with reflection and use switch statements build a decision tree based on the first character of the type name:
private static T? GetSwitch<T>(Dictionary<string, string> headers, string name) { var value = headers[name]; var tn = typeof(T).Name; object? temp = null; switch (tn[0]) { case 'C' when typeof(T) == typeof(CacheControlHeaderValue): temp = ParseCacheControlHeaderValue; break; case 'C' when typeof(T) == typeof(ContentDispositionHeaderValue): temp = ParseCacheContentDispositionHeaderValue; break; // Other cases } // Invokking the parser // returning the results }
Once the switch decides on the branch, we still need to compare the actual type of the generic type parameter with the known parser type. Otherwise, we would not be able to distinguish CacheControlHeaderValue
and ContentDispositionHeaderValue
or someone could create a type with the same name and different namespace. Also notice that switch statement uses a named method ParseCacheControlHeaderValue as a delegate instead of an anonymous method. This allows us to refer to the parser methods at compile time by its name.
If-Else
A suggestion has been raised by the .NET team to simply the proposal above by using if-else branches that compare actual type of the generic type parameter with the known types. This suggestion turns out to be beneficial for performance. The sample below shows the idea for the suggested code structure:
private static T? GetByTypeCast<T>(Dictionary<string, string> headers, string name) { var value = headers[name].ToString(); if (typeof(T) == typeof(CacheControlHeaderValue)) { return (T?)(object?)ParseCacheControlHeaderValue(value); } else if (typeof(T) == typeof(ContentDispositionHeaderValue)) { return (T?)(object?)ParseCacheContentDispositionHeaderValue(value); } // Other cases
Using the double casting is unavoidable to make the C# 12 compiler happy. More precisely it is avoidable only when the method delegates (like ParseCacheControlHeaderValue) are casted to Func<string, T>
before at the callsite.
Comparison
The codebase refers to 9 known parsers. Most of these parsers return a reference type result, except two parsers that return DateTime
and long
value. In .NET 8 the JIT got smarter and even in Tier 0 it can apply dead code elimination. Given the JIT compiles separate code for generic value types, but the same canon for reference types, it can eliminate some of the branches of the method in question. For example, invoking the method GetByTypeCast with a long
type parameter GetByTypeCast<long>(...)
will only contain the parsing logic for the case that handles the long
return types. Similarly, invoking GetByTypeCast<CacheControlHeaderValue>(...)
will result code that eliminates the branches dedicated for the value types.
Value Types
This is key when further looking into the code jitted for the value types. For example, when the generic type parameter is long
, the JIT can simplify the code further as shown below. The result assembly (machine) code of the method for the if-else solution is reduced to a simple type check and the parsing logic.
; Assembly listing for method BenchmarksAspNetCore:GetByTypeCast[System.Nullable`1[long]](System.Collections.Generic.Dictionary`2[System.String,System.String],System.String):System.Nullable`1[long] (Tier1) ; Emitting BLENDED_CODE for X64 with AVX - Windows ; Tier1 code ; optimized code ; rsp based frame ; partially interruptible ; PGO data available, profile data was all zero ; 0 inlinees with PGO data; 1 single block inlinees; 0 inlinees without PGO data G_M000_IG01: ;; offset=0x0000 push rbx sub rsp, 32 mov rbx, rcx G_M000_IG02: ;; offset=0x0008 mov rcx, rdx mov rdx, r8 cmp dword ptr [rcx], ecx call [System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon]:get_Item(System.__Canon):System.__Canon:this] cmp byte ptr [rax], al mov rcx, rbx mov rdx, rax call [BenchmarksAspNetCore:ParseCacheInt64(System.String):System.Nullable`1[long]] mov rax, rbx G_M000_IG03: ;; offset=0x0027 add rsp, 32 pop rbx ret ; Total bytes of code 45
The switch statement based implementation can also apply the dead code elimination optimization, however it still emits the code required for building the decision tree. Hence it results in a total of 347 bytes of code. At runtime it also needs to execute more code as it looks up the generic type parameters name using reflection and then walks through the decision tree:
call qword ptr [7FF807706DA8]; System.RuntimeType.GetCachedName(System.TypeNameKind) cmp dword ptr [rax+8],0 jbe near ptr M01_L09 movzx r14d,word ptr [rax+0C] cmp r14d,45 jbe short M01_L04 cmp r14d,4D je short M01_L00 cmp r14d,4E
Reference Types
The case for reference types is more interesting as less code can be eliminated, so the if-else based solution might need to evaluate more branch conditions. In the case of the if-else in the best case 1 and in the worst case 7 branch conditions are evaluated. In average that is 4 conditions.
In case the of the switch statements the code generates a lookup tree that evaluates in 2-4 conditions and an additional comparison of the type parameter and the known type. While in certain conditions this result less comparisons to the if-else solution, it comes at an extra cost of loading the type name of the generic type parameter and reading its first character:
call qword ptr [7FF8076E6DA8]; System.RuntimeType.GetCachedName(System.TypeNameKind) cmp dword ptr [rax+8],0 jbe near ptr M01_L23 movzx r15d,word ptr [rax+0C]
Conclusion
Eventually the if-else solution provides better overall performance characteristics, hence that solution has been chosen as the accepted implementation in the pull request.
Note, that this does not provide a general rule of thumb for deciding whether an if-else like solutions are better to switch statements or switch expressions. This investigation is very specific to the described case: comparing generic type parameters with known types in C# 12. The actual types of the known types are also important factor. Given all known types were reference types or value types, one or the other solution would easily swing the pendulum.
While the purpose of this post was not to give a rule of thumb, it offers the background to better understand and replicate such analysis in tight loops that also required to be optimized.