Heap Allocation free Parsing: structs
03/23/2019
10 minutes
I have been working on a library RapideFix to parse fix messages. One of the challenges is to parse a message into a struct with the least amount of allocations possible. That means to allocate only for the references (if any) that are part of the struct's fields.
For the sake of this post, I will use a very simple struct with 2 integer properties and the parser method will set values to them. However, the code should be dynamic, so the user can replace the struct to any other custom one.
Starting Point
So, let's start with a sample code, which is incorrect after all, but it demonstrates what I want to achieve (if it worked correctly).So what I want to do is that given a string with two integers separated by commas "20,19,"
, and the parser should set the integer values on a struct, Point
:
public struct Point { public int X { get; set; } public int Y { get; set; } }
And code for the demonstration:
public class Program { public static void Main(string[] args) { var parsedValue = Parse("20,19,"); Console.WriteLine(parsedValue.X); Console.WriteLine(parsedValue.Y); } public static Point Parse(ReadOnlySpan<char> input) { var result = new Point(); var separator = input.IndexOf(','); var x = int.Parse(input.Slice(0, separator)); var y = int.Parse(input.Slice(separator + 1)); SetValues(result, x, y); return result; } public static void SetValues(Point point, int x, int y) { point.X = x; point.Y = y; } }
First Attempt
There are many problems with it. First, it not generic at all. Secondly, it does not work as it prints out 0,0. Let's work on these issues through a series of changes to correct the above example.
As a first step, making it more generic (yes both ways, you will see).
public class Program { public static void Main(string[] args) { var parsedValue = Parse<Point>("20,19,", typeof(Point).GetProperties()); Console.WriteLine(parsedValue.X); Console.WriteLine(parsedValue.Y); } public static T Parse<T>(ReadOnlySpan<char> input, PropertyInfo[] properties) where T : struct { T result = Activator.CreateInstance<T>(); int separator; int propertySetterIndex = 0; while((separator = input.IndexOf(',')) != -1) { var parsedValue = int.Parse(input.Slice(0, separator)); SetValues(result, properties[propertySetterIndex++], parsedValue); input = input.Slice(separator + 1); } return result; } public static void SetValues(object data, PropertyInfo property, int value) { property.SetValue(data, value); } }
So how much better is this? First, the signature of the Parse method has changed, it has a generic type parameter and a second argument for the list of properties to be set. I also have changed the implementation to be more generic. I iterate over the input value by the commas and the given value is parsed to an integer and set with a property setter. (Yes, it could index out of the array when the length of the properties array is less than the comma separated values in the input, but I will not deal with this for the purpose of this post.)
Let's focus more on the way I set values in the SetValues
method. It has an object parameter because it could be any type, and it uses reflection to set the value. There are just too many problems with this, hence the rest of this post will focus on this method.
It will still print 0 and 0, but at least I can see the values are getting set in debug mode on the properties.
It is slow
It allocates memory on the heap
It is not truly generic
To prove my point, here is a simple benchmark using BenchmarkDotNet:
BenchmarkDotNet=v0.11.4, OS=Windows 10.0.17134.590 (1803/April2018Update/Redstone4), VM=Hyper-V Intel Xeon CPU E5-2673 v3 2.40GHz, 1 CPU, 2 logical and 2 physical cores .NET Core SDK=2.2.101 [Host] : .NET Core 2.1.8 (CoreCLR 4.6.27317.03, CoreFX 4.6.27317.03), 64bit RyuJIT Core : .NET Core 2.1.8 (CoreCLR 4.6.27317.03, CoreFX 4.6.27317.03), 64bit RyuJIT Job=Core Runtime=Core | Method | Mean | Error | StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op | |--------------- |---------:|---------:|---------:|------------:|------------:|------------:|--------------------:| | ParseBenchmark | 958.3 ns | 19.18 ns | 55.96 ns | 0.0362 | - | - | 248 B |
So, can we do better?
Some can argue that the boxing ongoing with the structs are causing the problem, and I shall return the object from SetValues
method.Doing so does resolve the issue, but still it is a terrible solution:
public T Parse<T>(ReadOnlySpan<char> input, PropertyInfo[] properties) where T : struct { T result = Activator.CreateInstance<T>(); int separator; int propertySetterIndex = 0; while((separator = input.IndexOf(',')) != -1) { var parsedValue = int.Parse(input.Slice(0, separator)); result = (T)SetValues(result, properties[propertySetterIndex++], parsedValue); input = input.Slice(separator + 1); } return result; } public object SetValues(object data, PropertyInfo property, int value) { property.SetValue(data, value); return data; }
Now, 20 and 19 are printed on the console, but the code is not cleaner, as cast is there, still allocating on the heap, and still not generic.
Let's make SetValues Generic
First, I will rename it to SetValue
, as it really sets only a single value at a time. I also replace object to a generic type parameter.
public T SetValue<T>(T data, PropertyInfo property, int value)
This looks better, but now it prints 0,0 again... but at least we are not boxing, right? No, that is not true either. PropertyInfo's SetValue method expects an object for the target and the value, thus both will become boxed. Let's look at the IL:
IL_0001: ldarg.2 IL_0002: ldarg.1 IL_0003: box !!T IL_0008: ldarg.3 IL_0009: box [System.Runtime]System.Int32 IL_000e: callvirt instance void [System.Runtime]System.Reflection.PropertyInfo::SetValue(object,...
There are two boxing happening:
box the data argument (!!T denotes generic type argument)
box the integer value
To fix this, it seems propertyInfo::SetValue
has to go. But what other options we have?
Let's make it more generic, so the value being set is a generic too, also note the ref arguments, and the point it returns with void
.
public void SetValue<TTarget, T>(ref TTarget data, PropertyInfo property, T value)
Now it is time to change the implementation as follows:
public void SetValue<TTarget, T>(ref TTarget data, PropertyInfo property, T value) { GetTypedILSetterAction<TTarget, T>(property).Invoke(ref data, value); }
This invokes only a GetTypedILSetterAction
method, which returns some kind of an Action
which is invoked. Invoking this method sets the value on the data. How does GetTypedILSetterAction
look like?
private ActionRef<TTarget, TypeOfProperty> GetTypedILSetterAction<TTarget, TypeOfProperty>(PropertyInfo property) { if(_propertySetter == null) { var sourceType = typeof(TypeOfProperty); if(!property.PropertyType.IsAssignableFrom(sourceType)) { throw new ArgumentException($"Property {property.Name}'s type is not assignable from type {sourceType.Name}"); } var dynamicMethod = new DynamicMethod("SetTypedValueFor" + property.DeclaringType.FullName + property.Name, null, new[] { typeof(TTarget).MakeByRefType(), sourceType }, typeof(PropertySetter)); var ilGenerator = dynamicMethod.GetILGenerator(); ilGenerator.Emit(OpCodes.Ldarg_0); ilGenerator.Emit(OpCodes.Ldarg_1); ilGenerator.Emit(OpCodes.Call, property.SetMethod); ilGenerator.Emit(OpCodes.Ret); _propertySetter = dynamicMethod.CreateDelegate(typeof(ActionRef<TTarget, TypeOfProperty>)); } return (ActionRef<TTarget, TypeOfProperty>)_propertySetter; }
Simple enough, it creates a delegate, and returns that delegate typed ActionRef<TTarget, TypeOfProperty>
. It means it creates a method which has 2 parameters one typed ref TTarget
the other TypeOfProperty
. Here is the ActionRef delegate:
private delegate void ActionRef<TTarget, T>(ref TTarget target, T value);
I need a custom delegate type so the first parameter can be a ref.
Inside GetTypedILSetterAction
I create a method from IL instructions. Load both arguments on the evaluation stack, finally call the property's setter (the SetMethod). Note, that this is the real setter of the property, not the one used earlier with reflections.
Creating classes
Now, this is just not yet enough. It means that I generate a method for each property to be set, but I need to cache this method in the _propertySetter
field, otherwise I will hit of creating the delegate every time I would set a value. So, I move all these parsing logic to a new class:
public class PropertySetter { private delegate void ActionRef<TTarget, TValue0>(ref TTarget target, TValue0 value0); private Delegate _propertySetter; public PropertySetter(PropertyInfo property) { typeof(PropertySetter).GetMethod(nameof(GetTypedILSetterAction), BindingFlags.Instance | BindingFlags.NonPublic) .MakeGenericMethod(property.DeclaringType, property.PropertyType) .Invoke(this, new[] { property }); } public void SetValue<TTarget, T>(ref TTarget targetObject, T parsedValue) { ((ActionRef<TTarget, T>)_propertySetter)(ref targetObject, parsedValue); } private void GetTypedILSetterAction<TTarget, TypeOfProperty>(PropertyInfo property) { var dynamicMethod = new DynamicMethod("SetTypedValueFor" + property.DeclaringType.FullName + property.Name, null, new[] { typeof(TTarget).MakeByRefType(), typeof(TypeOfProperty) }, typeof(PropertySetter)); var ilGenerator = dynamicMethod.GetILGenerator(); ilGenerator.Emit(OpCodes.Ldarg_0); ilGenerator.Emit(OpCodes.Ldarg_1); ilGenerator.Emit(OpCodes.Call, property.SetMethod); ilGenerator.Emit(OpCodes.Ret); _propertySetter = dynamicMethod.CreateDelegate(typeof(ActionRef<TTarget, TypeOfProperty>)); } }
The only new part here is the constructor, which invokes a generic method by reflection. This is needed because PropertySetter class is still not generic, which will make usage of this type easier. Shall I worry on the performance of the constructor, as Reflection will not be fast?No, it will never be invoked from the hot path, so it should be fine.Why is it good if the type is not generic? I find it easier to handle it in collections, which I will need when it comes to a list of properties.
The generic SetValue
method is also move to the new type, so I can still invoke it with actual types.
For using the new type, I also add an additional type Parser
to handle the iteration of the input data:
public class Parser { public T Parse<T>(ReadOnlySpan<char> input, PropertySetter[] setters) where T : struct { T result = Activator.CreateInstance<T>(); int separator; int propertySetterIndex = 0; while((separator = input.IndexOf(',')) != -1) { var parsedValue = int.Parse(input.Slice(0, separator)); setters[propertySetterIndex++].SetValue(ref result, parsedValue); input = input.Slice(separator + 1); } return result; } }
And here is how that can be invoked, note that I use a link to crate the PropertySetter
array in the parse method. Normally, this would be done once (for example during application startup), and then re-used.
Note, that this method can only parse integers because of the
int.Parse
method call, it can be easily extended to other types, and it being generic too. Currently it is not a concern of this blog post.
var parser = new Parser(); var parsedValue = parser.Parse<Point>("20,19,", typeof(Point).GetProperties().Select(x => new PropertySetter(x)).ToArray());
Running some benchmarks on the new code show a lot less allocations and much (2.7 times) faster code:
| Method | Mean | Error | StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op | |--------------- |---------:|---------:|---------:|------------:|------------:|------------:|--------------------:| | ParseBenchmark | 352.8 ns | 7.221 ns | 17.44 ns | 0.0029 | - | - | 24 B |
But I can still improve this. There is still 24 byte being allocated, where does that come from? It seems to be just the size of a (on 64 bit architecture) syncblock (8 bytes), MT (8 bytes), and 2 integers (2*4 bytes). Is that familiar? It seems the struct is still allocated on the heap:
T result = Activator.CreateInstance<T>();
One simple solution to this, is using default
which I can as the method's generic type is constrained to structs:
public T Parse<T>(ReadOnlySpan<char> input, PropertySetter[] setters) where T : struct { T result = default; ...
And that results are even better for memory usage and for performance as well:
| Method | Mean | Error | StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op | |--------------- |---------:|---------:|---------:|------------:|------------:|------------:|--------------------:| | ParseBenchmark | 261.7 ns | 5.242 ns | 13.44 ns | - | - | - | - |
At this point, I am happy with these results for the day. There are still some changes that could be done to further improve the performance, but I will leave that for the upcoming blog post.