Ref Returns

C# recently introduced several new features in version 7.0, one of which is ref returns. In this post I use this feature to further improve the performance of the Parser created in my previous post.

As a refresher, the Parser class looked as follows:

public class Parser 
{
  public T Parse<T>(ReadOnlySpan<char> input, PropertySetter[] setters) where T : struct
  {
    T result = default;
    int separator;
    int propertySetterIndex = 0;
    while((separator = input.IndexOf(',')) != -1)
    {
      var parsedValue = int.Parse(input.Slice(0, separator));
      setters[propertySetterIndex++].SetValue(ref result, parsedValue);
      input = input.Slice(separator + 1);
    }
    return result;
  }
}

To recap, it receives an input of comma-separated integers and parses them into a struct of the user's choice: T. Extending it to types other than integers and fixing the possible ArgumentOutOfRangeException are not concerns of this blog post.

One assumption can be made here: the input always contains exactly as many comma-separated values as there are properties on the struct. This is not a necessity, but it reduces the number of edge cases. Why is this useful? I will explain in the next section.

Intermediate step

How can we make this method faster? One idea is to get rid of copying the struct on method return. Instead, I expect it to be passed as a ref parameter by the caller. This improves execution time because we save some copying: when the whole struct is returned, all of its fields are copied, whereas here only a reference is passed in. At this point, though, it is the caller's responsibility to create the struct.

In this step I also did a small refactoring to iterate over the setters instead of the input, which I can do because of the assumption made above.

public void Parse<T>(ReadOnlySpan<char> input, PropertySetter[] setters, ref T result) where T : struct
{
  foreach(var setter in setters)
  {
    var commaIndex = input.IndexOf(',');
    setter.SetValue(ref result, int.Parse(input.Slice(0, commaIndex)));
    input = input.Slice(commaIndex + 1);
  }
}

Benchmarks

Changing from the while loop to the foreach loop has a slight, though not significant, performance impact. According to my measurements the improvement is around a couple of percent (and it would decrease with longer input messages). It is mostly caused by the different semantics of looping over the setters instead of the input. Note that this change is only possible because of the assumption made above.

Comparing the ref foreach loop solution to the original one:

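| Method        |     Mean |    Error |   StdDev | Ratio | RatioSD |
|-------------- |---------:|---------:|---------:|------:|--------:|
| MethodGeneric | 275.3 ns | 5.483 ns | 11.92 ns |  1.00 |    0.00 |
| MethodRefFor  | 260.4 ns | 5.243 ns | 12.15 ns |  0.94 |    0.07 |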

Let's see the new benchmark method itself:

[Benchmark]
public void MethodRefFor()
{
  var parsed = new Point();
  _parserRef.Parse(_input, _propertySetters, ref parsed);
  _sum += parsed.X + parsed.Y;
}

I use a new Point struct for every measurement. Depending on application semantics, there are several use cases where we are in a single-threaded environment, or where threads share no mutable state. With this in mind, we might ask: do we really need a new Point struct for every parse? What if I could reuse the same Point struct over and over? At this point I must refer back to the assumption above: the input has exactly as many numbers as there are properties to be set on the struct. This means we can reuse the same struct without needing to clean it up, since every property is overwritten on each parse and no stale values can remain.

Can I change my parser implementation to incorporate the above ideas?

Introducing Ref Returns

In the previous post, I hinted that the above implementation can be further improved from a performance point of view. This is where ref returns come into the picture.

Ref returns can be used to return a reference, saving the need for a copy of a larger value type.

Some interesting aspects of ref returns:

  • the returned variable must outlive the method call, so its scope must be larger than the method (it cannot be a local variable of the method)

  • the method returns an alias to a variable

  • design intent was that the calling method can modify the variable
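The points above can be illustrated with a minimal sketch. The RefReturnDemo class and its Find and Run methods are hypothetical names made up for this illustration, not part of the Parser. The array lives on the heap, so a reference to one of its elements remains valid after the method returns.

```csharp
using System;

public static class RefReturnDemo
{
    // Returns an alias to the first element equal to `target`, not a copy of it.
    public static ref int Find(int[] values, int target)
    {
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] == target)
            {
                return ref values[i];
            }
        }
        throw new InvalidOperationException("Value not found.");
    }

    // Would not compile: the lifetime of a local ends with the method.
    // public static ref int Broken() { int local = 42; return ref local; }

    public static int[] Run()
    {
        int[] numbers = { 1, 2, 3 };

        ref int alias = ref Find(numbers, 2); // alias to numbers[1]
        alias = 20;                           // writes through to the array

        int copy = Find(numbers, 3);          // no `ref` at the call site: the value is copied
        copy = 30;                            // the array is unaffected

        return numbers;                       // now holds 1, 20, 3
    }
}
```

Note that the caller decides whether to keep the alias (ref local plus ref at the call site) or to take a plain copy.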

This sounds good, but how can we use it?

I cannot create a new Point struct in the method and return it by ref, because of the lifetime issue. However, I can ref return a private field of the Parser class. As I still prefer to remain generic, I need to make the Parser type itself generic:

public class Parser<T> where T : struct
{
  private T _data;

  public ref T Parse(ReadOnlySpan<char> input, PropertySetter[] setters)
  {
    ref T data = ref _data;
    foreach (var setter in setters)
    {
      var commaIndex = input.IndexOf(',');
      setter.SetValue(ref data, int.Parse(input.Slice(0, commaIndex)));
      input = input.Slice(commaIndex + 1);
    }
    return ref data;
  }
}

The Parser class has a private field _data of type T; note that it is deliberately not readonly. I also changed the Parser class to be generic. The implementation of the Parse method has changed as well: it no longer expects a ref T as an input parameter, and the parsed data is set on the private field instead. The return type of the method is ref T, which returns a reference to the _data field. This restricts a Parser instance to single-threaded use, but the responsibility of creating the struct stays within the Parser. The class is no longer thread-safe, but this should be OK:

  • no two threads should parse the same input into the same _data

  • neither should other threads read the struct while it is being parsed (if they need to, they can create a copy of the struct as the ref-less semantics would do).
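To make the consequence of reusing a single field concrete, here is a small sketch. PointSource is a hypothetical, simplified stand-in for Parser<T> (it takes the values directly instead of parsing them), and the shape of Point is assumed to be two int properties. It shows that an alias observes the next write, while a copy keeps its snapshot.

```csharp
public struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Hypothetical stand-in for Parser<T>: it reuses one private field
// and hands out a reference to it on every call.
public class PointSource
{
    private Point _data;

    public ref Point Next(int x, int y)
    {
        _data.X = x;
        _data.Y = y;
        return ref _data;
    }
}

public static class ReuseDemo
{
    public static (int AliasX, int CopyX) Run()
    {
        var source = new PointSource();

        ref Point alias = ref source.Next(1, 2); // refers to the private field
        Point copy = alias;                      // snapshot of the current values

        source.Next(3, 4);                       // overwrites the shared field

        return (alias.X, copy.X);                // alias sees 3, copy still holds 1
    }
}
```

A caller that needs the values to survive the next parse should therefore take a copy, as the second bullet suggests.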

Why not use ref readonly? The Point struct used here is not readonly, which means that when we access its X or Y properties through a readonly reference (even just for reading), the compiler creates a defensive copy of the struct. This can be observed at the IL level.
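The copy made on member access through a readonly reference can also be observed without looking at IL. The sketch below uses a hypothetical Counter struct with a mutating method, chosen only because the mutation makes the hidden copy visible; the CounterHolder class and its two properties are likewise made up for this illustration.

```csharp
public struct Counter
{
    private int _count;

    public int Count => _count;

    // Not a readonly member, so the compiler must assume it may mutate the struct.
    public int Increment() => ++_count;
}

public class CounterHolder
{
    private Counter _counter;

    public ref Counter Writable => ref _counter;
    public ref readonly Counter ReadOnly => ref _counter;
}

public static class DefensiveCopyDemo
{
    public static (int AfterReadOnly, int AfterWritable) Run()
    {
        var holder = new CounterHolder();

        ref readonly Counter ro = ref holder.ReadOnly;
        ro.Increment();                            // runs on a hidden copy; the field is untouched
        int afterReadOnly = holder.Writable.Count; // still 0

        ref Counter rw = ref holder.Writable;
        rw.Increment();                            // mutates the field in place
        int afterWritable = holder.Writable.Count; // now 1

        return (afterReadOnly, afterWritable);
    }
}
```

With a plain ref return no such copy is needed, which is the reason for the choice made in the Parser above.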

Final Performance Results

The new benchmark method:

[Benchmark]
public void TypeGeneric()
{
  ref Point parsed = ref _parserTypeGeneric.Parse(_input, _propertySetters);
  _sum += parsed.X + parsed.Y;
}

The results show some further improvements:

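| Method        |     Mean |    Error |   StdDev | Ratio | RatioSD |
|-------------- |---------:|---------:|---------:|------:|--------:|
| TypeGeneric   | 235.8 ns | 4.732 ns | 13.95 ns |  0.81 |    0.06 |
| MethodGeneric | 293.6 ns | 5.883 ns | 14.54 ns |  1.00 |    0.00 |
| MethodRefFor  | 258.5 ns | 5.172 ns | 13.72 ns |  0.88 |    0.06 |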

Why is this faster than the version where Point is passed by reference as an input parameter? I will investigate that in my next post.