Switch Expression or Visitor pattern

A few months ago I looked into optimizing dictionary lookups using switch expressions.

In this post I investigate a seemingly similar problem, but with vastly different constraints, that eventually demand a vastly different solution.

Problem Definition

Given a set of polymorphic objects and a DTO. A method needs to iterate all the polymorphic objects and evaluate them against the DTO. Each polymorphic object evaluates a different (C#) property of the DTO. While there are a few designs to achieve this, in this case I seek for a design that does not introduce a coupling between the polymorphic objects and the DTO. I compare two possible solutions in this post.

First, let me show a minimal version of the polymorphic objects discussed above:

public abstract class Value
{
}

public class AValue : Value
{
    public double A { get; set; }
}

public class BValue : Value
{
    public double B { get; set; }
}

// Similar for CValue, DValue, EValue, FValue, ...

In this post I use double value in place of the more complex 'DTO' struct used in the real-life example. Each object needs to compare its double property (A, B, etc.) with the DTO. Note, that the Value type or the derived AValue, BValue etc. types must not reference the DTO and vice-versa to maintain decoupling.

Possible Solutions

The first solution uses the C# switch expressions feature in conjunction with type pattern matching.

In the snippet below I created an enclosing type that has fields of _data referring to the polymorphic collection of objects and _comperand as the DTO. This type knows how to compare the DTO with the Value types. For simplicity, in the example below all polymorphic objects are compared the same way with the DTO. In the real-life example these would be different properties and operations for each derived type.

// fields excluded from this sample

public int SwitchExpressionType()
{
    int sum = 0;
    for (int i = 0; i < _data.Length; i++)
    {
        sum += _data[i] switch
        {
            AValue a => a.A > _comperand ? 1 : 0,
            BValue b => b.B > _comperand ? 1 : 0,
            // ...
            _ => throw new NotImplementedException()
        };
    }
    return sum;
}

The second approach uses the visitor pattern. A visitor object encapsulates the comparisons logic. The polymorphic types can reference the visitor, while avoiding referencing the DTO:

public class Visitor(Value[] data, double comperand)
{
    public int Sum { get; set; }

    public int Process()
    {
        Sum = 0;
        for (int i = 0; i < data.Length; i++)
            data[i].Visit(this);
        return Sum;
    }

    internal void Visit(AValue a) => Sum += a.A > comperand ? 1 : 0;
    internal void Visit(BValue b) => Sum += b.B > comperand ? 1 : 0;
    // ...
}

public abstract class Value
{
    public abstract void Visit(Visitor visitor);
}

public class AValue : Value
{
    public double A { get; set; }
    public override void Visit(Visitor visitor) => visitor.Visit(this);
}

// Similar for BValue, CValue, DValue, EValue, FValue, ...

In production level code the Value polymorphic types should reference an abstraction of the visitor instead of the concrete type.

Performance Comparison

The distribution of the derived Value types significantly impacts the performance of these solutions. In this post I assume (and use) a discrete uniform distribution, meaning each types are equally probable. The DTO's value and each object's double property is generated with the non-cryptographic Random class.

I use .NET 9 Preview 3 and BenchmarkDotNet v0.13.12 to compare the performance. The input is an array of 1000 Value objects. Each of them is an instance of one of the AValue, ..., FValue, ... types (matching the number of types indicated in the results).

In the initial comparison I used 6 derived types.

Method

Mean

Error

StdDev

Code Size

SwitchExpressionType

5.481 us

0.0639 us

0.0598 us

464 B

Visitor

1.510 us

0.0298 us

0.0366 us

134 B

The visitor pattern seems to offer a 3.5 times faster solution. Let me repeat the test with 4, 12, and 18 derived types:

4 derived types with uniform distribution:

Method

Mean

Error

StdDev

Code Size

SwitchExpressionType

4.025 us

0.0453 us

0.0424 us

369 B

Visitor

1.446 us

0.0166 us

0.0155 us

134 B

12 derived types with uniform distribution:

Method

Mean

Error

StdDev

Code Size

SwitchExpressionType

10.665 us

0.1170 us

0.1037 us

756 B

Visitor

3.280 us

0.0607 us

0.0790 us

134 B

18 derived types:

Method

Mean

Error

StdDev

Code Size

SwitchExpressionType

14.937 us

0.1835 us

0.1716 us

1,043 B

Visitor

3.809 us

0.0708 us

0.0627 us

134 B

Both the SwitchExpressionType and Visitor methods performance decrease when there are more types to handle. What drives the performance degradation?

Based on further testing, I observe that the performance of the switch expression depends on two things:

  • how many arms (patterns) the switch expression has

  • how many case guards are evaluated at runtime

For example, when the input collection has a single type only, the performance depends on whether it matches the first or the last type in the patterns of the switch expression. When observing the optimized (Tier-1) code using JITDiasm (environment variable: $env:DOTNET_JitDisasm), a careful reader can observer that the possible types of the switch expression are compared one-by-one with the actual type. A fragment of the generated assembly code:

G_M000_IG04:                ;; offset=0x0075
       mov      rdx, r14
       mov      rcx, 0x7FFB2138B3D8
       call     [CORINFO_HELP_ISINSTANCEOFCLASS]
       test     rax, rax
       jne      G_M000_IG16

G_M000_IG05:                ;; offset=0x0091
       mov      rdx, r14
       mov      rcx, 0x7FFB2138B2D8
       call     [CORINFO_HELP_ISINSTANCEOFCLASS]
       test     rax, rax
       jne      G_M000_IG17

The performance of the Visitor pattern depends on the number of actual types used polymorphic collection. However, the JIT optimized Tier-1 code is the same with a single type or when using 18 derived types. The virtual method calls on the base type require the runtime's help to determine the exact method to be invoked:

mov      rax, qword ptr [rax+0x40]
call     [rax+0x20]Value:Visit(Visitor):this

Here the results seem to be primarily driven by the performance of virtual method dispatch.