Missing Number Performance Tests

In this post I am going to look at the performance aspect for one of the tipical programming exercises on interviews.

Task usually goes by this:

Given an array with integers from 1 to N. The order of the integers is unknown. One of the randomly chosen number is replaced with 0. Create a method that receives the array as an input and returns the replaced number.

I am not going into the details of edge cases, input validation, error handling, etc. This post purely checks the performance aspects of some of the possible solutions. I also assume that N = x * 8.

Next, I will present three solutions:

  • A naive solution

  • One usually accepted solution

  • A fast solution

The naive solution

The naive solution that usually everyone starts with iterates over each possible number from 1 to N, and checks for each item if it is contained by the array.Worst case we check N numbers, so contains might iterate over the array N times, each iteration comparing each item of the array results an O(N*N) operations.

public static int Contains(int[] items)
{
    for (int i = 1; i <= items.Length; i++)
        if (!items.Contains(i))
            return i;
    return -1;
}

Usually accepted solution

The second solution uses the fact that sum of numbers from 1 to N can be calculated with the following formula: (N+1)*N/2. Then we can sum up the items in the array and subtract it from the expected sum, which will give us the missing number. This time we only iterate over the array once, to sum up the integers, which is ends up O(N) operations. By definition this is faster on large enough arrays compared to the naive solution.

public static int Sum(int[] items)
{
    int expected = (items.Length + 1) * items.Length / 2;
    int sum = 0;
    foreach (var item in items)
        sum += item;
    return expected - sum;
}

Vectorizing

The last solution uses the same technique as the previous, but instead of summing items of the array one-by-one, we can use a faster solution. By vectorizing the items of the array and summing the vectors, we can leverage SIMD (Single Instruction Multiple Data) instructions to parallize the addition operations. The size of SIMD registers varies from machine to machine, in my case I have 8 integers in a Vector<int>. This is still ends up O(N), as the number of operations depend linearly compared to the length of the input. In practise though, this results a faster solution to the previous one.

Note, that the below solution presented only work if the length of the array it a multiple of the number of integers a Vector<int> can handle. If that is not the case, we would need to stop the while loop sooner while while(items.Length >= Vector<int>.Count), and we would need a final loop so sum up the last chunk of elements one-by-one.

public static int Vectorize(ReadOnlySpan<int> items)
{
    int expected = (items.Length + 1) * items.Length >> 1;
    var sum = Vector<int>.Zero;
    while (items.Length > 0)
    {
        sum += new Vector<int>(items);
        items = items.Slice(Vector<int>.Count);
    }
    return expected - Vector.Dot(sum, Vector<int>.One);
}

Benchmarks

For benchmarking, I used BenchmarkDotNet library, with one randomized integer array of length 8192 items. The table below shows that the naive solution is the slowest. Having the formula increases the preformance, several magnitude, and vectorization further boosts execution. As I mentioned earlier Vector<int> holds 8 integers on my machine. So why do we see only ~5 times faster execution? The rest of the performance is used for the penalty of vectorizing the input.

|    Method |           Mean |        Error |       StdDev |
|---------- |---------------:|-------------:|-------------:|
|  Contains | 1,636,479.9 ns | 29,170.58 ns | 27,286.18 ns |
|       Sum |     4,774.7 ns |     95.53 ns |    137.01 ns |
| Vectorize |       991.7 ns |     19.56 ns |     31.59 ns |