Know Your Data05/24/2026 | 7 minutes to read
There are many social posts, blog posts, and code suggestions online for .NET code snippets titled "Use X instead of Y" or "Why X is better than Y". These posts usually (but not always) include a basic performance comparison using BenchmarkDotNet to validate their catchy titles.
At first glance, these suggestions appear justified by the measurements. However, upon closer inspection, many important details are typically excluded:
- What version of .NET (SDK and Runtime) was used?
- What version of OS was used?
- What is the underlying hardware architecture (ARM or x64)?
- Does the hardware support vectorized operations?
- What are the sizes of the vector registers?
- What is the memory read latency?
- What are the CPU cache sizes?
- What branch prediction algorithm does the CPU use?
- What are the input data types (structs, classes, primitives)?
- What is the input data structure?
- What is the size of the input data (bytes/array length)?
- What is the data access pattern?
- And many more factors
Since explaining all these factors is beyond this post's scope, I will focus on data access patterns and data sizes. For those interested in a deeper analysis, I recommend reading Pro .NET Benchmarking by Andrey Akinshin, which explains many common pitfalls.
While there are some general "use X over Y" suggestions, these don't necessarily stand the test of time. Each .NET release introduces optimizations that can completely change the performance landscape. Much of the highly-ranked literature on search engines becomes outdated or superseded by less prominent writings.
The Big O notation is commonly used to classify algorithms based on their number of steps required to run to completion.
f(x) is of the order of g(x) if the absolute value of f(x) is at most a positive C constant multiple of the absolute value of g(x) for all sufficiently large values of x.
f(x) = O(g(x)), if|f(x)| <= C*|g(x)|for all x>=x0.
Big O notation classifies algorithms without other preconditions, which is crucial in computer science and should be the general guideline for applications. However, the definition deals with functions approaching infinity, using constants C and x0. In practice, two algorithms with the same O() notation might perform quite differently due to the different C constants. If C equals 2 for one algorithm and 1 for another, you'll see a measurable performance difference. The same applies to x0 - with small inputs, an algorithm might perform significantly better than it does with inputs approaching infinity in size.
Therefore, understanding your application and its data characteristics should influence your choice of solution. The following benchmarks demonstrate this principle.
The benchmark creates a data structure to store data in a hashset or queue or array. It inserts Length number of items into the tested data structure. Then it looks up data in the data structure 30 times, finally clears the data structure. The lookup operation is either always successful or it never successful. ShouldContain, denotes whether the looked up item was in the inserted data set.
Note that, these are not measurements of .NET's built-in
Array/HashSet<T>/Queue<T>types but a derived types using a corresponding underlying data structure. Please do not generalize the findings of these measurements.
Why the this benchmark was designed in a very specific way?
- the application will run on the same hardware as the benchmarking hardware
- the application needs to places 1-100 items in a data structure, perform ~30 lookup operations before clearing the data structure.
High-level benchmark implementation:
public class Benchmarks { private const int Max = 128; private const int LookupCount = 30; private Queue<Data> _queue; private HashSet<int> _idSet; private int[] _idArray; [GlobalSetup] public void Setup() { _queue = new Queue<Data>(Max); _idSet = new HashSet<int>(Max); _idArray = new int[Max]; } [Params(3, 30, 50, 100)] public int QueueLength { get; set; } [Params(true, false)] public bool ShouldContain { get; set; } [Benchmark] public void HashSet() { /*...*/ } [Benchmark] public void SearchQueue() { /*...*/ } [Benchmark] public void SearchArray() { /*...*/ } }
The results:
| Method | Length | ShouldContain | Mean | Error | StdDev | Median | |------------ |------- |-------------- |------------:|-----------:|-----------:|------------:| | HashSet | 3 | False | 97.31 ns | 1.591 ns | 1.410 ns | 96.87 ns | | SearchQueue | 3 | False | 137.76 ns | 1.762 ns | 1.648 ns | 137.13 ns | | SearchArray | 3 | False | 112.17 ns | 0.983 ns | 0.919 ns | 111.97 ns | | HashSet | 3 | True | 116.06 ns | 1.108 ns | 1.036 ns | 115.96 ns | | SearchQueue | 3 | True | 138.21 ns | 1.811 ns | 1.694 ns | 138.58 ns | | SearchArray | 3 | True | 101.40 ns | 0.668 ns | 0.558 ns | 101.48 ns | | HashSet | 30 | False | 719.58 ns | 9.925 ns | 8.798 ns | 717.58 ns | | SearchQueue | 30 | False | 920.76 ns | 6.109 ns | 5.415 ns | 920.64 ns | | SearchArray | 30 | False | 618.84 ns | 3.010 ns | 2.513 ns | 619.41 ns | | HashSet | 30 | True | 765.34 ns | 9.785 ns | 9.153 ns | 763.91 ns | | SearchQueue | 30 | True | 906.78 ns | 5.929 ns | 4.629 ns | 905.16 ns | | SearchArray | 30 | True | 583.87 ns | 9.312 ns | 9.563 ns | 581.83 ns | | HashSet | 50 | False | 1,152.56 ns | 10.839 ns | 9.051 ns | 1,153.60 ns | | SearchQueue | 50 | False | 1,516.34 ns | 22.290 ns | 19.760 ns | 1,508.94 ns | | SearchArray | 50 | False | 958.02 ns | 12.018 ns | 10.654 ns | 957.10 ns | | HashSet | 50 | True | 1,229.79 ns | 15.292 ns | 12.769 ns | 1,225.11 ns | | SearchQueue | 50 | True | 1,833.85 ns | 36.233 ns | 62.500 ns | 1,818.53 ns | | SearchArray | 50 | True | 1,249.61 ns | 26.314 ns | 76.342 ns | 1,228.44 ns | | HashSet | 100 | False | 2,678.44 ns | 53.553 ns | 73.304 ns | 2,678.06 ns | | SearchQueue | 100 | False | 3,476.01 ns | 68.267 ns | 93.444 ns | 3,436.99 ns | | SearchArray | 100 | False | 2,172.79 ns | 42.972 ns | 54.345 ns | 2,163.98 ns | | HashSet | 100 | True | 2,842.06 ns | 56.292 ns | 108.456 ns | 2,824.97 ns | | SearchQueue | 100 | True | 3,460.25 ns | 48.344 ns | 45.221 ns | 3,449.34 ns | | SearchArray | 100 | True | 2,154.58 ns | 42.766 ns | 45.759 ns | 2,163.90 ns |
Notice, that each proposed data structure has its own performance sweetspot. One does not 'win over' another in general terms, instead it depends on the application workload. At this point a developer can choose a solution based on different strategies (or requirements):
- match the most common expected input (optimizing for the best-case scenario)
- optimize for the 'worst case' scenario (for example, considering p95 mean execution time)
- optimize for scaling
One way to assess how these algorithms scale is by increasing the number of lookups from 30 to 300. Here are the results of the same benchmarks with 300 lookups:
| Method | Length | ShouldContain | Mean | Error | StdDev | |------------ |------- |-------------- |-----------:|----------:|----------:| | HashSet | 3 | False | 3.188 us | 0.0433 us | 0.0362 us | | SearchQueue | 3 | False | 7.004 us | 0.0652 us | 0.0578 us | | SearchArray | 3 | False | 4.179 us | 0.0323 us | 0.0252 us | | HashSet | 3 | True | 4.546 us | 0.0314 us | 0.0278 us | | SearchQueue | 3 | True | 6.939 us | 0.0422 us | 0.0353 us | | SearchArray | 3 | True | 3.181 us | 0.0283 us | 0.0265 us | | HashSet | 30 | False | 3.734 us | 0.0188 us | 0.0166 us | | SearchQueue | 30 | False | 36.211 us | 0.4353 us | 0.3859 us | | SearchArray | 30 | False | 5.952 us | 0.0831 us | 0.0777 us | | HashSet | 30 | True | 5.132 us | 0.0336 us | 0.0298 us | | SearchQueue | 30 | True | 34.407 us | 0.3604 us | 0.3195 us | | SearchArray | 30 | True | 4.140 us | 0.0461 us | 0.0431 us | | HashSet | 50 | False | 4.222 us | 0.0553 us | 0.0490 us | | SearchQueue | 50 | False | 55.841 us | 0.2738 us | 0.2427 us | | SearchArray | 50 | False | 9.362 us | 0.0714 us | 0.0633 us | | HashSet | 50 | True | 6.182 us | 0.0573 us | 0.0479 us | | SearchQueue | 50 | True | 62.037 us | 0.9871 us | 0.9233 us | | SearchArray | 50 | True | 7.217 us | 0.0796 us | 0.0705 us | | HashSet | 100 | False | 5.928 us | 0.0519 us | 0.0486 us | | SearchQueue | 100 | False | 127.606 us | 0.4737 us | 0.4431 us | | SearchArray | 100 | False | 17.965 us | 0.1083 us | 0.0960 us | | HashSet | 100 | True | 7.520 us | 0.0613 us | 0.0543 us | | SearchQueue | 100 | True | 120.993 us | 0.6796 us | 0.6357 us | | SearchArray | 100 | True | 11.658 us | 0.1646 us | 0.1459 us |
The results paint a slightly different picture. SearchQueue performs significantly worse compared to the other solution. SearchArray also scales worse in regards of the Length parameter. Such insights can drive decisions for the chosen implementation. If your application frequently deals with lengths around 3-30, SearchArray and HashSet are quite competitive. However, if you expect lengths to often reach 100, HashSet appears to be a better choice than SearchArray. SearchQueue is generally slower and might not be suitable unless there are other compelling reasons to use it (for example if it has a different memory footprint).
An important note, that these benchmarks were executed on the same hardware / OS where the target application runs. Hardware and OS might change without the developer being notified (for example, hardware failures or OS patches). Such scenarios should be also taken into account. That said application code is typically re-written/refactored every few years, which offers a great opportunity to align the implementation with the most recent environment.
Conclusion
Benchmarking related posts on this blog always suggest to benchmark with production like workloads. The provided benchmarks highlights the importance of this suggestion. Different data structures exhibit varying performance characteristics based on factors such as input size, data access patterns, and specific operations performed.
When choosing a data structure or algorithm, consider the following:
- Understand your workload: Profile your application to identify the typical data sizes, access patterns, and operation frequencies.
- Benchmark with realistic data: Use data that resembles your production data in terms of size, distribution, and content.
- Consider scaling: Evaluate how the performance of different solutions scales as your data volume and workload intensity increase.
- Account for environmental factors: Recognize that hardware, operating system, and .NET version can all influence performance.
By carefully considering these factors and conducting thorough benchmarks, you can make informed decisions that optimize your application's performance for its specific use case. Avoid relying solely on theoretical analyses or general recommendations, as real-world performance can often deviate.