Frozen Collections
08/05/2023
5 minutes
.NET 8 (Preview 1) introduces new collection types including FrozenDictionary
and FrozenSet
. Both types are frozen counterparts of Dictionary<TKey, TValue>
and HashSet<T>
. These types reside in the System.Collections.Frozen
namespace in SystemCollections.Immutable.dll package.
The frozen semantics mean, that these collections resist to change once they become frozen: they are immutable. However, they are different to the existing immutable collections, as these are being even more restrictive to change. While immutable collections allow change by creating a new immutable collection frozen semantics discourage such operations. Immutable collections use 'clever' data structures in memory that make operations like Add
and Remove
(relatively) cheap given the underlying data may not change. Contrary, frozen collections can optimize for lookups (or for construction, but more on that later).
For example, ImmutableStack<T>
uses a linked list implementation instead of using an array as backing data structure. It has two fields: a head which contains data stored by a user, and a tail which is a pointer to the next item on the stack. Pushing a new item to the stack creates a new instance of ImmutableStack<T>
where the head field stores the new data pushed, and the tail references the previous instance of ImmutableStack<T>
.
Note, that these are the actual field names in the implementation of ImmutableStack at the time of writing this blog post.
In a similar way the pop operation returns the tail. In these cases, adding and removing items to an immutable stack is cheap. However, enumerating all items in this stack is less optimal compared to reading through an array of items.
Frozen
Creating a frozen collection has different semantics compared to immutables. The BCL exposes extension methods: ToFrozenDictionary
and ToFrozenSet
, which lets converting a frozen collection from an existing one. There is also an empty frozen instance exposed through the static Empty
property:
FrozenDictionary<int,int>.Empty // Not supported: FrozenDictionary<int, int>.Empty.TryAdd(1, 1);
However, if one tries to add items to the empty FrozenDictionary<TKey,TValue>
a NotSupportedException
is thrown.
Deep Frozen Dictionary
Looking into the implementation of FrozenDictionary reveals a few interesting details.
FrozenDictionary itself is an abstract class.
When creating a FrozenDictionary they can be optimized for construction or reading.
The specific implementation depends on the source dictionary's size and type of the
key
.
Let's take an actual example by freezing Dictionary<int, string>
with 100 items. An overload of ToFrozenDictionary
allows to make it optimized for reading:
DictionaryData.ToFrozenDictionary(optimizeForReading: true);
In this case type of the key of this dictionary is a value type. Based on the size of the dictionary and the actual value type, either a SmallComparableValueTypeFrozenDictionary
or Int32FrozenDictionary
or ValueTypeDefaultComparerFrozenDictionary
is used. The implementation of ToFrozenDictionary
also considers reference types, and in particular string
keys.
In the case above with 100 items we get a Int32FrozenDictionary
. The implementation of Int32FrozenDictionary is relatively short, the class can fit on my screen. It has two interesting readonly fields:
private readonly FrozenHashTable _hashTable; private readonly TValue[] _values;
TValue[]
stores the values of the original dictionary in an array. This array is built from the Dictionary upon construction. FrozenHashTable
is responsible for translating a key into a start and an end index representing a range in TValue[]
which has the value corresponding to the key. FrozenHashTable
is shared by multiple other frozen collections. Its implementation is considerably more complex with calculating the number of buckets upon constructions.
Performance
Does all this manifest as a performance improvement? As outlined earlier, the performance characteristics will greatly depend on the concrete FrozenDictionary
implementation. In this post focusing on FrozenDictionary<int, string>
I performed the following measurements:
Using TryGet to lookup a specific key that exists in the dictionary.
Enumerate all items of the dictionary.
Number of keys in the dictionary: 4
Method | Mean | Error | StdDev |
---|---|---|---|
FrozenDictionaryTryGet | 2.864 ns | 0.0786 ns | 0.0735 ns |
DictionaryGet | 3.223 ns | 0.0475 ns | 0.0421 ns |
FrozenDictionaryEnumerate | 8.704 ns | 0.0745 ns | 0.0697 ns |
DictionaryEnumerate | 17.695 ns | 0.1689 ns | 0.1580 ns |
Number of keys in the dictionary: 10
Method | Mean | Error | StdDev |
---|---|---|---|
FrozenDictionaryTryGet | 2.847 ns | 0.0706 ns | 0.0661 ns |
DictionaryGet | 3.195 ns | 0.0735 ns | 0.0651 ns |
FrozenDictionaryEnumerate | 19.413 ns | 0.2199 ns | 0.2057 ns |
DictionaryEnumerate | 28.225 ns | 0.4060 ns | 0.3599 ns |
Number of keys in the dictionay: 11
Method | Mean | Error | StdDev |
---|---|---|---|
FrozenDictionaryTryGet | 3.216 ns | 0.1082 ns | 0.1063 ns |
DictionaryGet | 3.270 ns | 0.1014 ns | 0.0948 ns |
FrozenDictionaryEnumerate | 21.197 ns | 0.2632 ns | 0.2333 ns |
DictionaryEnumerate | 30.961 ns | 0.4135 ns | 0.3868 ns |
Number of keys in the dictionary: 4000
Method | Mean | Error | StdDev |
---|---|---|---|
FrozenDictionaryTryGet | 3.079 ns | 0.0674 ns | 0.0630 ns |
DictionaryGet | 3.104 ns | 0.0303 ns | 0.0236 ns |
FrozenDictionaryEnumerate | 5,870.162 ns | 25.6190 ns | 22.7106 ns |
DictionaryEnumerate | 8,288.615 ns | 58.9123 ns | 52.2242 ns |
Measurements on the early version of the preview show that there are performance gains achieved, however the size of the gain varies greatly on the data type, the number of dictionary items and the operations we perform.
Conclusion
I see Frozen data types are perfect for the use case where a collection is built up once but used for an extended period of time for read operations. For example, an application may create lookup tables during a startup phase, which lookup tables can be used throughout the lifetime of the application. Frozen types might also be beneficial in very tight loops in case the loop needs to read data from a collection. However, using it generally should be done with caution: freezing an existing dictionary or set is considerable amount of work, which can absorb the benefits on the read operations.