Regex and Speed

In the series of these posts, I take a rather regular task, and implement it multiple times in an iteratively manner, while also improving performance.

The task is the following: given a string input with placeholders in it, and a set of placeholders to actual value mappings. Replace the placeholders in the input string with the actual text values from the map and return the final string. For example, given input string Hello [%placeholder%]! and map [%placeholder%] => World should return Hello World!. The placeholder's [%placeholder%] part is referred as the key, and World part as the value in this post.

To further clarify the task, these constraints are also true:

  • placeholder keys are delimited with [% at the beginning and %] at the end

  • placeholders are non-recursive (a placeholder's value may contain another (or the same) placeholder's key, but this should not be processed recursively)

  • input text contains no placeholder that are embedded by other placeholders (such as hello [%outer[%inner%]example%])

  • the key of placeholders contain letters and underscores

  • the same fixed set of placeholders are used on multiple input strings

  • placeholders are used at most once per input text

In a later post of the series, I'll also look into an alternative version of this same task, which requires to process recursive placeholders.

To place the task in a real-life example, imagine a system handling order requests. For each request the system needs to create an e-mail based on a template, a pdf containing billing information, a pdf for shipping information, notification for end users etc. The same set of order parameters (placeholder values) need to be replaced in the template (input string) of the e-mail, pdf, notification etc.

In the series of these posts I present multiple, iterative implementations. As these code samples may be too large for a blog post to present effectively, the actual code is linked from GitHub.

Common Understanding

The description above goes into the details, but still leaves some points undefined. To have a common ground for all of the implemented solutions, I prepared a set of unit tests. Each implemented solution must pass these tests. As in a true TDD (Test Drive Development) style, I will start with the unit tests first in this blog post too. All presented unit tests must be green for all the implementations of the task. With a set of predefined tests can I only assure, that no edge cases left dangling.

The unit tests are available as a Gist. As anyone can see it, there will be 7 different non-recursive implementation discussed in these blog posts, as well as two recursive implementations. For these unit tests I used xUnit framework with version 2.4.1. The test cases are passed as input arguments, the [MemberData] attribute declares which helper method provides the concrete test cases. The Common() method defines test cases that must be passed by both recursive and non-recursive implementations, while the InputOutputData() describes the non-recursive cases and expected results.

The rest of this post focuses on the non-recursive solutions.

Solution 0

Solution0.cs shows the initial implementation. The Substitute() method has two input parameters, the string replacement is the input string, and IReadOnlyDictionary<string, string> map contains the placeholders. The Substitute() method returns the result of the task defined above.

The body of the method creates a regular expression to match the placeholder pattern. Matches() method of the regular expression returns an enumerable of matches. Each match is checked if it exists as a key in the placeholders' dictionary. If so the match is replaced in the result string with the placeholder's value.

This is a very naive implementation, and it will serve as the basis for improvements. There are several inefficiencies in this implementation that can be improved.

  • the way regular expressions are created

  • the way regular expressions are used

  • the way string replacement is done

  • the way placeholders are looked up

Notice, that using this method multiple times (for substituting placeholders in e-mails, notifications, etc.) with multiple input values, will result creating the same regular expression over-and-over again. The result string has even worse performance characteristics: for each replacement a new string allocated and the previous one is thrown away as garbage.

Each time, the Matches() method returns a match the Replace() method needs to find that exact substring again in the string held by the result variable. It means the Replace() method needs to scan the string from the beginning each time.

The implementation itself can also lead to strange edge cases as the result string and the regular expressions input string is kept separate. For now this is good, as it helps to avoid the recursive placeholder replacement.

A few last words on the regular expression: it explicitly looks for characters between the [% %] delimiter strings, that are lower- or uppercase letters or underscore character.

Let's see how this naive implementation performs on a test data:

|    Method |        Mean |     Error |      StdDev |  Gen 0 | Allocated |
|---------- |------------:|----------:|------------:|-------:|----------:|
| Solution0 |  7,494.9 ns | 298.81 ns |   866.91 ns | 3.6850 |     11 KB |

As we see it allocates 11 KB memory and it runs for ~7.5 us. The upcoming implementations will try to improve on these numbers.

Solution 1

This solution tries to improve on the Solution 0, but a reader looking at the initial implementation might feel controversial about it. Solution 1 tries to use the same building blocks as Solution 0. The same regular expression is used in the method. However, it avoids using the string.Replace() method, but rather uses the Replace() method of the regular expression: Regex.Replace(). Regular Expression's Replace() method has an overload which takes a delegate that can examines each match and return either the original matched string or a replacement string. Solution 1 implementation tries to be "smart" by setting a limit of one to maximize the number of times the replacement occurs. The rest of the Substitute() method uses recursion (instead of the foreach loop from Solution 0). This implementation also needs to pass along a ref int to avoid infinite loops, for example when the placeholder's value contains its own key. While such a placeholder is allowed, the code should not process it recursively. The ref int parameter helps to overcome the infinite loop problem by indicating where the next match should be searched for. Its value contains the end of the previous match.

Note, that string.Replace() replaces all occurances of the matched key and value in the string.

From performance point of view this solution performs significantly worse to Solution 0:

|    Method |        Mean |     Error |      StdDev |  Gen 0 | Allocated |
|---------- |------------:|----------:|------------:|-------:|----------:|
| Solution0 |  7,494.9 ns | 298.81 ns |   866.91 ns | 3.6850 |     11 KB |
| Solution1 | 24,029.0 ns | 763.83 ns | 2,252.18 ns | 8.4381 |     26 KB |

One other problem with this implementation is that this implementation has no clear stop condition for the recursion.

A careful reader might notice, that if we omit the limit for the number of times a replacement may happen, we could also remove the ref int start parameter and the our recursion as well. With these changes, the code would not only become smaller, but faster too. Repeating the same performance tests as before, it is easy to show, that not only the allocated memory, but the mean execution time got better. Even better, this implementation is now faster to Solution 0, while using only half of the allocated memory.

|     Method |     Mean |     Error |    StdDev |   Median |  Gen 0 | Allocated |
|----------- |---------:|----------:|----------:|---------:|-------:|----------:|
| Solution1* | 4.190 us | 0.1051 us | 0.3065 us | 4.093 us | 1.8234 |      6 KB |

Can we further improve the performance? The next solution will incorporate all these findings and builds on top of it.

Solution 2

Solution 2 takes the ideas of the Solution 1, and adds further improvements. This time, let's examine how and when the regular expression is instantiated. Solution 0 found that each time a new input string is used a new Regex object is created. Solution 2 implementation addresses this problem by having a constructor, which creates a single Regex object and stores it in a field. The regex creation is also finetuned by passing a Compiled argument. The Compiled option compiles the regular expression into IL code to improve the performance. In one hand this is a cheat because benchmarks exclude the constructor of the Solution X class, on the other hand this solution uses the realization that the regex creation may be amortized across all invocations.

Notice, that Solution 2 takes all the learnings from Solution 1, by removing the recursion and the limit when invoking the Regex.Replace() method.

Finally, this solution moves the placeholders map parameter from the Substitute() method to the constructor. This way the API of the class lends itself closer to the intent of use: use the same set of placeholders on multiple input strings.

Benchmarks:

|    Method |        Mean |     Error |      StdDev |  Gen 0 | Allocated |
|---------- |------------:|----------:|------------:|-------:|----------:|
| Solution0 |  7,494.9 ns | 298.81 ns |   866.91 ns | 3.6850 |     11 KB |
| Solution1 | 24,029.0 ns | 763.83 ns | 2,252.18 ns | 8.4381 |     26 KB |
| Solution2 |  1,681.4 ns |  79.69 ns |   234.97 ns | 0.8831 |      3 KB |

The changes applied in this solution further improves the performance and memory utilization. Notice, that with these minor changes, Solution 2 is ~4.5 times faster and allocates 3 times less memory compared to Solution 0. Does this improvement matter, after all we are talking about nanoseconds? In a cloud native application, such as an Azure Function which is billed by gigabyte seconds (GB-s), where a service generating messages based on templates, processing 4 times more request with third of the memory consumption can reduce costs.

Solution 3

The last implementation discussed in this post is using .NET 7 and Regex Generators. This is a new feature in .NET which generates C# code for the parsing the regular expression at build time. As this code is based on a pre-released API, for the final release of .NET 7. Today a static partial method should be defined with Regex return type. It also has to be attributed with [RegexGenerator], which instructs the System.Text.RegularExpressions.Generator analyzer to generate the source code file for the regular expression pattern defined in the attribute's argument.

Looking at the benchmarks, this approach yields a similar performance to Solution 2. In this particular benchmark the gap seems to be larger, but in repated benchmarks, Solution 2 and Solution 3 seems to have a very close performance characteristics related to the Replace() method call.

|    Method |        Mean |     Error |      StdDev |  Gen 0 | Allocated |
|---------- |------------:|----------:|------------:|-------:|----------:|
| Solution0 |  7,494.9 ns | 298.81 ns |   866.91 ns | 3.6850 |     11 KB |
| Solution1 | 24,029.0 ns | 763.83 ns | 2,252.18 ns | 8.4381 |     26 KB |
| Solution2 |  1,681.4 ns |  79.69 ns |   234.97 ns | 0.8831 |      3 KB |
| Solution3 |  1,047.4 ns |  12.76 ns |    11.94 ns | 0.8831 |      3 KB |

The benefit of Solution 3 over Solution 2 should be reflected in the performance of the classes' constructors. While Solution 2 emits the compiled code for parsing the regular expression at runtime, Solution 3 does that at compile time. This shows significant gain for both allocated memory as well as the mean execution time.

|    Method |         Mean |       Error |      StdDev |  Gen 0 | Allocated |
|---------- |-------------:|------------:|------------:|-------:|----------:|
| Solution2 | 8,637.163 ns | 172.7103 ns | 454.9879 ns | 2.5330 |   7,984 B |
| Solution3 |     4.301 ns |   0.1521 ns |   0.3240 ns | 0.0077 |      24 B |

An alternative to this approach would Solution 2 storing the Regex in a static field. With a static field we would only pay the cost of the regular expression creation at the first time the type is accessed.

Conclusion

This post covers all solution where I used regular expressions to solve the initial tasks. Throughtout the iterations, the goal has been to gradually improve the performance both from mean execution time and allocated memory's point of view. This post started from a very naive implementation and focused on how to replace the string, how to match the regular expression and how to instantiate the regular expression.

In the next post I will look into more exciting, better performing alternative solutions. The final post of this series will cover the same topic with the further requirement of handling recursive placeholders.