Module regex_automata::util::interpolate

Provides routines for interpolating capture group references.

That is, if a replacement string contains references like $foo or ${foo1}, then they are replaced with the corresponding capture values for the groups named foo and foo1, respectively. Similarly, syntax like $1 and ${1} is supported as well, with 1 corresponding to a capture group index and not a name.

This module provides the free functions string and bytes, which interpolate Rust Unicode strings and byte strings, respectively.

§Format

These routines support two different kinds of capture references: unbraced and braced.

For the unbraced format, the format supported is $ref where ref is one or more characters from the class [0-9A-Za-z_]. ref is always the longest possible parse. So for example, $1a corresponds to the capture group named 1a and not the capture group at index 1. If ref matches ^[0-9]+$, then it is treated as a capture group index and not a name.
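
The longest-parse rule for unbraced references can be illustrated with a minimal sketch. This is not the crate's implementation: the ParsedRef type and the parse_unbraced function are hypothetical names introduced here, and treating an all-digit reference that overflows usize as a name is an assumption of the sketch.

```rust
/// Hypothetical result type for this sketch: a group index or a group name.
#[derive(Debug, PartialEq)]
enum ParsedRef<'a> {
    Number(usize),
    Named(&'a str),
}

/// Parses an unbraced reference at the start of `text`, which must begin
/// with `$`. Returns the reference and the bytes consumed, including `$`.
fn parse_unbraced(text: &str) -> Option<(ParsedRef<'_>, usize)> {
    let rest = text.strip_prefix('$')?;
    // Longest possible parse: take every leading byte in [0-9A-Za-z_].
    let len = rest
        .bytes()
        .take_while(|b| b.is_ascii_alphanumeric() || *b == b'_')
        .count();
    if len == 0 {
        return None;
    }
    let name = &rest[..len];
    // An all-digit reference is an index; anything else is a name.
    // (Assumption of this sketch: digits that overflow usize become a name.)
    let all_digits = name.bytes().all(|b| b.is_ascii_digit());
    let parsed = match name.parse::<usize>() {
        Ok(index) if all_digits => ParsedRef::Number(index),
        _ => ParsedRef::Named(name),
    };
    Some((parsed, len + 1))
}
```

With this sketch, parse_unbraced("$1a") yields the name 1a rather than index 1, while parse_unbraced("$1") yields index 1.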

For the braced format, the format supported is ${ref} where ref can be any sequence of bytes except for }. If no closing brace occurs, then it is not considered a capture reference. As with the unbraced format, if ref matches ^[0-9]+$, then it is treated as a capture group index and not a name.

The braced format is useful for exerting precise control over the name of the capture reference. For example, ${1}a corresponds to the capture group reference 1 followed by the letter a, whereas $1a (as mentioned above) corresponds to the capture group reference 1a. The braced format is also useful for expressing capture group names that use characters not supported by the unbraced format. For example, ${foo[bar].baz} refers to the capture group named foo[bar].baz.
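
A minimal sketch of braced parsing, under the rules described above, might look like the following. It is not the crate's implementation: ParsedRef and parse_braced are hypothetical names, and the sketch works on &str rather than raw bytes for brevity. It checks for ASCII digits explicitly so that only references matching ^[0-9]+$ are treated as indices.

```rust
/// Hypothetical result type for this sketch: a group index or a group name.
#[derive(Debug, PartialEq)]
enum ParsedRef<'a> {
    Number(usize),
    Named(&'a str),
}

/// Parses a braced reference at the start of `text`, which must begin with
/// `${`. Returns the reference and the bytes consumed, including `${` and `}`.
fn parse_braced(text: &str) -> Option<(ParsedRef<'_>, usize)> {
    let inner = text.strip_prefix("${")?;
    // Without a closing brace, this is not a capture reference at all.
    let close = inner.find('}')?;
    let name = &inner[..close];
    // Only a non-empty run of ASCII digits counts as an index.
    let all_digits = !name.is_empty() && name.bytes().all(|b| b.is_ascii_digit());
    let parsed = match name.parse::<usize>() {
        Ok(index) if all_digits => ParsedRef::Number(index),
        _ => ParsedRef::Named(name),
    };
    // Two bytes for `${`, then the name, then one byte for `}`.
    Some((parsed, close + 3))
}
```

Here parse_braced("${1}a") stops after the closing brace and yields index 1, leaving the trailing a as literal text, and parse_braced("${foo[bar].baz}") yields the name foo[bar].baz.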

If a capture group reference is found and it does not refer to a valid capture group, then it will be replaced with the empty string.

To write a literal $, use $$.
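
The rules above (missing groups become the empty string, $$ becomes a literal $, and a $ that does not start a reference is copied through) can be put together in a small standalone sketch. This is an illustration only, not the crate's code: interpolate, parse_ref, and the groups/names parameters are hypothetical, with groups holding the text of each capture group by index and names mapping group names to indices.

```rust
use std::collections::HashMap;

/// Returns the reference text and the bytes consumed (including `$` and any
/// braces) if `text` starts with a capture reference.
fn parse_ref(text: &str) -> Option<(&str, usize)> {
    let rest = text.strip_prefix('$')?;
    if let Some(inner) = rest.strip_prefix('{') {
        // Braced form: everything up to the first `}`.
        let close = inner.find('}')?;
        return Some((&inner[..close], close + 3));
    }
    // Unbraced form: the longest run of bytes in [0-9A-Za-z_].
    let len = rest
        .bytes()
        .take_while(|b| b.is_ascii_alphanumeric() || *b == b'_')
        .count();
    if len == 0 { None } else { Some((&rest[..len], len + 1)) }
}

/// Expands capture references in `rep`. This function has no error path.
fn interpolate(
    rep: &str,
    groups: &[Option<&str>],
    names: &HashMap<&str, usize>,
) -> String {
    let mut dst = String::new();
    let mut rest = rep;
    while let Some(pos) = rest.find('$') {
        dst.push_str(&rest[..pos]);
        rest = &rest[pos..];
        // `$$` is an escaped literal dollar sign.
        if let Some(after) = rest.strip_prefix("$$") {
            dst.push('$');
            rest = after;
            continue;
        }
        match parse_ref(rest) {
            Some((name, consumed)) => {
                let is_index =
                    !name.is_empty() && name.bytes().all(|b| b.is_ascii_digit());
                let index = if is_index {
                    name.parse::<usize>().ok()
                } else {
                    names.get(name).copied()
                };
                // Unknown or non-participating groups contribute nothing.
                if let Some(value) = index.and_then(|i| groups.get(i).copied().flatten()) {
                    dst.push_str(value);
                }
                rest = &rest[consumed..];
            }
            None => {
                // Not a reference: keep the `$` as literal text.
                dst.push('$');
                rest = &rest[1..];
            }
        }
    }
    dst.push_str(rest);
    dst
}
```

For example, with groups ["John Smith", "John", "Smith"] and names first = 1 and last = 2, the replacement "$last, ${first}" expands to "Smith, John", "$1a|${1}a" expands to "|Johna" because no group is named 1a, and "cost: $$5" expands to "cost: $5".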

To be clear, and as exhibited via the type signatures in the routines in this module, it is impossible for a replacement string to be invalid. A replacement string may not have the intended semantics, but the interpolation procedure itself can never fail.

Structs§

  • CaptureRef 🔒
    CaptureRef represents a reference to a capture group inside some text. The reference is either a capture group name or a number.

Enums§

  • Ref 🔒
    A reference to a capture group in some text.

Functions§

  • bytes
    Accepts a replacement byte string and interpolates capture references with their corresponding values.
  • find_cap_ref 🔒
    Parses a possible reference to a capture group name in the given text, starting at the beginning of replacement.
  • Looks for a braced reference, e.g., ${foo1}. This assumes that an opening brace has been found at i-1 in rep. This then looks for a closing brace and returns the capture reference within the brace.
  • Returns true if and only if the given byte is allowed in a capture name written in non-brace form.
  • string
    Accepts a replacement string and interpolates capture references with their corresponding values.