Module regex_automata::util::interpolate
Provides routines for interpolating capture group references.
That is, if a replacement string contains references like $foo or ${foo1}, then they are replaced with the corresponding capture values for the groups named foo and foo1, respectively. Similarly, syntax like $1 and ${1} is supported as well, with 1 corresponding to a capture group index and not a name.
This module provides the free functions string and bytes, which interpolate Rust Unicode strings and byte strings, respectively.
Format
These routines support two different kinds of capture references: unbraced and braced.
For the unbraced format, the format supported is $ref where ref can be one or more characters in the class [0-9A-Za-z_]. ref is always the longest possible parse. So for example, $1a corresponds to the capture group named 1a and not the capture group at index 1. If ref matches ^[0-9]+$, then it is treated as a capture group index itself and not a name.
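The longest-parse rule for the unbraced format can be sketched in plain Rust. This is an illustration only, not this module's code: `SketchRef`, `is_ref_byte` and `parse_unbraced` are hypothetical names, and the sketch assumes the input is a `&str` that starts at the `$`.

```rust
/// A parsed capture reference: either a group index or a group name.
/// (Hypothetical type for this sketch, not the module's own `Ref`.)
#[derive(Debug, PartialEq)]
enum SketchRef<'a> {
    Index(usize),
    Name(&'a str),
}

/// Returns true if `b` is in the class [0-9A-Za-z_].
fn is_ref_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Parses an unbraced reference at the start of `text`.
/// Returns the reference and the number of bytes consumed, including `$`.
fn parse_unbraced(text: &str) -> Option<(SketchRef<'_>, usize)> {
    let rest = text.strip_prefix('$')?;
    // Take the longest possible run of [0-9A-Za-z_].
    let len = rest.bytes().take_while(|&b| is_ref_byte(b)).count();
    if len == 0 {
        return None;
    }
    let name = &rest[..len];
    // An all-digit reference is an index; anything else is a name.
    let all_digits = name.bytes().all(|b| b.is_ascii_digit());
    let reference = match name.parse::<usize>() {
        Ok(i) if all_digits => SketchRef::Index(i),
        _ => SketchRef::Name(name),
    };
    Some((reference, 1 + len))
}
```

With this sketch, `parse_unbraced("$1a")` yields the name `1a` rather than index 1, because the run `1a` is consumed in full before it is classified.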
For the braced format, the format supported is ${ref} where ref can be any sequence of bytes except for }. If no closing brace occurs, then it is not considered a capture reference. As with the unbraced format, if ref matches ^[0-9]+$, then it is treated as a capture group index and not a name.
The braced format is useful for exerting precise control over the name of the capture reference. For example, ${1}a corresponds to the capture group reference 1 followed by the letter a, whereas $1a (as mentioned above) corresponds to the capture group reference 1a. The braced format is also useful for expressing capture group names that use characters not supported by the unbraced format. For example, ${foo[bar].baz} refers to the capture group named foo[bar].baz.
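A minimal sketch of the braced rule, under the simplifying assumption that the replacement is a `&str` rather than arbitrary bytes. `parse_braced` and `is_index` are hypothetical helpers written for this illustration and are not part of the module's API.

```rust
/// Parses a braced reference such as `${foo[bar].baz}` at the start of `text`.
/// Returns the text between the braces and the number of bytes consumed,
/// or `None` if `text` does not start with `${` or has no closing brace.
fn parse_braced(text: &str) -> Option<(&str, usize)> {
    let rest = text.strip_prefix("${")?;
    // Everything up to the first `}` is the reference.
    let end = rest.find('}')?;
    // Consumed: "${" (2 bytes) + reference + "}" (1 byte).
    Some((&rest[..end], 2 + end + 1))
}

/// Returns true if `reference` matches ^[0-9]+$, that is, if it should be
/// treated as a capture group index rather than a name.
fn is_index(reference: &str) -> bool {
    !reference.is_empty() && reference.bytes().all(|b| b.is_ascii_digit())
}
```

Here `parse_braced("${1}a")` stops at the closing brace and returns the reference `1`, leaving the trailing `a` as literal text, while `parse_braced("${unclosed")` returns `None`.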
If a capture group reference is found and it does not refer to a valid capture group, then it will be replaced with the empty string.
To write a literal $, use $$.
To be clear, and as exhibited via the type signatures in the routines in this module, it is impossible for a replacement string to be invalid. A replacement string may not have the intended semantics, but the interpolation procedure itself can never fail.
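The rules above can be combined into one small, infallible routine. The following is a sketch under stated assumptions, not the implementation of string or bytes: `interpolate_sketch` is a hypothetical function, capture values are passed as a slice indexed by group number, and names are resolved through a `HashMap`. It returns a plain `String` rather than a `Result`, mirroring the point that interpolation cannot fail.

```rust
use std::collections::HashMap;

/// Hypothetical helper: expands `$ref`, `${ref}` and `$$` in `replacement`.
/// `by_index[i]` is the text of group `i`; `by_name` maps names to indices.
fn interpolate_sketch(
    replacement: &str,
    by_index: &[&str],
    by_name: &HashMap<&str, usize>,
) -> String {
    let mut out = String::new();
    let mut rest = replacement;
    while let Some(pos) = rest.find('$') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos..];
        // `$$` is an escaped literal dollar sign.
        if let Some(tail) = rest.strip_prefix("$$") {
            out.push('$');
            rest = tail;
            continue;
        }
        // Try the braced form first, then the unbraced form.
        let parsed = if let Some(body) = rest.strip_prefix("${") {
            body.find('}').map(|end| (&body[..end], 2 + end + 1))
        } else {
            let len = rest[1..]
                .bytes()
                .take_while(|b| b.is_ascii_alphanumeric() || *b == b'_')
                .count();
            if len == 0 { None } else { Some((&rest[1..1 + len], 1 + len)) }
        };
        match parsed {
            // Not a capture reference: keep the `$` as literal text.
            None => {
                out.push('$');
                rest = &rest[1..];
            }
            Some((name, consumed)) => {
                // All-digit references are indices; others are names.
                let digits = !name.is_empty() && name.bytes().all(|b| b.is_ascii_digit());
                let index = if digits {
                    name.parse::<usize>().ok()
                } else {
                    by_name.get(name).copied()
                };
                // References to unknown groups expand to the empty string.
                if let Some(text) = index.and_then(|i| by_index.get(i)) {
                    out.push_str(text);
                }
                rest = &rest[consumed..];
            }
        }
    }
    out.push_str(rest);
    out
}
```

For example, with groups `["hello world", "hello"]` and the name `word` mapped to index 1, the replacement `[$word] [${1}a] [$1a] [$$] [$9]` expands to `[hello] [helloa] [] [$] []` in this sketch: `$1a` and `$9` refer to no known group and so contribute nothing.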
Structs
- CaptureRef 🔒 CaptureRef represents a reference to a capture group inside some text. The reference is either a capture group name or a number.
Enums
- Ref 🔒 A reference to a capture group in some text.
Functions
- bytes Accepts a replacement byte string and interpolates capture references with their corresponding values.
- find_cap_ref 🔒 Parses a possible reference to a capture group name in the given text, starting at the beginning of replacement.
- find_cap_ref_braced 🔒 Looks for a braced reference, e.g., ${foo1}. This assumes that an opening brace has been found at i-1 in rep. This then looks for a closing brace and returns the capture reference within the brace.
- is_valid_cap_letter 🔒 Returns true if and only if the given byte is allowed in a capture name written in non-brace form.
- string Accepts a replacement string and interpolates capture references with their corresponding values.