Module meta

Provides a regex matcher that composes several other regex matchers automatically.

This module is home to a meta Regex, which provides a convenient high level API for executing regular expressions in linear time.

§Comparison with the regex crate

A meta Regex is the implementation used directly by the regex crate. Indeed, the regex crate API is essentially just a light wrapper over a meta Regex. This means that if you need the full flexibility offered by this API, then you should be able to switch to using this API directly without any changes in match semantics or syntax. However, there are some API level differences:

  • The regex crate API returns match objects that include references to the haystack itself, which in turn makes it easy to access the matching strings without having to slice the haystack yourself. In contrast, a meta Regex returns match objects that only have offsets in them.
  • At time of writing, a meta Regex doesn’t have some of the convenience routines that the regex crate has, such as replacements. Note though that Captures::interpolate_string will handle the replacement string interpolation for you.
  • A meta Regex supports the Input abstraction, which provides a way to configure a search in more ways than is supported by the regex crate. For example, Input::anchored can be used to run an anchored search, regardless of whether the pattern is itself anchored with a ^.
  • A meta Regex supports multi-pattern searching everywhere. Indeed, every Match returned by the search APIs includes a PatternID indicating which pattern matched. In the single pattern case, all matches correspond to PatternID::ZERO. In contrast, the regex crate has distinct Regex and RegexSet APIs. The former only supports a single pattern, while the latter supports multiple patterns but cannot report the offsets of a match.
  • A meta Regex provides the explicit capability of bypassing its internal memory pool for automatically acquiring mutable scratch space required by its internal regex engines. Namely, a Cache can be explicitly provided to lower level routines such as Regex::search_with.

Modules§

error 🔒
limited 🔒
This module defines two bespoke reverse DFA searching routines. (One for the lazy DFA and one for the fully compiled DFA.) These routines differ from the usual ones by permitting the caller to specify a minimum starting position. That is, the search will begin at input.end() and will usually stop at input.start(), unless min_start > input.start(), in which case, the search will stop at min_start.
literal 🔒
regex 🔒
reverse_inner 🔒
A module dedicated to plucking inner literals out of a regex pattern, and then constructing a prefilter for them. We also include a regex pattern “prefix” that corresponds to the bits of the regex that need to match before the literals do. The reverse inner optimization then proceeds by looking for matches of the inner literal(s), and then doing a reverse search of the prefix from the start of the literal match to find the overall start position of the match.
stopat 🔒
This module defines two bespoke forward DFA search routines. One for the lazy DFA and one for the fully compiled DFA. These routines differ from the normal ones by reporting the position at which the search terminates when a match isn’t found.
strategy 🔒
wrappers 🔒
This module contains a boat load of wrappers around each of our internal regex engines. They encapsulate a few things:

Structs§

BuildError
An error that occurs when construction of a Regex fails.
Builder
A builder for configuring and constructing a Regex.
Cache
Represents mutable scratch space used by regex engines during a search.
CapturesMatches
An iterator over all non-overlapping leftmost matches with their capturing groups.
Config
An object describing the configuration of a Regex.
FindMatches
An iterator over all non-overlapping matches.
Regex
A regex matcher that works by composing several other regex matchers automatically.
Split
Yields all substrings delimited by a regular expression match.
SplitN
Yields at most N spans delimited by a regular expression match.