Enum regex_automata::util::start::Start
pub(crate) enum Start {
NonWordByte = 0,
WordByte = 1,
Text = 2,
LineLF = 3,
LineCR = 4,
CustomLineTerminator = 5,
}
Represents the six possible starting configurations of a DFA search.
The starting configuration is determined by inspecting the beginning of the haystack (up to 1 byte). Ultimately, this along with a pattern ID (if specified) and the type of search (anchored or not) is what selects the start state to use in a DFA.
As one example, if a DFA only supports unanchored searches and does not support anchored searches for each pattern, then it will have at most 6 distinct start states. (Some start states may be reused if determinization can determine that they will be equivalent.) If the DFA supports both anchored and unanchored searches, then it will have a maximum of 12 distinct start states. Finally, if the DFA also supports anchored searches for each pattern, then it can have up to 12 + (N * 6) start states, where N is the number of patterns.
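The arithmetic above can be sketched as a small helper. This is an illustration only: the Support enum and the max_start_states function are hypothetical names that do not exist in regex-automata, and the result is an upper bound, since determinization may reuse equivalent start states.

```rust
/// Number of starting configurations described by the `Start` enum.
const START_CONFIGS: usize = 6;

/// Hypothetical description of which search kinds a DFA was built to
/// support. This type is an assumption made for the example.
#[derive(Clone, Copy, Debug)]
enum Support {
    /// Only unanchored searches.
    UnanchoredOnly,
    /// Both anchored and unanchored searches.
    AnchoredAndUnanchored,
    /// Both of the above, plus anchored searches for each pattern.
    AnchoredAndUnanchoredPerPattern { patterns: usize },
}

/// Upper bound on the number of distinct start states.
fn max_start_states(support: Support) -> usize {
    match support {
        Support::UnanchoredOnly => START_CONFIGS,
        Support::AnchoredAndUnanchored => 2 * START_CONFIGS,
        Support::AnchoredAndUnanchoredPerPattern { patterns } => {
            // 12 + (N * 6), where N is the number of patterns.
            2 * START_CONFIGS + patterns * START_CONFIGS
        }
    }
}
```

For example, a DFA built from three patterns with per-pattern anchored searches enabled would have at most 12 + (3 * 6) = 30 start states under this model.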
Handling each of these starting configurations in the context of DFA determinization can be quite tricky and subtle. But the code is small and can be found at crate::util::determinize::set_lookbehind_from_start.
Variants
NonWordByte = 0
This occurs when the starting position is not any of the ones below.
WordByte = 1
This occurs when the byte immediately preceding the start of the search is an ASCII word byte.
Text = 2
This occurs when the starting position of the search corresponds to the beginning of the haystack.
LineLF = 3
This occurs when the byte immediately preceding the start of the search is a line terminator. Specifically, \n.
LineCR = 4
This occurs when the byte immediately preceding the start of the search is a line terminator. Specifically, \r.
CustomLineTerminator = 5
This occurs when a custom line terminator has been set via a LookMatcher, and when that line terminator is neither \r nor \n.
If the custom line terminator is a word byte, then this start configuration is still selected. DFAs that implement word boundary assertions will likely need to check whether the custom line terminator is a word byte, in which case the DFA should behave as if the byte satisfies \b in addition to the multi-line anchors.
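The variant descriptions above can be read as a classification of the byte immediately preceding the search start. The following is a minimal sketch of that reading, not the crate's implementation: the classify function and its custom_line_terminator parameter are hypothetical, and the enum is a standalone copy made so the example compiles on its own.

```rust
/// Standalone copy of the enum, mirroring the variants documented above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Start {
    NonWordByte = 0,
    WordByte = 1,
    Text = 2,
    LineLF = 3,
    LineCR = 4,
    CustomLineTerminator = 5,
}

/// Hypothetical helper: pick a starting configuration by looking at no more
/// than one byte of the haystack, namely the byte just before `start`.
///
/// Panics if `start` is greater than `haystack.len()`.
fn classify(
    haystack: &[u8],
    start: usize,
    custom_line_terminator: Option<u8>,
) -> Start {
    // Searching from the beginning of the haystack.
    if start == 0 {
        return Start::Text;
    }
    let prev = haystack[start - 1];
    match prev {
        b'\n' => Start::LineLF,
        b'\r' => Start::LineCR,
        // Checked before the word-byte arm, so a custom terminator that is
        // also a word byte still selects this configuration.
        b if custom_line_terminator == Some(b) => Start::CustomLineTerminator,
        // ASCII word bytes: [0-9A-Za-z_].
        b if b.is_ascii_alphanumeric() || b == b'_' => Start::WordByte,
        _ => Start::NonWordByte,
    }
}
```

Under these assumptions, classify(b"aZb", 2, Some(b'Z')) selects CustomLineTerminator even though Z is a word byte, which is the case the last variant's documentation warns about.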
Implementations
impl Start
pub(crate) fn from_usize(n: usize) -> Option<Start>
Return the starting configuration corresponding to the given integer. If no starting configuration exists for the given integer, then None is returned.
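A conversion with this signature could plausibly be written as a match on the discriminants listed in the enum definition. The sketch below uses a standalone copy of the enum and is an assumption about the shape of the function, not the crate's actual body.

```rust
/// Standalone copy of the enum with the documented discriminants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Start {
    NonWordByte = 0,
    WordByte = 1,
    Text = 2,
    LineLF = 3,
    LineCR = 4,
    CustomLineTerminator = 5,
}

impl Start {
    /// Map an integer back to its starting configuration, or return `None`
    /// when the integer is not a valid discriminant.
    fn from_usize(n: usize) -> Option<Start> {
        match n {
            0 => Some(Start::NonWordByte),
            1 => Some(Start::WordByte),
            2 => Some(Start::Text),
            3 => Some(Start::LineLF),
            4 => Some(Start::LineCR),
            5 => Some(Start::CustomLineTerminator),
            _ => None,
        }
    }
}
```

Casting a variant with `as usize` and passing the result to from_usize returns the same variant, while any integer of 6 or more yields None.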