Function regex_automata::hybrid::search::prefilter_restart

source ·
fn prefilter_restart(
    dfa: &DFA,
    cache: &mut Cache,
    input: &Input<'_>,
    at: usize,
) -> Result<LazyStateID, MatchError>
Expand description

Re-compute the starting state that a DFA should be in after finding a prefilter candidate match at the position at.

It is always correct to call this, but not always necessary. Namely, whenever the DFA has a universal start state, the DFA can remain in the start state that it was in when it ran the prefilter. Why? Because in that case, there is only one start state.

When does a DFA have a universal start state? In precisely cases where it has no look-around assertions in its prefix. So for example, \bfoo does not have a universal start state because the start state depends on whether the byte immediately before the start position is a word byte or not. However, foo\b does have a universal start state because the word boundary does not appear in the pattern’s prefix.

So… most cases don’t need this, but when a pattern doesn’t have a universal start state, then after a prefilter candidate has been found, the current state must be re-litigated as if computing the start state at the beginning of the search because it might change. That is, not all start states are created equal.

Why avoid it? Because while it’s not super expensive, it isn’t a trivial operation to compute the start state. It is much better to avoid it and just state in the current state if you know it to be correct.