struct State<'a> {
    fail: StateID,
    match_len: usize,
    trans: StateTrans<'a>,
}Expand description
The “in memory” representation a single dense or sparse state.
A State’s in memory representation is not ever actually materialized
during a search with a contiguous NFA. Doing so would be too slow. (Indeed,
the only time a State is actually constructed is in Debug impls.)
Instead, a State exposes a number of static methods for reading certain
things from the raw binary encoding of the state.
Fields§
§fail: StateIDThe state to transition to when ‘class_to_next’ yields a transition to the FAIL state.
match_len: usizeThe number of pattern IDs in this state. For a non-match state, this is always zero. Otherwise it is always bigger than zero.
trans: StateTrans<'a>The sparse or dense representation of the transitions for this state.
Implementations§
Source§impl<'a> State<'a>
 
impl<'a> State<'a>
Sourceconst KIND: usize = 0usize
 
const KIND: usize = 0usize
The offset of where the “kind” of a state is stored. If it isn’t one of the sentinel values below, then it’s a sparse state and the kind corresponds to the number of transitions in the state.
Sourceconst KIND_DENSE: u32 = 255u32
 
const KIND_DENSE: u32 = 255u32
A sentinel value indicating that the state uses a dense representation.
Sourceconst KIND_ONE: u32 = 254u32
 
const KIND_ONE: u32 = 254u32
A sentinel value indicating that the state uses a special “one transition” encoding. In practice, non-match states with one transition make up the overwhelming majority of all states in any given Aho-Corasick automaton, so we can specialize them using a very compact representation.
Sourceconst MAX_SPARSE_TRANSITIONS: usize = 127usize
 
const MAX_SPARSE_TRANSITIONS: usize = 127usize
The maximum number of transitions to encode as a sparse state. Usually states with a lot of transitions are either very rare, or occur near the start state. In the latter case, they are probably dense already anyway. In the former case, making them dense is fine because they’re rare.
This needs to be small enough to permit each of the sentinel values for ‘KIND’ above. Namely, a sparse state embeds the number of transitions into the ‘KIND’. Basically, “sparse” is a state kind too, but it’s the “else” branch.
N.B. There isn’t anything particularly magical about 127 here. I just picked it because I figured any sparse state with this many transitions is going to be exceptionally rare, and if it did have this many transitions, then it would be quite slow to do a linear scan on the transitions during a search anyway.
Sourcefn remap(
    alphabet_len: usize,
    old_to_new: &[StateID],
    state: &mut [u32],
) -> Result<(), BuildError>
 
fn remap( alphabet_len: usize, old_to_new: &[StateID], state: &mut [u32], ) -> Result<(), BuildError>
Remap state IDs in-place.
state should be the the raw binary encoding of a state. (The start
of the slice must correspond to the start of the state, but the slice
may extend past the end of the encoding of the state.)
Sourcefn len(alphabet_len: usize, is_match: bool, state: &[u32]) -> usize
 
fn len(alphabet_len: usize, is_match: bool, state: &[u32]) -> usize
Returns the length, in number of u32s, of this state.
This is useful for reading states consecutively, e.g., in the Debug impl without needing to store a separate map from state index to state identifier.
state should be the the raw binary encoding of a state. (The start
of the slice must correspond to the start of the state, but the slice
may extend past the end of the encoding of the state.)
Sourcefn sparse_trans_len(state: &[u32]) -> usize
 
fn sparse_trans_len(state: &[u32]) -> usize
Get the number of sparse transitions in this state. This can never be more than State::MAX_SPARSE_TRANSITIONS, as all states with more transitions are encoded as dense states.
state should be the the raw binary encoding of a sparse state. (The
start of the slice must correspond to the start of the state, but the
slice may extend past the end of the encoding of the state.) If this
isn’t a sparse state, then the return value is unspecified.
Do note that this is only legal to call on a sparse state. So for example, “one transition” state is not a sparse state, so it would not be legal to call this method on such a state.
Sourcefn match_len(alphabet_len: usize, state: &[u32]) -> usize
 
fn match_len(alphabet_len: usize, state: &[u32]) -> usize
Returns the total number of matching pattern IDs in this state. Calling this on a state that isn’t a match results in unspecified behavior. Thus, the returned number is never 0 for all correct calls.
state should be the the raw binary encoding of a state. (The start
of the slice must correspond to the start of the state, but the slice
may extend past the end of the encoding of the state.)
Sourcefn match_pattern(alphabet_len: usize, state: &[u32], index: usize) -> PatternID
 
fn match_pattern(alphabet_len: usize, state: &[u32], index: usize) -> PatternID
Returns the pattern ID corresponding to the given index for the state
given. The index provided must be less than the number of pattern IDs
in this state.
state should be the the raw binary encoding of a state. (The start of
the slice must correspond to the start of the state, but the slice may
extend past the end of the encoding of the state.)
If the given state is not a match state or if the index is out of bounds, then this has unspecified behavior.
Sourcefn read(alphabet_len: usize, is_match: bool, state: &'a [u32]) -> State<'a>
 
fn read(alphabet_len: usize, is_match: bool, state: &'a [u32]) -> State<'a>
Read a state’s binary encoding to its in-memory representation.
alphabet_len should be the total number of transitions defined for
dense states.
is_match should be true if this state is a match state and false
otherwise.
state should be the the raw binary encoding of a state. (The start
of the slice must correspond to the start of the state, but the slice
may extend past the end of the encoding of the state.)
Sourcefn write(
    nnfa: &NFA,
    oldsid: StateID,
    old: &State,
    classes: &ByteClasses,
    dst: &mut Vec<u32>,
    force_dense: bool,
) -> Result<StateID, BuildError>
 
fn write( nnfa: &NFA, oldsid: StateID, old: &State, classes: &ByteClasses, dst: &mut Vec<u32>, force_dense: bool, ) -> Result<StateID, BuildError>
Encode the “old” state from a noncontiguous NFA to its binary
representation to the given dst slice. classes should be the byte
classes computed for the noncontiguous NFA that the given state came
from.
This returns an error if dst became so big that StateIDs can no
longer be created for new states. Otherwise, it returns the state ID of
the new state created.
When force_dense is true, then the encoded state will always use a
dense format. Otherwise, the choice between dense and sparse will be
automatically chosen based on the old state.
Sourcefn write_sparse_trans(
    nnfa: &NFA,
    oldsid: StateID,
    classes: &ByteClasses,
    dst: &mut Vec<u32>,
) -> Result<(), BuildError>
 
fn write_sparse_trans( nnfa: &NFA, oldsid: StateID, classes: &ByteClasses, dst: &mut Vec<u32>, ) -> Result<(), BuildError>
Encode the “old” state transitions from a noncontiguous NFA to its
binary sparse representation to the given dst slice. classes should
be the byte classes computed for the noncontiguous NFA that the given
state came from.
This returns an error if dst became so big that StateIDs can no
longer be created for new states.
Sourcefn write_dense_trans(
    nnfa: &NFA,
    oldsid: StateID,
    classes: &ByteClasses,
    dst: &mut Vec<u32>,
) -> Result<(), BuildError>
 
fn write_dense_trans( nnfa: &NFA, oldsid: StateID, classes: &ByteClasses, dst: &mut Vec<u32>, ) -> Result<(), BuildError>
Encode the “old” state transitions from a noncontiguous NFA to its
binary dense representation to the given dst slice. classes should
be the byte classes computed for the noncontiguous NFA that the given
state came from.
This returns an error if dst became so big that StateIDs can no
longer be created for new states.
Sourcefn transitions(&self) -> impl Iterator<Item = (u8, StateID)> + '_
 
fn transitions(&self) -> impl Iterator<Item = (u8, StateID)> + '_
Return an iterator over every explicitly defined transition in this state.