Struct DFA

Help

pub struct DFA {Show 13 fields
    trans: Vec<StateID>,
    matches: Vec<Vec<PatternID>>,
    matches_memory_usage: usize,
    pattern_lens: Vec<SmallIndex>,
    prefilter: Option<Prefilter>,
    match_kind: MatchKind,
    state_len: usize,
    alphabet_len: usize,
    stride2: usize,
    byte_classes: ByteClasses,
    min_pattern_len: usize,
    max_pattern_len: usize,
    special: Special,
}

Expand description

A DFA implementation of Aho-Corasick.

When possible, prefer using AhoCorasick instead of this type directly. Using a DFA directly is typically only necessary when one needs access to the Automaton trait implementation.

This DFA can only be built by first constructing a noncontiguous::NFA. Both DFA::new and Builder::build do this for you automatically, but Builder::build_from_noncontiguous permits doing it explicitly.

A DFA provides the best possible search performance (in this crate) via two mechanisms:

All states use a dense representation for their transitions.
All failure transitions are pre-computed such that they are never explicitly handled at search time.

These two facts combined mean that every state transition is performed using a constant number of instructions. However, this comes at great cost. The memory usage of a DFA can be quite exorbitant. It is potentially multiple orders of magnitude greater than a contiguous::NFA for example. In exchange, a DFA will typically have better search speed than a contiguous::NFA, but not by orders of magnitude.

Unless you have a small number of patterns or memory usage is not a concern and search performance is critical, a DFA is usually not the best choice.

Moreover, unlike the NFAs in this crate, it is costly for a DFA to support for anchored and unanchored search configurations. Namely, since failure transitions are pre-computed, supporting both anchored and unanchored searches requires a duplication of the transition table, making the memory usage of such a DFA ever bigger. (The NFAs in this crate unconditionally support both anchored and unanchored searches because there is essentially no added cost for doing so.) It is for this reason that a DFA’s support for anchored and unanchored searches can be configured via Builder::start_kind. By default, a DFA only supports unanchored searches.

§Example

This example shows how to build an DFA directly and use it to execute Automaton::try_find:

use aho_corasick::{
    automaton::Automaton,
    dfa::DFA,
    Input, Match,
};

let patterns = &["b", "abc", "abcd"];
let haystack = "abcd";

let nfa = DFA::new(patterns).unwrap();
assert_eq!(
    Some(Match::must(0, 1..2)),
    nfa.try_find(&Input::new(haystack))?,
);

It is also possible to implement your own version of try_find. See the Automaton documentation for an example.

Fields§

§trans: Vec<StateID>

The DFA transition table. IDs in this table are pre-multiplied. So instead of the IDs being 0, 1, 2, 3, …, they are 0stride, 1stride, 2stride, 3stride, …

§matches: Vec<Vec<PatternID>>

The matches for every match state in this DFA. This is first indexed by state index (so that’s sid >> stride2) and then by order in which the matches are meant to occur.

§matches_memory_usage: usize

The amount of heap memory used, in bytes, by the inner Vecs of ‘matches’.

§pattern_lens: Vec<SmallIndex>

The length of each pattern. This is used to compute the start offset of a match.

§prefilter: Option<Prefilter>

A prefilter for accelerating searches, if one exists.

§match_kind: MatchKind

The match semantics built into this DFA.

§state_len: usize

The total number of states in this DFA.

§alphabet_len: usize

The alphabet size, or total number of equivalence classes, for this DFA. Note that the actual number of transitions in each state is stride=2^stride2, where stride is the smallest power of 2 greater than or equal to alphabet_len. We do things this way so that we can use bitshifting to go from a state ID to an index into ‘matches’.

§stride2: usize

The exponent with a base 2, such that stride=2^stride2. Given a state index ‘i’, its state identifier is ‘i << stride2’. Given a state identifier ‘sid’, its state index is ‘sid >> stride2’.

§byte_classes: ByteClasses

The equivalence classes for this DFA. All transitions are defined on equivalence classes and not on the 256 distinct byte values.

§min_pattern_len: usize

The length of the shortest pattern in this automaton.

§max_pattern_len: usize

The length of the longest pattern in this automaton.

§special: Special

The information required to deduce which states are “special” in this DFA.

Struct DFACopy item path

§Example

Fields§

Implementations§

impl DFA

pub fn new<I, P>(patterns: I) -> Result<DFA, BuildError>where I: IntoIterator<Item = P>, P: AsRef<[u8]>,

pub fn builder() -> Builder

impl DFA

const DEAD: StateID

fn set_matches(&mut self, sid: StateID, pids: impl Iterator<Item = PatternID>)

Trait Implementations§

impl Automaton for DFA

fn start_state(&self, anchored: Anchored) -> Result<StateID, MatchError>

fn next_state(&self, _anchored: Anchored, sid: StateID, byte: u8) -> StateID

fn is_special(&self, sid: StateID) -> bool

fn is_dead(&self, sid: StateID) -> bool

fn is_match(&self, sid: StateID) -> bool

fn is_start(&self, sid: StateID) -> bool

fn match_kind(&self) -> MatchKind

fn patterns_len(&self) -> usize

fn pattern_len(&self, pid: PatternID) -> usize

fn min_pattern_len(&self) -> usize

fn max_pattern_len(&self) -> usize

fn match_len(&self, sid: StateID) -> usize

fn match_pattern(&self, sid: StateID, index: usize) -> PatternID

fn memory_usage(&self) -> usize

fn prefilter(&self) -> Option<&Prefilter>

fn try_find(&self, input: &Input<'_>) -> Result<Option<Match>, MatchError>

fn try_find_overlapping( &self, input: &Input<'_>, state: &mut OverlappingState, ) -> Result<(), MatchError>

fn try_find_iter<'a, 'h>( &'a self, input: Input<'h>, ) -> Result<FindIter<'a, 'h, Self>, MatchError>where Self: Sized,

fn try_find_overlapping_iter<'a, 'h>( &'a self, input: Input<'h>, ) -> Result<FindOverlappingIter<'a, 'h, Self>, MatchError>where Self: Sized,

fn try_replace_all<B>( &self, haystack: &str, replace_with: &[B], ) -> Result<String, MatchError>where Self: Sized, B: AsRef<str>,

fn try_replace_all_bytes<B>( &self, haystack: &[u8], replace_with: &[B], ) -> Result<Vec<u8>, MatchError>where Self: Sized, B: AsRef<[u8]>,

fn try_replace_all_with<F>( &self, haystack: &str, dst: &mut String, replace_with: F, ) -> Result<(), MatchError>where Self: Sized, F: FnMut(&Match, &str, &mut String) -> bool,

fn try_replace_all_with_bytes<F>( &self, haystack: &[u8], dst: &mut Vec<u8>, replace_with: F, ) -> Result<(), MatchError>where Self: Sized, F: FnMut(&Match, &[u8], &mut Vec<u8>) -> bool,

fn try_stream_find_iter<'a, R: Read>( &'a self, rdr: R, ) -> Result<StreamFindIter<'a, Self, R>, MatchError>where Self: Sized,

fn try_stream_replace_all<R, W, B>( &self, rdr: R, wtr: W, replace_with: &[B], ) -> Result<()>where Self: Sized, R: Read, W: Write, B: AsRef<[u8]>,

fn try_stream_replace_all_with<R, W, F>( &self, rdr: R, wtr: W, replace_with: F, ) -> Result<()>where Self: Sized, R: Read, W: Write, F: FnMut(&Match, &[u8], &mut W) -> Result<()>,

impl Clone for DFA

fn clone(&self) -> DFA

fn clone_from(&mut self, source: &Self)

impl Debug for DFA

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Sealed for DFA

Auto Trait Implementations§

impl Freeze for DFA

impl RefUnwindSafe for DFA

impl Send for DFA

impl Sync for DFA

impl Unpin for DFA

impl UnwindSafe for DFA

Blanket Implementations§

impl<T> Any for Twhere T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for Twhere T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for Twhere T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for Twhere T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for Twhere U: From<T>,

fn into(self) -> U

impl<T> ToOwned for Twhere T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for Twhere U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for Twhere U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Struct DFA

pub fn new<I, P>(patterns: I) -> Result<DFA, BuildError>
where I: IntoIterator<Item = P>, P: AsRef<[u8]>,

fn try_find_iter<'a, 'h>( &'a self, input: Input<'h>, ) -> Result<FindIter<'a, 'h, Self>, MatchError>
where Self: Sized,

fn try_find_overlapping_iter<'a, 'h>( &'a self, input: Input<'h>, ) -> Result<FindOverlappingIter<'a, 'h, Self>, MatchError>
where Self: Sized,

fn try_replace_all<B>( &self, haystack: &str, replace_with: &[B], ) -> Result<String, MatchError>
where Self: Sized, B: AsRef<str>,

fn try_replace_all_bytes<B>( &self, haystack: &[u8], replace_with: &[B], ) -> Result<Vec<u8>, MatchError>
where Self: Sized, B: AsRef<[u8]>,

fn try_replace_all_with<F>( &self, haystack: &str, dst: &mut String, replace_with: F, ) -> Result<(), MatchError>
where Self: Sized, F: FnMut(&Match, &str, &mut String) -> bool,

fn try_replace_all_with_bytes<F>( &self, haystack: &[u8], dst: &mut Vec<u8>, replace_with: F, ) -> Result<(), MatchError>
where Self: Sized, F: FnMut(&Match, &[u8], &mut Vec<u8>) -> bool,

fn try_stream_find_iter<'a, R: Read>( &'a self, rdr: R, ) -> Result<StreamFindIter<'a, Self, R>, MatchError>
where Self: Sized,

fn try_stream_replace_all<R, W, B>( &self, rdr: R, wtr: W, replace_with: &[B], ) -> Result<()>
where Self: Sized, R: Read, W: Write, B: AsRef<[u8]>,

fn try_stream_replace_all_with<R, W, F>( &self, rdr: R, wtr: W, replace_with: F, ) -> Result<()>
where Self: Sized, R: Read, W: Write, F: FnMut(&Match, &[u8], &mut W) -> Result<()>,

impl<T> Any for T
where T: 'static + ?Sized,

impl<T> Borrow<T> for T
where T: ?Sized,

impl<T> BorrowMut<T> for T
where T: ?Sized,

impl<T> CloneToUninit for T
where T: Clone,

impl<T, U> Into<U> for T
where U: From<T>,

impl<T> ToOwned for T
where T: Clone,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,