Struct NFA

pub struct NFA {
    repr: Vec<u32>,
    pattern_lens: Vec<SmallIndex>,
    state_len: usize,
    prefilter: Option<Prefilter>,
    match_kind: MatchKind,
    alphabet_len: usize,
    byte_classes: ByteClasses,
    min_pattern_len: usize,
    max_pattern_len: usize,
    special: Special,
}

A contiguous NFA implementation of Aho-Corasick.

When possible, prefer using AhoCorasick instead of this type directly. Using an NFA directly is typically only necessary when one needs access to the Automaton trait implementation.

This NFA can only be built by first constructing a noncontiguous::NFA. Both NFA::new and Builder::build do this for you automatically, but Builder::build_from_noncontiguous permits doing it explicitly.

The main difference between a noncontiguous NFA and a contiguous NFA is that the latter represents all of its states and transitions in a single allocation, whereas the former uses a separate allocation for each state. Doing this at construction time while keeping a low memory footprint isn’t feasible, which is primarily why there are two different NFA types: one that does the least amount of work possible to build itself, and another that does a little extra work to compact itself and make state transitions faster by making some states use a dense representation.

Because a contiguous NFA uses a single allocation, there is a lot more opportunity for compression tricks to reduce the heap memory used. Indeed, it is not uncommon for a contiguous NFA to use an order of magnitude less heap memory than a noncontiguous NFA. Since building a contiguous NFA usually only takes a fraction of the time it takes to build a noncontiguous NFA, the overall build time is not much slower. Thus, in most cases, a contiguous NFA is the best choice.

Since a contiguous NFA uses various tricks for compression and to achieve faster state transitions, currently, its limit on the number of states is somewhat smaller than what a noncontiguous NFA can achieve. Generally speaking, you shouldn’t expect to run into this limit if the number of patterns is under 1 million. It is plausible that this limit will be increased in the future. If the limit is reached, building a contiguous NFA will return an error. Often, since building a contiguous NFA is relatively cheap, it can make sense to always try it even if you aren’t sure whether it will succeed. If it fails, you can always fall back to a noncontiguous NFA. (Indeed, the main AhoCorasick type employs a strategy similar to this at construction time.)

§Example

This example shows how to build an NFA directly and use it to execute Automaton::try_find:

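use aho_corasick::{
    automaton::Automaton,
    nfa::contiguous::NFA,
    Input, Match,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let patterns = &["b", "abc", "abcd"];
    let haystack = "abcd";

    // Build the contiguous NFA with the default (standard) match semantics.
    let nfa = NFA::new(patterns)?;
    // Under standard semantics, "b" is reported because its match is seen first.
    assert_eq!(
        Some(Match::must(0, 1..2)),
        nfa.try_find(&Input::new(haystack))?,
    );
    Ok(())
}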

It is also possible to implement your own version of try_find. See the Automaton documentation for an example.

Fields§

§repr: Vec<u32>

The raw NFA representation. Each state is packed with a header (containing the format of the state, the failure transition and, for a sparse state, the number of transitions), its transitions and any matching pattern IDs for match states.

§pattern_lens: Vec<SmallIndex>

The length of each pattern. This is used to compute the start offset of a match.

§state_len: usize

The total number of states in this NFA.

§prefilter: Option<Prefilter>

A prefilter for accelerating searches, if one exists.

§match_kind: MatchKind

The match semantics built into this NFA.

§alphabet_len: usize

The alphabet size, or total number of equivalence classes, for this NFA. Dense states always have this many transitions.

§byte_classes: ByteClasses

The equivalence classes for this NFA. All transitions, dense and sparse, are defined on equivalence classes and not on the 256 distinct byte values.

§min_pattern_len: usize

The length of the shortest pattern in this automaton.

§max_pattern_len: usize

The length of the longest pattern in this automaton.

§special: Special

The information required to deduce which states are “special” in this NFA.

Implementations§

impl NFA

pub fn new<I, P>(patterns: I) -> Result<NFA, BuildError>
where I: IntoIterator<Item = P>, P: AsRef<[u8]>,

pub fn builder() -> Builder

impl NFA

const DEAD: StateID

const FAIL: StateID

Trait Implementations§

impl Automaton for NFA

fn start_state(&self, anchored: Anchored) -> Result<StateID, MatchError>

fn next_state(&self, anchored: Anchored, sid: StateID, byte: u8) -> StateID

fn is_special(&self, sid: StateID) -> bool

fn is_dead(&self, sid: StateID) -> bool

fn is_match(&self, sid: StateID) -> bool

fn is_start(&self, sid: StateID) -> bool

fn match_kind(&self) -> MatchKind

fn patterns_len(&self) -> usize

fn pattern_len(&self, pid: PatternID) -> usize

fn min_pattern_len(&self) -> usize

fn max_pattern_len(&self) -> usize

fn match_len(&self, sid: StateID) -> usize

fn match_pattern(&self, sid: StateID, index: usize) -> PatternID

fn memory_usage(&self) -> usize

fn prefilter(&self) -> Option<&Prefilter>

fn try_find(&self, input: &Input<'_>) -> Result<Option<Match>, MatchError>

fn try_find_overlapping(&self, input: &Input<'_>, state: &mut OverlappingState) -> Result<(), MatchError>

fn try_find_iter<'a, 'h>(&'a self, input: Input<'h>) -> Result<FindIter<'a, 'h, Self>, MatchError>
where Self: Sized,

fn try_find_overlapping_iter<'a, 'h>(&'a self, input: Input<'h>) -> Result<FindOverlappingIter<'a, 'h, Self>, MatchError>
where Self: Sized,

fn try_replace_all<B>(&self, haystack: &str, replace_with: &[B]) -> Result<String, MatchError>
where Self: Sized, B: AsRef<str>,

fn try_replace_all_bytes<B>(&self, haystack: &[u8], replace_with: &[B]) -> Result<Vec<u8>, MatchError>
where Self: Sized, B: AsRef<[u8]>,

fn try_replace_all_with<F>(&self, haystack: &str, dst: &mut String, replace_with: F) -> Result<(), MatchError>
where Self: Sized, F: FnMut(&Match, &str, &mut String) -> bool,

fn try_replace_all_with_bytes<F>(&self, haystack: &[u8], dst: &mut Vec<u8>, replace_with: F) -> Result<(), MatchError>
where Self: Sized, F: FnMut(&Match, &[u8], &mut Vec<u8>) -> bool,

fn try_stream_find_iter<'a, R: Read>(&'a self, rdr: R) -> Result<StreamFindIter<'a, Self, R>, MatchError>
where Self: Sized,

fn try_stream_replace_all<R, W, B>(&self, rdr: R, wtr: W, replace_with: &[B]) -> Result<()>
where Self: Sized, R: Read, W: Write, B: AsRef<[u8]>,

fn try_stream_replace_all_with<R, W, F>(&self, rdr: R, wtr: W, replace_with: F) -> Result<()>
where Self: Sized, R: Read, W: Write, F: FnMut(&Match, &[u8], &mut W) -> Result<()>,

impl Clone for NFA

fn clone(&self) -> NFA

fn clone_from(&mut self, source: &Self)

impl Debug for NFA

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Sealed for NFA

Auto Trait Implementations§

impl Freeze for NFA

impl RefUnwindSafe for NFA

impl Send for NFA

impl Sync for NFA

impl Unpin for NFA

impl UnwindSafe for NFA

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T
where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
