Struct aho_corasick::packed::pattern::Patterns
pub(crate) struct Patterns {
    kind: MatchKind,
    by_id: Vec<Vec<u8>>,
    order: Vec<PatternID>,
    minimum_len: usize,
    total_pattern_bytes: usize,
}
A non-empty collection of non-empty patterns to search for.
This collection of patterns is what is passed around to both execute searches and to construct the searchers themselves. Namely, this permits searches to avoid copying all of the patterns, and allows us to keep only one copy throughout all packed searchers.
Note that this collection is not a set. The same pattern can appear more than once.
Fields§
kind: MatchKind
The match semantics supported by this collection of patterns.
The match semantics determine the order of the iterator over patterns. For leftmost-first, patterns are yielded in the same order in which they were provided by the caller. For leftmost-longest, patterns are yielded in descending order of length, with ties broken by the order in which they were provided by the caller.
by_id: Vec<Vec<u8>>
The collection of patterns, indexed by their identifier.
order: Vec<PatternID>
The order of patterns defined for iteration, given by pattern identifiers. The order of by_id and order is always the same for leftmost-first semantics, but may be different for leftmost-longest semantics.
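To illustrate how order can diverge from by_id, the following sketch computes an iteration order for each kind of match semantics. It is only a sketch under stated assumptions, not the crate's implementation: plain usize stands in for PatternID, and the MatchKind enum and compute_order function are hypothetical names defined for this example.

```rust
/// Hypothetical stand-in for the crate's match semantics.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MatchKind {
    LeftmostFirst,
    LeftmostLongest,
}

/// Compute the iteration order over pattern identifiers. Plain `usize`
/// is used here as an assumed stand-in for `PatternID`.
fn compute_order(kind: MatchKind, by_id: &[Vec<u8>]) -> Vec<usize> {
    // Leftmost-first: identifiers in the order the caller gave them.
    let mut order: Vec<usize> = (0..by_id.len()).collect();
    if kind == MatchKind::LeftmostLongest {
        // Descending length. `sort_by_key` is a stable sort, so patterns
        // of equal length keep the order in which they were provided.
        order.sort_by_key(|&id| std::cmp::Reverse(by_id[id].len()));
    }
    order
}
```

For the patterns "ab", "abcd", "xy" and "abc" (IDs 0 through 3), leftmost-first keeps the order 0, 1, 2, 3, while leftmost-longest yields 1, 3, 0, 2: the two length-2 patterns stay in caller order at the end.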
minimum_len: usize
The length of the smallest pattern, in bytes.
total_pattern_bytes: usize
The total number of pattern bytes across the entire collection. This is used for reporting total heap usage in constant time.
Implementations§
impl Patterns
pub(crate) fn new() -> Patterns
Create a new, initially empty collection of patterns. The ID of each pattern is the index at which it occurs in by_id.
This constructor takes no patterns or match semantics, so nothing is validated here. Patterns are supplied through add, which panics if the pattern given is empty, and the match semantics default to leftmost-first unless changed with set_match_kind.
pub(crate) fn add(&mut self, bytes: &[u8])
Add a pattern to this collection.
This panics if the pattern given is empty.
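The bookkeeping that add implies (assigning the next ID, tracking the smallest length and the running byte total) can be sketched as follows. PatternsSketch is a hypothetical, simplified model written for this example, with usize standing in for PatternID and usize::MAX assumed as the "no patterns yet" value of minimum_len. It should not be read as the crate's actual code.

```rust
/// Simplified, hypothetical model of the collection (not the crate's type).
struct PatternsSketch {
    by_id: Vec<Vec<u8>>,
    order: Vec<usize>,
    minimum_len: usize,
    total_pattern_bytes: usize,
}

impl PatternsSketch {
    fn new() -> PatternsSketch {
        PatternsSketch {
            by_id: Vec::new(),
            order: Vec::new(),
            // Assumed sentinel: no pattern has been added yet.
            minimum_len: usize::MAX,
            total_pattern_bytes: 0,
        }
    }

    /// Add a pattern. Panics if `bytes` is empty.
    fn add(&mut self, bytes: &[u8]) {
        assert!(!bytes.is_empty(), "patterns must be non-empty");
        // The new pattern's ID is its index in `by_id`.
        let id = self.by_id.len();
        self.order.push(id);
        self.by_id.push(bytes.to_vec());
        // Keep the cached statistics in sync with the stored patterns.
        self.minimum_len = self.minimum_len.min(bytes.len());
        self.total_pattern_bytes += bytes.len();
    }
}
```

Because the collection is not a set, adding "foo" twice in this model stores it twice and assigns it two distinct IDs.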
pub(crate) fn set_match_kind(&mut self, kind: MatchKind)
Set the match kind semantics for this collection of patterns.
If the kind is not set, then the default is leftmost-first.
pub(crate) fn len(&self) -> usize
Return the number of patterns in this collection.
This is guaranteed to be greater than zero.
pub(crate) fn is_empty(&self) -> bool
Returns true if and only if this collection of patterns is empty.
pub(crate) fn memory_usage(&self) -> usize
Returns the approximate total amount of heap memory used by these patterns, in bytes.
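One plausible way to produce such an estimate in constant time is to combine the lengths of the two vectors with the cached total_pattern_bytes, rather than walking every pattern. The sketch below assumes exactly that. The approximate_heap_usage function and the u32 alias for PatternID are hypothetical choices for illustration, and the crate's real formula may differ.

```rust
/// Hypothetical stand-in for the crate's pattern identifier type.
type PatternID = u32;

/// Estimate heap usage without iterating over the patterns themselves.
fn approximate_heap_usage(
    by_id: &[Vec<u8>],
    order: &[PatternID],
    total_pattern_bytes: usize,
) -> usize {
    // Heap held by `order`: one identifier per pattern.
    order.len() * std::mem::size_of::<PatternID>()
        // Heap held by the outer `by_id` vector: one `Vec<u8>` header per pattern.
        + by_id.len() * std::mem::size_of::<Vec<u8>>()
        // Heap held by the pattern bytes, tracked incrementally elsewhere.
        + total_pattern_bytes
}
```

The estimate counts lengths, not capacities, which is one reason it is only approximate.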
pub(crate) fn reset(&mut self)
Clears all heap memory associated with this collection of patterns and resets all state such that it is a valid empty collection.
pub(crate) fn minimum_len(&self) -> usize
Returns the length, in bytes, of the smallest pattern.
This is guaranteed to be at least one.
pub(crate) fn match_kind(&self) -> &MatchKind
Returns the match semantics used by these patterns.
pub(crate) fn get(&self, id: PatternID) -> Pattern<'_>
Return the pattern with the given identifier. If such a pattern does not exist, then this panics.
pub(crate) unsafe fn get_unchecked(&self, id: PatternID) -> Pattern<'_>
Return the pattern with the given identifier without performing bounds checks.
§Safety
Callers must ensure that a pattern with the given identifier exists before using this method.
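The difference between the checked and unchecked lookups, and one way a caller might uphold the safety contract, can be sketched on a plain slice of patterns. The functions below are hypothetical helpers for this example (usize stands in for PatternID and &[u8] for Pattern). They rely on the standard library's slice indexing and slice::get_unchecked, not on the crate's internals.

```rust
/// Checked lookup: panics if `id` is out of bounds.
fn get_checked(by_id: &[Vec<u8>], id: usize) -> &[u8] {
    by_id[id].as_slice()
}

/// Unchecked lookup.
///
/// # Safety
///
/// `id` must be less than `by_id.len()`.
unsafe fn get_unchecked(by_id: &[Vec<u8>], id: usize) -> &[u8] {
    // SAFETY: the caller guarantees that `id` is in bounds.
    unsafe { by_id.get_unchecked(id).as_slice() }
}

/// Safe wrapper showing one way to establish the precondition first.
fn get_if_exists(by_id: &[Vec<u8>], id: usize) -> Option<&[u8]> {
    if id < by_id.len() {
        // SAFETY: `id` was just checked against the length.
        Some(unsafe { get_unchecked(by_id, id) })
    } else {
        None
    }
}
```

In a searcher, identifiers typically come from the collection itself (for example from its iteration order), which is what would make skipping the bounds check defensible in a hot loop.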
pub(crate) fn iter(&self) -> PatternIter<'_>
Return an iterator over all the patterns in this collection, in the order in which they should be matched.
Specifically, in a naive multi-pattern matcher, the following is guaranteed to satisfy the match semantics of this collection of patterns:
    for i in 0..haystack.len():
        for p in patterns.iter():
            if haystack[i..].starts_with(p.bytes()):
                return Match(p.id(), i, i + p.bytes().len())
That is, if the patterns in a collection are tried in the order provided by this iterator, then the result is guaranteed to satisfy the collection's match semantics (either leftmost-first or leftmost-longest).
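The naive matcher described above can be written as a small runnable function. This is a sketch under assumptions: naive_find is a hypothetical name, the iteration order is passed in explicitly as (id, bytes) pairs instead of coming from a real PatternIter, and the match is returned as a plain (id, start, end) tuple.

```rust
/// Naive multi-pattern search. `ordered` lists `(id, bytes)` pairs in the
/// order in which the patterns should be tried at each position.
/// Returns `(pattern id, start, end)` for the first match found.
fn naive_find(
    ordered: &[(usize, &[u8])],
    haystack: &[u8],
) -> Option<(usize, usize, usize)> {
    // Try every starting position from left to right...
    for i in 0..haystack.len() {
        // ...and at each position, try patterns in iteration order.
        for &(id, bytes) in ordered {
            if haystack[i..].starts_with(bytes) {
                return Some((id, i, i + bytes.len()));
            }
        }
    }
    None
}
```

With the patterns "ab" (ID 0) and "abcd" (ID 1) and the haystack "xabcd", trying them in the order 0, 1 reports ID 0 at 1..3 (leftmost-first), while trying them in the order 1, 0 reports ID 1 at 1..5 (leftmost-longest). The search loop is identical in both cases; only the iteration order changes.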