Struct regex_automata::util::search::PatternSet

pub struct PatternSet {
    len: usize,
    which: Box<[bool]>,
}

A set of PatternIDs.

A set of pattern identifiers is useful for recording which patterns have matched a particular haystack. A pattern set only includes pattern identifiers. It does not include offset information.

Example

This shows basic usage of a set.

use regex_automata::{PatternID, PatternSet};

let pid1 = PatternID::must(5);
let pid2 = PatternID::must(8);
// Create a new empty set.
let mut set = PatternSet::new(10);
// Insert pattern IDs.
set.insert(pid1);
set.insert(pid2);
// Test membership.
assert!(set.contains(pid1));
assert!(set.contains(pid2));
// Get all members.
assert_eq!(
    vec![5, 8],
    set.iter().map(|p| p.as_usize()).collect::<Vec<usize>>(),
);
// Clear the set.
set.clear();
// Test that it is indeed empty.
assert!(set.is_empty());

Fields§

§len: usize

The number of patterns set to ‘true’ in this set.

§which: Box<[bool]>

A map from PatternID to a boolean indicating whether that pattern has matched.

This should probably be a bitset, but the difference is unlikely to matter much in practice.

The main downside of this representation (and similarly for a bitset) is that iteration scales with the capacity of the set instead of the length of the set. This doesn’t seem likely to be a problem in practice.
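To make the trade-off concrete, here is a minimal, self-contained sketch of a boolean-slice set. It is not the crate's implementation: `BoolSet`, `try_insert` returning an `Option`, and `members` are hypothetical names chosen for illustration, and plain `usize` values stand in for PatternID. The point it shows is that collecting the members visits every slot, so the work depends on the capacity and not on how many IDs are present.

```rust
/// Hypothetical stand-in for the documented layout: a count of set slots
/// plus one boolean slot per possible ID. Not the crate's actual code.
struct BoolSet {
    len: usize,
    which: Box<[bool]>,
}

impl BoolSet {
    /// Create an empty set able to hold IDs in `0..capacity`.
    fn new(capacity: usize) -> BoolSet {
        BoolSet { len: 0, which: vec![false; capacity].into_boxed_slice() }
    }

    /// Returns `Some(true)` if `id` was newly inserted, `Some(false)` if it
    /// was already present, and `None` if `id` does not fit in the capacity.
    fn try_insert(&mut self, id: usize) -> Option<bool> {
        let slot = self.which.get_mut(id)?;
        if *slot {
            return Some(false);
        }
        *slot = true;
        self.len += 1;
        Some(true)
    }

    /// Membership is a single slot lookup.
    fn contains(&self, id: usize) -> bool {
        self.which.get(id).copied().unwrap_or(false)
    }

    /// Returns the members in ascending order, together with the number of
    /// slots that had to be inspected to find them.
    fn members(&self) -> (Vec<usize>, usize) {
        let mut visited = 0;
        let mut out = Vec::with_capacity(self.len);
        for (id, &present) in self.which.iter().enumerate() {
            visited += 1;
            if present {
                out.push(id);
            }
        }
        (out, visited)
    }
}
```

With a capacity of 10 and only two IDs inserted, `members` still inspects all 10 slots. A bitset would shrink the memory per slot but would keep the same capacity-proportional scan.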

Another alternative is to use a ‘SparseSet’ for this. It uses quite a bit more memory, but that seems acceptable compared to the memory already used by the regex engine. The real hiccup is that it yields pattern IDs in the order they were inserted. That is actually a nice property, but at the time of writing, pattern IDs are yielded in ascending order in the regex crate RegexSet API. If we did change to ‘SparseSet’, we could provide an additional ‘iter_match_order’ iterator and keep the ascending-order one for compatibility.
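The ordering difference described above can be illustrated with a minimal sparse set in the classic dense/sparse array style. This is a sketch under assumptions, not the crate's ‘SparseSet’: `SparseSketch` and `ascending` are hypothetical names, `iter_match_order` borrows the name floated in the note above, and sorting a copy is just one possible way to recover ascending order.

```rust
/// Hypothetical sparse set over IDs in `0..capacity`.
/// Not the crate's actual `SparseSet`.
struct SparseSketch {
    /// Number of IDs currently in the set.
    len: usize,
    /// IDs in insertion order; only `dense[..len]` is meaningful.
    dense: Vec<usize>,
    /// `sparse[id]` is the index in `dense` where `id` was stored.
    sparse: Vec<usize>,
}

impl SparseSketch {
    /// Two `usize` arrays of length `capacity`: noticeably more memory
    /// than one `bool` per slot.
    fn new(capacity: usize) -> SparseSketch {
        SparseSketch { len: 0, dense: vec![0; capacity], sparse: vec![0; capacity] }
    }

    /// An ID is present only if its sparse entry points at a live dense
    /// slot that holds the same ID.
    fn contains(&self, id: usize) -> bool {
        match self.sparse.get(id) {
            Some(&i) => i < self.len && self.dense[i] == id,
            None => false,
        }
    }

    /// Returns true if `id` was newly inserted. Panics if `id` is out of range.
    fn insert(&mut self, id: usize) -> bool {
        assert!(id < self.sparse.len(), "id exceeds capacity");
        if self.contains(id) {
            return false;
        }
        self.dense[self.len] = id;
        self.sparse[id] = self.len;
        self.len += 1;
        true
    }

    /// Clearing is constant time: stale entries are ignored by `contains`.
    fn clear(&mut self) {
        self.len = 0;
    }

    /// Yields IDs in the order they were inserted, touching only `len` slots.
    fn iter_match_order(&self) -> impl Iterator<Item = usize> + '_ {
        self.dense[..self.len].iter().copied()
    }

    /// Ascending order for compatibility, obtained here by sorting a copy.
    fn ascending(&self) -> Vec<usize> {
        let mut ids: Vec<usize> = self.iter_match_order().collect();
        ids.sort_unstable();
        ids
    }
}
```

Inserting 8 and then 5 yields 8, 5 from `iter_match_order` and 5, 8 from `ascending`. Iteration cost here depends on the number of members, which addresses the scaling concern of the boolean-slice layout at the price of extra memory.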

Implementations§

impl PatternSet

pub fn new(capacity: usize) -> PatternSet

pub fn clear(&mut self)

pub fn contains(&self, pid: PatternID) -> bool

pub fn insert(&mut self, pid: PatternID) -> bool

pub fn try_insert( &mut self, pid: PatternID ) -> Result<bool, PatternSetInsertError>

pub fn is_empty(&self) -> bool

pub fn is_full(&self) -> bool

pub fn len(&self) -> usize

pub fn capacity(&self) -> usize

pub fn iter(&self) -> PatternSetIter<'_>

Trait Implementations§

impl Clone for PatternSet

fn clone(&self) -> PatternSet

fn clone_from(&mut self, source: &Self)

impl Debug for PatternSet

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl PartialEq<PatternSet> for PatternSet

fn eq(&self, other: &PatternSet) -> bool

fn ne(&self, other: &Rhs) -> bool

impl Eq for PatternSet

impl StructuralEq for PatternSet

impl StructuralPartialEq for PatternSet

Auto Trait Implementations§

impl RefUnwindSafe for PatternSet

impl Send for PatternSet

impl Sync for PatternSet

impl Unpin for PatternSet

impl UnwindSafe for PatternSet

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>