Struct regex_automata::nfa::thompson::pikevm::PikeVM

pub struct PikeVM {
    config: Config,
    nfa: NFA,
}

A virtual machine for executing regex searches with capturing groups.

§Infallible APIs

Unlike most other regex engines in this crate, a PikeVM never returns an error at search time. It supports all Anchored configurations, never quits and works on haystacks of arbitrary length.

There are two caveats to mention though:

If an invalid pattern ID is given to a search via Anchored::Pattern, then the PikeVM will report “no match.” This is consistent with all other regex engines in this crate.
When PikeVM::which_overlapping_matches is given a PatternSet with insufficient capacity to store all valid pattern IDs, a match for a PatternID that cannot be inserted is silently dropped, as if it did not match.

§Advice

The PikeVM is generally the most “powerful” regex engine in this crate. “Powerful” in this context means that it can handle any regular expression that is parseable by regex-syntax and a haystack of any size. Regrettably, the PikeVM is also often the slowest regex engine in practice. This results in an annoying situation where one generally tries to pick any other regex engine (or perhaps none at all) before being forced to fall back to a PikeVM.

For example, a common strategy for dealing with capturing groups is to actually look for the overall match of the regex using a faster regex engine, like a lazy DFA. Once the overall match is found, one can then run the PikeVM on just the match span to find the spans of the capturing groups. In this way, the faster regex engine does the majority of the work, while the PikeVM only lends its power in a more limited role.

Unfortunately, this isn’t always possible because the faster regex engines don’t support all of the regex features in regex-syntax. This notably includes (and is currently limited to) Unicode word boundaries. So if your pattern has Unicode word boundaries, you typically can’t use a DFA-based regex engine at all (unless you enable heuristic support for it). (The one-pass DFA can handle Unicode word boundaries for anchored searches only, but in a cruel sort of joke, many Unicode features tend to result in making the regex not one-pass.)

§Example

This example shows that the PikeVM implements Unicode word boundaries correctly by default.

use regex_automata::{nfa::thompson::pikevm::PikeVM, Match};

let re = PikeVM::new(r"\b\w+\b")?;
let mut cache = re.create_cache();

let mut it = re.find_iter(&mut cache, "Шерлок Холмс");
assert_eq!(Some(Match::must(0, 0..12)), it.next());
assert_eq!(Some(Match::must(0, 13..23)), it.next());
assert_eq!(None, it.next());

Fields§

§config: Config

§nfa: NFA

Implementations§

impl PikeVM

pub fn new(pattern: &str) -> Result<PikeVM, BuildError>

§Example

pub fn new_many<P: AsRef<str>>(patterns: &[P]) -> Result<PikeVM, BuildError>

§Example

pub fn new_from_nfa(nfa: NFA) -> Result<PikeVM, BuildError>

§Example

pub fn always_match() -> Result<PikeVM, BuildError>

§Example

pub fn never_match() -> Result<PikeVM, BuildError>

§Example

pub fn config() -> Config

§Example

pub fn builder() -> Builder

§Example

pub fn create_captures(&self) -> Captures

pub fn create_cache(&self) -> Cache

pub fn reset_cache(&self, cache: &mut Cache)

§Example

pub fn pattern_len(&self) -> usize

§Example

pub fn get_config(&self) -> &Config

pub fn get_nfa(&self) -> &NFA

impl PikeVM

pub fn is_match<'h, I: Into<Input<'h>>>( &self, cache: &mut Cache, input: I, ) -> bool

§Example

§Example: consistency with search APIs

pub fn find<'h, I: Into<Input<'h>>>( &self, cache: &mut Cache, input: I, ) -> Option<Match>

§Example

pub fn captures<'h, I: Into<Input<'h>>>( &self, cache: &mut Cache, input: I, caps: &mut Captures, )

§Example

pub fn find_iter<'r, 'c, 'h, I: Into<Input<'h>>>( &'r self, cache: &'c mut Cache, input: I, ) -> FindMatches<'r, 'c, 'h>

§Example

pub fn captures_iter<'r, 'c, 'h, I: Into<Input<'h>>>( &'r self, cache: &'c mut Cache, input: I, ) -> CapturesMatches<'r, 'c, 'h>

§Example

impl PikeVM

pub fn search(&self, cache: &mut Cache, input: &Input<'_>, caps: &mut Captures)

§Example: specific pattern search

§Example: specifying the bounds of a search

pub fn search_slots( &self, cache: &mut Cache, input: &Input<'_>, slots: &mut [Option<NonMaxUsize>], ) -> Option<PatternID>

§Example

fn search_slots_imp( &self, cache: &mut Cache, input: &Input<'_>, slots: &mut [Option<NonMaxUsize>], ) -> Option<HalfMatch>

pub fn which_overlapping_matches( &self, cache: &mut Cache, input: &Input<'_>, patset: &mut PatternSet, )

§Example

impl PikeVM

fn search_imp( &self, cache: &mut Cache, input: &Input<'_>, slots: &mut [Option<NonMaxUsize>], ) -> Option<HalfMatch>

fn which_overlapping_imp( &self, cache: &mut Cache, input: &Input<'_>, patset: &mut PatternSet, )

fn nexts( &self, stack: &mut Vec<FollowEpsilon>, curr: &mut ActiveStates, next: &mut ActiveStates, input: &Input<'_>, at: usize, slots: &mut [Option<NonMaxUsize>], ) -> Option<PatternID>

fn nexts_overlapping( &self, stack: &mut Vec<FollowEpsilon>, curr: &mut ActiveStates, next: &mut ActiveStates, input: &Input<'_>, at: usize, patset: &mut PatternSet, )

fn next( &self, stack: &mut Vec<FollowEpsilon>, curr_slot_table: &mut SlotTable, next: &mut ActiveStates, input: &Input<'_>, at: usize, sid: StateID, ) -> Option<PatternID>

fn epsilon_closure( &self, stack: &mut Vec<FollowEpsilon>, curr_slots: &mut [Option<NonMaxUsize>], next: &mut ActiveStates, input: &Input<'_>, at: usize, sid: StateID, )

fn epsilon_closure_explore( &self, stack: &mut Vec<FollowEpsilon>, curr_slots: &mut [Option<NonMaxUsize>], next: &mut ActiveStates, input: &Input<'_>, at: usize, sid: StateID, )

fn start_config(&self, input: &Input<'_>) -> Option<(bool, StateID)>

Trait Implementations§

impl Clone for PikeVM

fn clone(&self) -> PikeVM

fn clone_from(&mut self, source: &Self)

impl Debug for PikeVM

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Auto Trait Implementations§

impl Freeze for PikeVM

impl RefUnwindSafe for PikeVM

impl Send for PikeVM

impl Sync for PikeVM

impl Unpin for PikeVM

impl UnwindSafe for PikeVM

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T
where T: Clone,

impl<T, U> Into<U> for T
where U: From<T>,

impl<T> ToOwned for T
where T: Clone,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,