Struct regex_automata::nfa::thompson::backtrack::Config

source ·
pub struct Config {
    pre: Option<Option<Prefilter>>,
    visited_capacity: Option<usize>,
}
Expand description

The configuration used for building a bounded backtracker.

A bounded backtracker configuration is a simple data object that is typically used with Builder::configure.

Fields§

§pre: Option<Option<Prefilter>>§visited_capacity: Option<usize>

Implementations§

source§

impl Config

source

pub fn new() -> Config

Return a new default regex configuration.

source

pub fn prefilter(self, pre: Option<Prefilter>) -> Config

Set a prefilter to be used whenever a start state is entered.

A Prefilter in this context is meant to accelerate searches by looking for literal prefixes that every match for the corresponding pattern (or patterns) must start with. Once a prefilter produces a match, the underlying search routine continues on to try and confirm the match.

Be warned that setting a prefilter does not guarantee that the search will be faster. While it’s usually a good bet, if the prefilter produces a lot of false positive candidates (i.e., positions matched by the prefilter but not by the regex), then the overall result can be slower than if you had just executed the regex engine without any prefilters.

By default no prefilter is set.

§Example
use regex_automata::{
    nfa::thompson::backtrack::BoundedBacktracker,
    util::prefilter::Prefilter,
    Input, Match, MatchKind,
};

let pre = Prefilter::new(MatchKind::LeftmostFirst, &["foo", "bar"]);
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().prefilter(pre))
    .build(r"(foo|bar)[a-z]+")?;
let mut cache = re.create_cache();
let input = Input::new("foo1 barfox bar");
assert_eq!(
    Some(Match::must(0, 5..11)),
    re.try_find(&mut cache, input)?,
);

Be warned though that an incorrect prefilter can lead to incorrect results!

use regex_automata::{
    nfa::thompson::backtrack::BoundedBacktracker,
    util::prefilter::Prefilter,
    Input, HalfMatch, MatchKind,
};

let pre = Prefilter::new(MatchKind::LeftmostFirst, &["foo", "car"]);
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().prefilter(pre))
    .build(r"(foo|bar)[a-z]+")?;
let mut cache = re.create_cache();
let input = Input::new("foo1 barfox bar");
// No match reported even though there clearly is one!
assert_eq!(None, re.try_find(&mut cache, input)?);
source

pub fn visited_capacity(self, capacity: usize) -> Config

Set the visited capacity used to bound backtracking.

The visited capacity represents the amount of heap memory (in bytes) to allocate toward tracking which parts of the backtracking search have been done before. The heap memory needed for any particular search is proportional to haystack.len() * nfa.states().len(), which an be quite large. Therefore, the bounded backtracker is typically only able to run on shorter haystacks.

For a given regex, increasing the visited capacity means that the maximum haystack length that can be searched is increased. The BoundedBacktracker::max_haystack_len method returns that maximum.

The default capacity is a reasonable but empirically chosen size.

§Example

As with other regex engines, Unicode is what tends to make the bounded backtracker less useful by making the maximum haystack length quite small. If necessary, increasing the visited capacity using this routine will increase the maximum haystack length at the cost of using more memory.

Note though that the specific maximum values here are not an API guarantee. The default visited capacity is subject to change and not covered by semver.

use regex_automata::nfa::thompson::backtrack::BoundedBacktracker;

// Unicode inflates the size of the underlying NFA quite a bit, and
// thus means that the backtracker can only handle smaller haystacks,
// assuming that the visited capacity remains unchanged.
let re = BoundedBacktracker::new(r"\w+")?;
assert!(re.max_haystack_len() <= 7_000);
// But we can increase the visited capacity to handle bigger haystacks!
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().visited_capacity(1<<20))
    .build(r"\w+")?;
assert!(re.max_haystack_len() >= 25_000);
assert!(re.max_haystack_len() <= 28_000);
source

pub fn get_prefilter(&self) -> Option<&Prefilter>

Returns the prefilter set in this configuration, if one at all.

source

pub fn get_visited_capacity(&self) -> usize

Returns the configured visited capacity.

Note that the actual capacity used may be slightly bigger than the configured capacity.

source

pub(crate) fn overwrite(&self, o: Config) -> Config

Overwrite the default configuration such that the options in o are always used. If an option in o is not set, then the corresponding option in self is used. If it’s not set in self either, then it remains not set.

Trait Implementations§

source§

impl Clone for Config

source§

fn clone(&self) -> Config

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl Debug for Config

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
source§

impl Default for Config

source§

fn default() -> Config

Returns the “default value” for a type. Read more

Auto Trait Implementations§

§

impl Freeze for Config

§

impl RefUnwindSafe for Config

§

impl Send for Config

§

impl Sync for Config

§

impl Unpin for Config

§

impl UnwindSafe for Config

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> CloneToUninit for T
where T: Clone,

source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> ToOwned for T
where T: Clone,

source§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

source§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.