regex_automata::hybrid::dfa

Struct Lazy

struct Lazy<'i, 'c> {
    dfa: &'i DFA,
    cache: &'c mut Cache,
}

A type that groups methods that require the base NFA/DFA and writable access to the cache.

Fields

dfa: &'i DFA
cache: &'c mut Cache

Implementations

impl<'i, 'c> Lazy<'i, 'c>

fn new(dfa: &'i DFA, cache: &'c mut Cache) -> Lazy<'i, 'c>

Creates a new ‘Lazy’ wrapper for a DFA and its corresponding cache.

fn as_ref<'a>(&'a self) -> LazyRef<'i, 'a>

Return an immutable view by downgrading a writable cache to a read-only cache.
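
A minimal sketch of this kind of downgrade, using hypothetical stand-in types (`ToyDfa`, `ToyCache`, `ToyLazy`, `ToyLazyRef`) rather than the crate's real definitions. It is only meant to show the lifetime shape: the read-only view keeps the DFA lifetime `'i`, but borrows the cache for the shorter lifetime `'a` of `&self`.

```rust
// Hypothetical stand-ins for the real DFA and cache types.
struct ToyDfa {
    name: &'static str,
}

struct ToyCache {
    states: Vec<u32>,
}

// Writable view: shared DFA, exclusive cache.
struct ToyLazy<'i, 'c> {
    dfa: &'i ToyDfa,
    cache: &'c mut ToyCache,
}

// Read-only view: shared DFA, shared cache.
struct ToyLazyRef<'i, 'c> {
    dfa: &'i ToyDfa,
    cache: &'c ToyCache,
}

impl<'i, 'c> ToyLazy<'i, 'c> {
    fn new(dfa: &'i ToyDfa, cache: &'c mut ToyCache) -> ToyLazy<'i, 'c> {
        ToyLazy { dfa, cache }
    }

    // Reborrow the exclusive cache reference as a shared one for as long
    // as `self` is borrowed. The DFA reference is `Copy`, so it keeps `'i`.
    fn as_ref<'a>(&'a self) -> ToyLazyRef<'i, 'a> {
        ToyLazyRef { dfa: self.dfa, cache: &*self.cache }
    }
}

impl<'i, 'c> ToyLazyRef<'i, 'c> {
    fn state_count(&self) -> usize {
        self.cache.states.len()
    }

    fn dfa_name(&self) -> &'i str {
        self.dfa.name
    }
}
```

Once the read-only view goes out of scope, the writable view can mutate the cache again.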

fn cache_next_state( &mut self, current: LazyStateID, unit: Unit, ) -> Result<LazyStateID, CacheError>

This is marked as ‘inline(never)’ to avoid bloating methods on ‘DFA’ like ‘next_state’ and ‘next_eoi_state’ that are called in critical areas. The idea is to let the optimizer focus on the other areas of those methods as the hot path.

Here’s an example that justifies ‘inline(never)’:

regex-cli find match hybrid \
  --cache-capacity 100000000 \
  -p '\pL{100}' \
  all-codepoints-utf8-100x

Where ‘all-codepoints-utf8-100x’ is the UTF-8 encoding of every codepoint, in sequence, repeated 100 times.

With ‘inline(never)’, hyperfine reports 1.1s per run. With ‘inline(always)’, it reports 1.23s. So ‘inline(never)’ is roughly 10% faster here.
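
The general pattern can be sketched as follows. This is a toy illustration, not the crate's code: `ToyCache`, `next` and `compute_and_cache` are hypothetical names, and the "slow" work is a placeholder. The hot path is a single table lookup; the rarely taken path that fills in a missing entry is kept out of line.

```rust
// Hypothetical toy transition cache: `None` means "not computed yet".
struct ToyCache {
    table: Vec<Option<u32>>,
    slow_calls: usize,
}

impl ToyCache {
    fn new(len: usize) -> ToyCache {
        ToyCache { table: vec![None; len], slow_calls: 0 }
    }

    // Hot path: a single lookup that is cheap to inline into callers.
    #[inline]
    fn next(&mut self, index: usize) -> u32 {
        match self.table[index] {
            Some(id) => id,
            None => self.compute_and_cache(index),
        }
    }

    // Slow path: kept out of line so its body is not copied into every
    // caller of `next`.
    #[inline(never)]
    fn compute_and_cache(&mut self, index: usize) -> u32 {
        self.slow_calls += 1;
        let id = (index as u32) * 2; // placeholder for the real work
        self.table[index] = Some(id);
        id
    }
}
```

Whether this helps in a given program is an empirical question; the measurement quoted above is for that specific benchmark only.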

fn cache_start_group( &mut self, anchored: Anchored, start: Start, ) -> Result<LazyStateID, StartError>

Compute and cache the starting state for the given anchored mode (which may carry a pattern ID) and the starting configuration.

This panics if a pattern ID is given and the DFA isn’t configured to build anchored start states for each pattern.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.

fn cache_start_one( &mut self, nfa_start_id: NFAStateID, start: Start, ) -> Result<LazyStateID, CacheError>

Compute and cache the starting state for the given NFA state ID and the starting configuration. The NFA state ID might be one of the following:

An unanchored start state to match any pattern.
An anchored start state to match any pattern.
An anchored start state for a particular pattern.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.

fn add_builder_state( &mut self, builder: StateBuilderNFA, idmap: impl Fn(LazyStateID) -> LazyStateID, ) -> Result<LazyStateID, CacheError>

Either add the given builder state to this cache, or return an ID to an equivalent state already in this cache.

In the case where no equivalent state exists, the idmap function given may be used to transform the identifier allocated. This is useful if the caller needs to tag the ID with additional information.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
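
The "add or return an equivalent state" behavior, together with the idmap hook, can be sketched with a toy cache. Everything here is an assumption made for illustration: `ToyCache`, `add_or_get`, the byte-slice state representation and the `MATCH_TAG` bit are hypothetical, and the real ID encoding is not described in this section.

```rust
use std::collections::HashMap;

// Hypothetical tag bit used only in this sketch.
const MATCH_TAG: u32 = 1 << 31;

struct ToyCache {
    // Maps a state's byte representation to the ID it was given.
    ids: HashMap<Vec<u8>, u32>,
    // States in allocation order.
    states: Vec<Vec<u8>>,
}

impl ToyCache {
    fn new() -> ToyCache {
        ToyCache { ids: HashMap::new(), states: Vec::new() }
    }

    // Return the ID of an equivalent cached state, or allocate a new ID,
    // pass it through `idmap`, and remember the state under the mapped ID.
    fn add_or_get(&mut self, state: &[u8], idmap: impl Fn(u32) -> u32) -> u32 {
        if let Some(&id) = self.ids.get(state) {
            // An equivalent state exists, so `idmap` is not consulted.
            return id;
        }
        let id = idmap(self.states.len() as u32);
        self.states.push(state.to_vec());
        self.ids.insert(state.to_vec(), id);
        id
    }
}
```

A caller that wants to mark a state would pass something like `|id| id | MATCH_TAG`; a caller with nothing to add passes the identity function.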

fn add_state( &mut self, state: State, idmap: impl Fn(LazyStateID) -> LazyStateID, ) -> Result<LazyStateID, CacheError>

Allocate a new state ID and add the given state to this cache.

The idmap function given may be used to transform the identifier allocated. This is useful if the caller needs to tag the ID with additional information.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.

fn next_state_id(&mut self) -> Result<LazyStateID, CacheError>

Allocate a new state ID.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.

fn try_clear_cache(&mut self) -> Result<(), CacheError>

Attempt to clear the cache used by this lazy DFA.

If clearing the cache exceeds the minimum number of required cache clearings, then this will return a cache error. In this case, callers should bubble this up as the cache can’t be used until it is reset. Implementations of search should convert this error into a MatchError::gave_up.

If ‘self.state_saver’ is set to save a state, then this state is persisted through cache clearing. Otherwise, the cache is returned to its state after initialization with two exceptions: its clear count is incremented and some of its memory likely has additional capacity. That is, clearing a cache does not release memory.

In either case, any lazy state ID generated by the cache prior to clearing it is invalid afterwards.

fn reset_cache(&mut self)

Clears and resets the cache. Resetting the cache means that no states are persisted and the clear count is reset to 0. No heap memory is released.

Note that the caller may reset a cache with a different DFA than what it was created from. In which case, the cache can now be used with the new DFA (and not the old DFA).
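
The difference between clearing and resetting can be sketched with a toy cache. This is a simplification under stated assumptions: `ToyCache`, `ToyCacheError` and the `max_clears` field are hypothetical, and the rule used here (fail once the clear count reaches a fixed limit) stands in for whatever criteria the real cache applies.

```rust
#[derive(Debug, PartialEq)]
struct ToyCacheError;

struct ToyCache {
    states: Vec<u32>,
    clear_count: usize,
    // Hypothetical limit: `None` means clearing is always allowed.
    max_clears: Option<usize>,
}

impl ToyCache {
    // Clear the cache unless it has already been cleared too many times.
    fn try_clear(&mut self) -> Result<(), ToyCacheError> {
        if let Some(max) = self.max_clears {
            if self.clear_count >= max {
                return Err(ToyCacheError);
            }
        }
        self.clear();
        Ok(())
    }

    // Drop all states but keep the allocation, and count the clearing.
    fn clear(&mut self) {
        self.states.clear();
        self.clear_count += 1;
    }

    // Like `clear`, but also forget how many clearings have happened.
    fn reset(&mut self) {
        self.states.clear();
        self.clear_count = 0;
    }
}
```

In this sketch, a cache that has returned an error from `try_clear` becomes usable again only after `reset`, which mirrors the guidance above about bubbling the error up to the caller.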

fn clear_cache(&mut self)

Clear the cache used by this lazy DFA.

If ‘self.state_saver’ is set to save a state, then this state is persisted through cache clearing. Otherwise, the cache is returned to its state after initialization with two exceptions: its clear count is incremented and some of its memory likely has additional capacity. That is, clearing a cache does not release memory.

In either case, any lazy state ID generated by the cache prior to clearing it is invalid afterwards.

fn init_cache(&mut self)

Initialize this cache from emptiness to a place where it can be used for search.

This is called both at cache creation time and after the cache has been cleared.

Primarily, this adds the three sentinel states and allocates some initial memory.

fn save_state(&mut self, id: LazyStateID)

Save the state corresponding to the ID given such that the state persists through a cache clearing.

While the state may persist, the ID may not. In order to discover the new state ID, one must call ‘saved_state_id’ after a cache clearing.

fn saved_state_id(&mut self) -> LazyStateID

Returns the updated lazy state ID for a state that was persisted through a cache clearing.

It is only correct to call this routine when both a state has been saved and the cache has just been cleared. Otherwise, this panics.
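
The save, clear, then look up sequence can be sketched as below. The types are hypothetical (`ToySaver`, `ToyCache`), and the three-phase enum is just one plausible way to model the behavior described here, not necessarily how the crate does it.

```rust
// Hypothetical saver with three phases.
#[derive(Debug, PartialEq)]
enum ToySaver {
    // Nothing is being saved.
    None,
    // Set before clearing: a copy of the state to carry over.
    ToSave(Vec<u8>),
    // Set by clearing: the ID the state was re-added under.
    Saved(u32),
}

struct ToyCache {
    states: Vec<Vec<u8>>,
    saver: ToySaver,
}

impl ToyCache {
    fn save_state(&mut self, id: u32) {
        let state = self.states[id as usize].clone();
        self.saver = ToySaver::ToSave(state);
    }

    fn clear(&mut self) {
        self.states.clear();
        // Re-add the saved state, if any. It generally gets a new ID.
        if let ToySaver::ToSave(state) =
            std::mem::replace(&mut self.saver, ToySaver::None)
        {
            let new_id = self.states.len() as u32;
            self.states.push(state);
            self.saver = ToySaver::Saved(new_id);
        }
    }

    // Panics unless a state was saved and the cache was then cleared.
    fn saved_state_id(&mut self) -> u32 {
        match std::mem::replace(&mut self.saver, ToySaver::None) {
            ToySaver::Saved(id) => id,
            other => panic!("no saved state available: {:?}", other),
        }
    }
}
```

Note how the state that was saved under ID 2 can come back under a different ID after clearing, which is why the old ID must not be reused.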

fn set_all_transitions(&mut self, from: LazyStateID, to: LazyStateID)

Set all transitions on the state ‘from’ to ‘to’.

fn set_transition(&mut self, from: LazyStateID, unit: Unit, to: LazyStateID)

Set the transition on ‘from’ for ‘unit’ to ‘to’.

This panics if either ‘from’ or ‘to’ is invalid.

All unit values are OK.
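
As a rough illustration of the two setters, here is a toy row-major transition table. The layout is an assumption made for this sketch only: `ToyTable` is hypothetical, state IDs are plain row indices, and a unit is a single byte, so every `u8` value is a valid column.

```rust
// One column per byte value (an assumption of this sketch).
const STRIDE: usize = 256;

struct ToyTable {
    // Row-major: `STRIDE` entries per state.
    trans: Vec<u32>,
}

impl ToyTable {
    fn new(states: usize) -> ToyTable {
        ToyTable { trans: vec![0; states * STRIDE] }
    }

    fn state_len(&self) -> usize {
        self.trans.len() / STRIDE
    }

    // Panics if either ID is out of range. Every `u8` unit is accepted.
    fn set_transition(&mut self, from: u32, unit: u8, to: u32) {
        assert!((from as usize) < self.state_len(), "invalid 'from' ID");
        assert!((to as usize) < self.state_len(), "invalid 'to' ID");
        self.trans[from as usize * STRIDE + unit as usize] = to;
    }

    // Point every unit of `from` at `to`.
    fn set_all_transitions(&mut self, from: u32, to: u32) {
        for unit in 0..=u8::MAX {
            self.set_transition(from, unit, to);
        }
    }

    fn next(&self, from: u32, unit: u8) -> u32 {
        self.trans[from as usize * STRIDE + unit as usize]
    }
}
```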

fn set_start_state(&mut self, anchored: Anchored, start: Start, id: LazyStateID)

Set the start ID for the given pattern ID (if given) and starting configuration to the ID given.

This panics if ‘id’ is not valid or if a pattern ID is given and ‘starts_for_each_pattern’ is not enabled.

fn get_state_builder(&mut self) -> StateBuilderEmpty

Returns a state builder from this DFA that might have existing capacity. This helps avoid allocs in cases where a state is built that turns out to already be cached.

Callers must put the state builder back with ‘put_state_builder’, otherwise the allocation reuse won’t work.

fn put_state_builder(&mut self, builder: StateBuilderNFA)

Puts the given state builder back into this DFA for reuse.

Note that building a ‘State’ from a builder always creates a new alloc, so callers should always put the builder back.
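
The get/put protocol amounts to lending out a scratch buffer and taking it back. A minimal sketch, assuming a plain `Vec<u8>` as the builder and a hypothetical `ToyCache` that owns it:

```rust
struct ToyCache {
    // Scratch buffer kept between state constructions.
    scratch: Vec<u8>,
}

impl ToyCache {
    // Take the scratch buffer, leaving an empty Vec with no allocation
    // behind.
    fn get_builder(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.scratch)
    }

    // Hand the buffer back, emptied but with its capacity intact.
    fn put_builder(&mut self, mut builder: Vec<u8>) {
        builder.clear();
        self.scratch = builder;
    }
}
```

If the caller forgets `put_builder`, the next `get_builder` returns a buffer with no capacity and the reuse is lost, which is the failure mode the note above warns about.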

Trait Implementations

impl<'i, 'c> Debug for Lazy<'i, 'c>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

impl<'i, 'c> Freeze for Lazy<'i, 'c>

impl<'i, 'c> RefUnwindSafe for Lazy<'i, 'c>

impl<'i, 'c> Send for Lazy<'i, 'c>

impl<'i, 'c> Sync for Lazy<'i, 'c>

impl<'i, 'c> Unpin for Lazy<'i, 'c>

impl<'i, 'c> !UnwindSafe for Lazy<'i, 'c>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.