regex_automata::hybrid::dfa

Struct Lazy

struct Lazy<'i, 'c> {
    dfa: &'i DFA,
    cache: &'c mut Cache,
}

A type that groups methods that require the base NFA/DFA and writable access to the cache.

Fields

dfa: &'i DFA
cache: &'c mut Cache

Implementations

impl<'i, 'c> Lazy<'i, 'c>


fn new(dfa: &'i DFA, cache: &'c mut Cache) -> Lazy<'i, 'c>

Creates a new ‘Lazy’ wrapper for a DFA and its corresponding cache.


fn as_ref<'a>(&'a self) -> LazyRef<'i, 'a>

Return an immutable view by downgrading a writable cache to a read-only cache.
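The downgrade can be pictured as a reborrow of the exclusive cache reference as a shared one. The following is a minimal sketch, not the crate's code: ‘Dfa’, ‘Cache’, ‘state_len’ and ‘pattern_len’ are hypothetical stand-ins invented for illustration.

```rust
// Hypothetical stand-ins for the real `DFA` and `Cache` types.
struct Dfa {
    pattern_len: usize,
}
struct Cache {
    states: Vec<u32>,
}

// Writable wrapper: shared borrow of the DFA, exclusive borrow of the cache.
struct Lazy<'i, 'c> {
    dfa: &'i Dfa,
    cache: &'c mut Cache,
}

// Read-only wrapper: both borrows are shared.
struct LazyRef<'i, 'c> {
    dfa: &'i Dfa,
    cache: &'c Cache,
}

impl<'i, 'c> Lazy<'i, 'c> {
    fn new(dfa: &'i Dfa, cache: &'c mut Cache) -> Lazy<'i, 'c> {
        Lazy { dfa, cache }
    }

    // Reborrow `&mut Cache` as `&Cache`. The view cannot outlive `&self`,
    // so no mutation can happen while the read-only view is alive.
    fn as_ref<'a>(&'a self) -> LazyRef<'i, 'a> {
        LazyRef { dfa: self.dfa, cache: &*self.cache }
    }
}

impl<'i, 'c> LazyRef<'i, 'c> {
    fn state_len(&self) -> usize {
        self.cache.states.len()
    }

    fn pattern_len(&self) -> usize {
        self.dfa.pattern_len
    }
}
```

Note that the returned view keeps the DFA lifetime ‘'i’ but shortens the cache lifetime to that of the borrow of ‘self’.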


fn cache_next_state( &mut self, current: LazyStateID, unit: Unit, ) -> Result<LazyStateID, CacheError>

This is marked as ‘inline(never)’ to avoid bloating methods on ‘DFA’ like ‘next_state’ and ‘next_eoi_state’ that are called in critical areas. The idea is to let the optimizer focus on the other areas of those methods as the hot path.

Here’s an example that justifies ‘inline(never)’:

regex-cli find match hybrid \
  --cache-capacity 100000000 \
  -p '\pL{100}' \
  all-codepoints-utf8-100x

Where ‘all-codepoints-utf8-100x’ is the UTF-8 encoding of every codepoint, in sequence, repeated 100 times.

With ‘inline(never)’, hyperfine reports 1.1s per run. With ‘inline(always)’, hyperfine reports 1.23s. That is roughly a 10% improvement.
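The hot/cold split described here can be sketched with a toy transition table. This is an illustration under assumptions, not the crate's implementation: ‘HotColdCache’, the ‘UNKNOWN’ sentinel, the 256-wide rows and the placeholder arithmetic in the cold path are all made up for the example.

```rust
// Hypothetical sentinel meaning "this transition has not been computed yet".
const UNKNOWN: u32 = u32::MAX;

struct HotColdCache {
    // One row of 256 transitions per state.
    trans: Vec<u32>,
    // Counts how often the cold path ran (for illustration only).
    misses: usize,
}

impl HotColdCache {
    fn new(state_len: usize) -> HotColdCache {
        HotColdCache { trans: vec![UNKNOWN; state_len * 256], misses: 0 }
    }

    // Hot path: a table lookup that is small enough to inline into a
    // search loop.
    #[inline(always)]
    fn next_state(&mut self, current: u32, byte: u8) -> u32 {
        let next = self.trans[current as usize * 256 + byte as usize];
        if next != UNKNOWN {
            return next;
        }
        self.cache_next_state(current, byte)
    }

    // Cold path: kept out of line so that it does not bloat callers.
    #[cold]
    #[inline(never)]
    fn cache_next_state(&mut self, current: u32, byte: u8) -> u32 {
        self.misses += 1;
        // Placeholder for the expensive step of computing a new state.
        let state_len = (self.trans.len() / 256) as u32;
        let next = (current + byte as u32) % state_len;
        self.trans[current as usize * 256 + byte as usize] = next;
        next
    }
}
```

After the first miss fills in a slot, later lookups for the same state and byte stay entirely on the inlined path.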


fn cache_start_group( &mut self, anchored: Anchored, start: Start, ) -> Result<LazyStateID, StartError>

Compute and cache the starting state for the given anchored mode (including its pattern ID, if present) and the starting configuration.

This panics if a pattern ID is given and the DFA isn’t configured to build anchored start states for each pattern.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.


fn cache_start_one( &mut self, nfa_start_id: NFAStateID, start: Start, ) -> Result<LazyStateID, CacheError>

Compute and cache the starting state for the given NFA state ID and the starting configuration. The NFA state ID might be one of the following:

  1. An unanchored start state to match any pattern.
  2. An anchored start state to match any pattern.
  3. An anchored start state for a particular pattern.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
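The three kinds of NFA start state listed above can be sketched as a selection on an anchored mode. This is a toy model under assumptions: ‘ToyNfa’, its fields and ‘start_for’ are hypothetical names, and the ‘Anchored’ enum below only mirrors the shape of the one in the signatures on this page.

```rust
// Toy mirror of an anchored mode: unanchored, anchored for any pattern,
// or anchored for one specific pattern (identified by index here).
#[derive(Clone, Copy)]
enum Anchored {
    No,
    Yes,
    Pattern(usize),
}

// Hypothetical NFA that records its start state IDs.
struct ToyNfa {
    start_unanchored: u32,
    start_anchored: u32,
    start_pattern: Vec<u32>,
}

impl ToyNfa {
    // Pick the NFA start state for the given mode. Returns `None` when the
    // requested pattern does not exist.
    fn start_for(&self, anchored: Anchored) -> Option<u32> {
        match anchored {
            Anchored::No => Some(self.start_unanchored),
            Anchored::Yes => Some(self.start_anchored),
            Anchored::Pattern(pid) => self.start_pattern.get(pid).copied(),
        }
    }
}
```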


fn add_builder_state( &mut self, builder: StateBuilderNFA, idmap: impl Fn(LazyStateID) -> LazyStateID, ) -> Result<LazyStateID, CacheError>

Either add the given builder state to this cache, or return an ID to an equivalent state already in this cache.

In the case where no equivalent state exists, the idmap function given may be used to transform the identifier allocated. This is useful if the caller needs to tag the ID with additional information.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
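The "add or return the existing ID" behavior, together with the ‘idmap’ hook, might look roughly like the sketch below. Everything in it is an assumption made for illustration: ‘DedupCache’, the byte-vector state representation and the ‘MATCH_TAG’ bit are hypothetical, and error handling for a full cache is left out.

```rust
use std::collections::HashMap;

// Hypothetical tag bit that marks a state ID as a match state.
const MATCH_TAG: u32 = 1 << 30;

#[derive(Default)]
struct DedupCache {
    // States in allocation order; the index is the untagged ID.
    states: Vec<Vec<u8>>,
    // Reverse map used to find an equivalent, already cached state.
    states_to_id: HashMap<Vec<u8>, u32>,
}

impl DedupCache {
    fn add_builder_state(
        &mut self,
        state: Vec<u8>,
        idmap: impl Fn(u32) -> u32,
    ) -> u32 {
        // An equivalent state exists: return its ID and skip `idmap`.
        if let Some(&id) = self.states_to_id.get(&state) {
            return id;
        }
        // Otherwise allocate a fresh ID and let the caller tag it.
        let id = idmap(self.states.len() as u32);
        self.states.push(state.clone());
        self.states_to_id.insert(state, id);
        id
    }
}
```

In this sketch the tag chosen by the first caller sticks: a later caller that adds an equivalent state gets the already tagged ID back, and its own ‘idmap’ is never invoked.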


fn add_state( &mut self, state: State, idmap: impl Fn(LazyStateID) -> LazyStateID, ) -> Result<LazyStateID, CacheError>

Allocate a new state ID and add the given state to this cache.

The idmap function given may be used to transform the identifier allocated. This is useful if the caller needs to tag the ID with additional information.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.


fn next_state_id(&mut self) -> Result<LazyStateID, CacheError>

Allocate a new state ID.

This will never return an unknown lazy state ID.

If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.


fn try_clear_cache(&mut self) -> Result<(), CacheError>

Attempt to clear the cache used by this lazy DFA.

If clearing the cache exceeds the minimum number of required cache clearings, then this will return a cache error. In this case, callers should bubble this up as the cache can’t be used until it is reset. Implementations of search should convert this error into a MatchError::gave_up.

If ‘self.state_saver’ is set to save a state, then this state is persisted through cache clearing. Otherwise, the cache is returned to its state after initialization with two exceptions: its clear count is incremented and some of its memory likely has additional capacity. That is, clearing a cache does not release memory.

In either case, any lazy state ID generated by the cache prior to clearing it is invalid after the clearing.
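A simplified model of a clear that can fail is sketched below. It assumes a single hypothetical limit, ‘max_clear_count’, and ignores any other conditions the real implementation may check; ‘ClearableCache’ and the unit-like ‘CacheError’ are stand-ins defined only for this example.

```rust
// Stand-in error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
struct CacheError;

struct ClearableCache {
    states: Vec<u32>,
    clear_count: usize,
    // Hypothetical limit; `None` means clearing is always allowed.
    max_clear_count: Option<usize>,
}

impl ClearableCache {
    fn try_clear(&mut self) -> Result<(), CacheError> {
        if let Some(max) = self.max_clear_count {
            if self.clear_count >= max {
                // The caller is expected to propagate this error.
                return Err(CacheError);
            }
        }
        self.clear_count += 1;
        // `Vec::clear` drops the elements but keeps the allocation, which
        // matches the note that clearing does not release memory.
        self.states.clear();
        Ok(())
    }
}
```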


fn reset_cache(&mut self)

Clears and resets the cache. Resetting the cache means that no states are persisted and the clear count is reset to 0. No heap memory is released.

Note that the caller may reset a cache with a different DFA than the one it was created from, in which case the cache can then be used with the new DFA (and not the old DFA).


fn clear_cache(&mut self)

Clear the cache used by this lazy DFA.

If ‘self.state_saver’ is set to save a state, then this state is persisted through cache clearing. Otherwise, the cache is returned to its state after initialization with two exceptions: its clear count is incremented and some of its memory likely has additional capacity. That is, clearing a cache does not release memory.

In either case, any lazy state ID generated by the cache prior to clearing it is invalid after the clearing.


fn init_cache(&mut self)

Initialize this cache from emptiness to a place where it can be used for search.

This is called both at cache creation time and after the cache has been cleared.

Primarily, this adds the three sentinel states and allocates some initial memory.
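One way to picture the sentinel setup is a table whose first three rows each loop back to themselves. The sketch below is an assumption-laden illustration: the sentinel names ‘UNKNOWN’, ‘DEAD’ and ‘QUIT’, the ‘STRIDE’ value and the ‘SentinelCache’ type are hypothetical and not taken from this page.

```rust
// Hypothetical number of transition slots per state.
const STRIDE: usize = 4;

// Hypothetical sentinel IDs, stored as offsets into the transition table.
const UNKNOWN: u32 = 0;
const DEAD: u32 = STRIDE as u32;
const QUIT: u32 = 2 * STRIDE as u32;

struct SentinelCache {
    trans: Vec<u32>,
}

impl SentinelCache {
    // Bring the table from any prior contents to its initial state.
    fn init(&mut self) {
        self.trans.clear();
        for id in [UNKNOWN, DEAD, QUIT] {
            // Append one row whose transitions all point back to itself.
            self.trans.extend(std::iter::repeat(id).take(STRIDE));
        }
    }
}
```

Because ‘init’ starts by clearing the table, the same routine works both for a freshly created cache and for one that has just been cleared.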


fn save_state(&mut self, id: LazyStateID)

Save the state corresponding to the ID given such that the state persists through a cache clearing.

While the state may persist, the ID may not. In order to discover the new state ID, one must call ‘saved_state_id’ after a cache clearing.


fn saved_state_id(&mut self) -> LazyStateID

Returns the updated lazy state ID for a state that was persisted through a cache clearing.

It is only correct to call this routine when both a state has been saved and the cache has just been cleared. Otherwise, this panics.
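The save, clear, then look-up protocol can be modeled with a small state machine. This is a sketch under assumptions: ‘StateSaver’ and its variants, ‘SavingCache’ and the byte-vector states are hypothetical shapes chosen for the example.

```rust
// Hypothetical tracker for a state that must survive a clear.
enum StateSaver {
    // Nothing is being saved.
    None,
    // A copy of a state to re-add on the next clear.
    ToSave(Vec<u8>),
    // The state was re-added; this is its new ID.
    Saved(u32),
}

struct SavingCache {
    states: Vec<Vec<u8>>,
    saver: StateSaver,
}

impl SavingCache {
    fn save_state(&mut self, id: u32) {
        let state = self.states[id as usize].clone();
        self.saver = StateSaver::ToSave(state);
    }

    fn clear(&mut self) {
        self.states.clear();
        let saver = std::mem::replace(&mut self.saver, StateSaver::None);
        if let StateSaver::ToSave(state) = saver {
            // The state persists, but it receives a fresh ID.
            let new_id = self.states.len() as u32;
            self.states.push(state);
            self.saver = StateSaver::Saved(new_id);
        }
    }

    // Panics unless a state was saved and the cache was then cleared.
    fn saved_state_id(&mut self) -> u32 {
        match std::mem::replace(&mut self.saver, StateSaver::None) {
            StateSaver::Saved(id) => id,
            _ => panic!("no saved state"),
        }
    }
}
```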


fn set_all_transitions(&mut self, from: LazyStateID, to: LazyStateID)

Set all transitions on the state ‘from’ to ‘to’.


fn set_transition(&mut self, from: LazyStateID, unit: Unit, to: LazyStateID)

Set the transition on ‘from’ for ‘unit’ to ‘to’.

This panics if either ‘from’ or ‘to’ is invalid.

All unit values are OK.
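The validity checks can be illustrated with a flat transition table. The sketch assumes that state IDs are offsets into the table (multiples of the stride) and that units have already been mapped to a class index; both are assumptions for this example, and ‘TransTable’, ‘stride2’ and ‘is_valid’ are hypothetical names.

```rust
struct TransTable {
    // log2 of the number of table slots per state.
    stride2: usize,
    // Flat table: the row for a state starts at the offset equal to its ID.
    trans: Vec<u32>,
}

impl TransTable {
    fn new(state_len: usize, stride2: usize) -> TransTable {
        TransTable { stride2, trans: vec![0; state_len << stride2] }
    }

    // An ID is valid if it is in bounds and points at the start of a row.
    fn is_valid(&self, id: u32) -> bool {
        let id = id as usize;
        id < self.trans.len() && id % (1 << self.stride2) == 0
    }

    fn set_transition(&mut self, from: u32, class: usize, to: u32) {
        assert!(self.is_valid(from), "invalid 'from' id");
        assert!(self.is_valid(to), "invalid 'to' id");
        assert!(class < (1 << self.stride2), "class out of range");
        self.trans[from as usize + class] = to;
    }

    fn set_all_transitions(&mut self, from: u32, to: u32) {
        for class in 0..(1usize << self.stride2) {
            self.set_transition(from, class, to);
        }
    }
}
```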


fn set_start_state(&mut self, anchored: Anchored, start: Start, id: LazyStateID)

Set the start ID for the given anchored mode (including its pattern ID, if given) and starting configuration to the ID given.

This panics if ‘id’ is not valid or if a pattern ID is given and ‘starts_for_each_pattern’ is not enabled.


fn get_state_builder(&mut self) -> StateBuilderEmpty

Returns a state builder from this DFA that might have existing capacity. This helps avoid allocations in cases where a state is built that turns out to already be cached.

Callers must put the state builder back with ‘put_state_builder’, otherwise the allocation reuse won’t work.


fn put_state_builder(&mut self, builder: StateBuilderNFA)

Puts the given state builder back into this DFA for reuse.

Note that building a ‘State’ from a builder always creates a new allocation, so callers should always put the builder back.
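The get/put pairing is a common way to reuse a scratch allocation. A minimal sketch, assuming the builder is just a byte buffer: ‘BuilderCache’ and its ‘scratch’ field are hypothetical, and the real builder types are more involved.

```rust
struct BuilderCache {
    // Scratch buffer that is lent out to callers and handed back later.
    scratch: Vec<u8>,
}

impl BuilderCache {
    // Take the scratch buffer, leaving an empty, unallocated Vec behind.
    fn get_state_builder(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.scratch)
    }

    // Hand the buffer back, emptied but with its capacity intact.
    fn put_state_builder(&mut self, mut builder: Vec<u8>) {
        builder.clear();
        self.scratch = builder;
    }
}
```

If a caller forgets to put the builder back, nothing breaks in this sketch; the next ‘get_state_builder’ simply returns a buffer with no capacity and the reuse is lost.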

Trait Implementations

impl<'i, 'c> Debug for Lazy<'i, 'c>


fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

impl<'i, 'c> Freeze for Lazy<'i, 'c>


impl<'i, 'c> RefUnwindSafe for Lazy<'i, 'c>


impl<'i, 'c> Send for Lazy<'i, 'c>


impl<'i, 'c> Sync for Lazy<'i, 'c>


impl<'i, 'c> Unpin for Lazy<'i, 'c>


impl<'i, 'c> !UnwindSafe for Lazy<'i, 'c>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,


fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,


fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,


fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T


fn from(t: T) -> T

Returns the argument unchanged.


impl<T, U> Into<U> for T
where U: From<T>,


fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.


impl<T, U> TryFrom<U> for T
where U: Into<T>,


type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,


type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.