Struct regex_automata::hybrid::dfa::Lazy
struct Lazy<'i, 'c> {
dfa: &'i DFA,
cache: &'c mut Cache,
}
A type that groups methods that require the base NFA/DFA and writable access to the cache.
Fields
dfa: &'i DFA
cache: &'c mut Cache
Implementations
impl<'i, 'c> Lazy<'i, 'c>
fn new(dfa: &'i DFA, cache: &'c mut Cache) -> Lazy<'i, 'c>
Creates a new ‘Lazy’ wrapper for a DFA and its corresponding cache.
fn as_ref<'a>(&'a self) -> LazyRef<'i, 'a>
Return an immutable view by downgrading a writable cache to a read-only cache.
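The downgrade from a writable view to a read-only view can be illustrated with a minimal sketch. The ‘Dfa’ and ‘Cache’ types, their fields, and the helper methods below are hypothetical stand-ins, not the crate's real definitions; only the shape of ‘Lazy’, ‘LazyRef’ and ‘as_ref’ follows the signatures above.

```rust
// Hypothetical stand-ins; the real DFA and Cache types are far richer.
struct Dfa {
    stride: usize,
}

struct Cache {
    states: Vec<u32>,
}

/// Writable view: shared DFA, exclusive access to the cache.
struct Lazy<'i, 'c> {
    dfa: &'i Dfa,
    cache: &'c mut Cache,
}

/// Read-only view: shared DFA, shared access to the cache.
struct LazyRef<'i, 'c> {
    dfa: &'i Dfa,
    cache: &'c Cache,
}

impl<'i, 'c> Lazy<'i, 'c> {
    fn new(dfa: &'i Dfa, cache: &'c mut Cache) -> Lazy<'i, 'c> {
        Lazy { dfa, cache }
    }

    /// Reborrow the exclusive cache reference as a shared one. The view
    /// cannot outlive the borrow of `self`, hence the shorter lifetime 'a.
    fn as_ref<'a>(&'a self) -> LazyRef<'i, 'a> {
        LazyRef { dfa: self.dfa, cache: &*self.cache }
    }

    /// Hypothetical mutating helper, only here to show write access.
    fn push_state(&mut self, state: u32) {
        self.cache.states.push(state);
    }
}

impl<'i, 'c> LazyRef<'i, 'c> {
    /// Hypothetical read-only helpers.
    fn state_count(&self) -> usize {
        self.cache.states.len()
    }

    fn stride(&self) -> usize {
        self.dfa.stride
    }
}
```

While a ‘LazyRef’ obtained this way is alive, the borrow checker rejects any call that needs ‘&mut self’ on the ‘Lazy’, which is what makes the read-only view safe to hand out.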
fn cache_next_state(
    &mut self,
    current: LazyStateID,
    unit: Unit,
) -> Result<LazyStateID, CacheError>
This is marked as ‘inline(never)’ to avoid bloating methods on ‘DFA’ like ‘next_state’ and ‘next_eoi_state’ that are called in critical areas. The idea is to let the optimizer focus on the other areas of those methods as the hot path.
Here’s an example that justifies ‘inline(never)’:
    regex-cli find match hybrid \
      --cache-capacity 100000000 \
      -p '\pL{100}' \
      all-codepoints-utf8-100x
Where ‘all-codepoints-utf8-100x’ is the UTF-8 encoding of every codepoint, in sequence, repeated 100 times.
With ‘inline(never)’, hyperfine reports 1.1s per run. With ‘inline(always)’, hyperfine reports 1.23s. So ‘inline(never)’ is roughly 10% faster here.
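The pattern described above, a small hot path that falls back to an out-of-line slow path, might look like the following sketch. The ‘Table’ type, the ‘UNKNOWN’ sentinel, the 256-entry stride and the toy transition function are all assumptions made for illustration; they are not the crate's internals.

```rust
/// Hypothetical sentinel meaning "transition not computed yet".
const UNKNOWN: u32 = u32::MAX;

struct Table {
    /// Flat transition table indexed by `state * 256 + byte`.
    transitions: Vec<u32>,
    /// Counts slow-path calls, purely for illustration.
    slow_calls: usize,
}

impl Table {
    fn new(states: usize) -> Table {
        Table { transitions: vec![UNKNOWN; states * 256], slow_calls: 0 }
    }

    /// Hot path: one table lookup and one branch.
    #[inline(always)]
    fn next_state(&mut self, state: u32, byte: u8) -> u32 {
        let next = self.transitions[state as usize * 256 + byte as usize];
        if next != UNKNOWN {
            return next;
        }
        self.cache_next_state(state, byte)
    }

    /// Slow path: kept out of line so its body does not bloat callers.
    #[inline(never)]
    fn cache_next_state(&mut self, state: u32, byte: u8) -> u32 {
        self.slow_calls += 1;
        // Placeholder for the real work of computing a transition.
        let states = (self.transitions.len() / 256) as u32;
        let next = (state + byte as u32) % states;
        self.transitions[state as usize * 256 + byte as usize] = next;
        next
    }
}
```

Repeating the same lookup only takes the slow path once; every later call is served from the table.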
fn cache_start_group(
    &mut self,
    anchored: Anchored,
    start: Start,
) -> Result<LazyStateID, StartError>
Compute and cache the starting state for the given anchored mode, including its pattern ID if one is present (‘Anchored::Pattern’), and the starting configuration.
This panics if a pattern ID is given and the DFA isn’t configured to build anchored start states for each pattern.
This will never return an unknown lazy state ID.
If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
fn cache_start_one(
    &mut self,
    nfa_start_id: NFAStateID,
    start: Start,
) -> Result<LazyStateID, CacheError>
Compute and cache the starting state for the given NFA state ID and the starting configuration. The NFA state ID might be one of the following:
- An unanchored start state to match any pattern.
- An anchored start state to match any pattern.
- An anchored start state for a particular pattern.
This will never return an unknown lazy state ID.
If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
fn add_builder_state(
    &mut self,
    builder: StateBuilderNFA,
    idmap: impl Fn(LazyStateID) -> LazyStateID,
) -> Result<LazyStateID, CacheError>
Either add the given builder state to this cache, or return an ID to an equivalent state already in this cache.
In the case where no equivalent state exists, the idmap function given may be used to transform the identifier allocated. This is useful if the caller needs to tag the ID with additional information.
This will never return an unknown lazy state ID.
If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
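The "reuse an equivalent state, otherwise allocate and tag" behavior can be sketched with a toy cache. Everything below is hypothetical: states are plain byte vectors, IDs are ‘u32’ indices, ‘TAG_MATCH’ is an invented tag bit, and a full cache simply returns an error rather than going through the cache-clearing logic the real implementation uses.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct CacheError;

/// Hypothetical tag bit a caller might set on a freshly allocated ID.
const TAG_MATCH: u32 = 1 << 31;

/// Toy cache: states are byte vectors, IDs are indices into `states`.
struct Cache {
    states: Vec<Vec<u8>>,
    states_to_id: HashMap<Vec<u8>, u32>,
    capacity: usize,
}

impl Cache {
    fn new(capacity: usize) -> Cache {
        Cache { states: Vec::new(), states_to_id: HashMap::new(), capacity }
    }

    /// Return the ID of an equivalent cached state, or allocate a new one.
    /// `idmap` is applied only to freshly allocated IDs.
    fn add_or_get(
        &mut self,
        state: Vec<u8>,
        idmap: impl Fn(u32) -> u32,
    ) -> Result<u32, CacheError> {
        if let Some(&id) = self.states_to_id.get(&state) {
            return Ok(id);
        }
        // Simplification: a full cache is an error in this sketch.
        if self.states.len() >= self.capacity {
            return Err(CacheError);
        }
        let id = idmap(self.states.len() as u32);
        self.states.push(state.clone());
        self.states_to_id.insert(state, id);
        Ok(id)
    }
}
```

Because the lookup happens before allocation, passing a different ‘idmap’ for a state that is already cached has no effect: the previously stored ID is returned unchanged.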
fn add_state(
    &mut self,
    state: State,
    idmap: impl Fn(LazyStateID) -> LazyStateID,
) -> Result<LazyStateID, CacheError>
Allocate a new state ID and add the given state to this cache.
The idmap function given may be used to transform the identifier allocated. This is useful if the caller needs to tag the ID with additional information.
This will never return an unknown lazy state ID.
If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
fn next_state_id(&mut self) -> Result<LazyStateID, CacheError>
Allocate a new state ID.
This will never return an unknown lazy state ID.
If caching this state would otherwise result in a cache that has been cleared too many times, then an error is returned.
fn try_clear_cache(&mut self) -> Result<(), CacheError>
Attempt to clear the cache used by this lazy DFA.
If clearing the cache exceeds the minimum number of required cache clearings, then this will return a cache error. In this case, callers should bubble this up as the cache can’t be used until it is reset. Implementations of search should convert this error into a MatchError::gave_up.
If ‘self.state_saver’ is set to save a state, then this state is persisted through cache clearing. Otherwise, the cache is returned to its state after initialization with two exceptions: its clear count is incremented and some of its memory likely has additional capacity. That is, clearing a cache does not release memory.
In either case, any lazy state ID generated by the cache before it was cleared is invalid afterward.
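A rough sketch of the clear-count check follows. The ‘Config’ and ‘Cache’ types and their fields are hypothetical, and the check is deliberately reduced to a bare counter comparison; the real implementation may weigh additional configuration before giving up.

```rust
#[derive(Debug, PartialEq)]
struct CacheError;

/// Hypothetical configuration: `None` means "clear as often as needed".
struct Config {
    minimum_cache_clear_count: Option<usize>,
}

/// Toy cache holding only what this sketch needs.
struct Cache {
    states: Vec<u32>,
    clear_count: usize,
}

/// Clear the cache unless it has already been cleared too many times.
fn try_clear_cache(config: &Config, cache: &mut Cache) -> Result<(), CacheError> {
    if let Some(min_count) = config.minimum_cache_clear_count {
        if cache.clear_count >= min_count {
            // The caller should give up on this search.
            return Err(CacheError);
        }
    }
    // `Vec::clear` drops the elements but keeps the allocation.
    cache.states.clear();
    cache.clear_count += 1;
    Ok(())
}

/// Resetting also zeroes the clear count, making the cache usable again.
fn reset_cache(cache: &mut Cache) {
    cache.states.clear();
    cache.clear_count = 0;
}
```

With a limit of one clearing, the first call succeeds and the second fails until the cache is reset.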
fn reset_cache(&mut self)
Clears and resets the cache. Resetting the cache means that no states are persisted and the clear count is reset to 0. No heap memory is released.
Note that the caller may reset a cache with a different DFA than what it was created from. In which case, the cache can now be used with the new DFA (and not the old DFA).
fn clear_cache(&mut self)
Clear the cache used by this lazy DFA.
If ‘self.state_saver’ is set to save a state, then this state is persisted through cache clearing. Otherwise, the cache is returned to its state after initialization with two exceptions: its clear count is incremented and some of its memory likely has additional capacity. That is, clearing a cache does not release memory.
In either case, any lazy state ID generated by the cache before it was cleared is invalid afterward.
fn init_cache(&mut self)
Initialize this cache from emptiness to a place where it can be used for search.
This is called both at cache creation time and after the cache has been cleared.
Primarily, this adds the three sentinel states and allocates some initial memory.
fn save_state(&mut self, id: LazyStateID)
Save the state corresponding to the ID given such that the state persists through a cache clearing.
While the state may persist, the ID may not. In order to discover the new state ID, one must call ‘saved_state_id’ after a cache clearing.
fn saved_state_id(&mut self) -> LazyStateID
Returns the updated lazy state ID for a state that was persisted through a cache clearing.
It is only correct to call this routine when both a state has been saved and the cache has just been cleared. Otherwise, this panics.
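The save, clear, then look-up-the-new-ID protocol can be sketched as a small state machine. The ‘StateSaver’ enum, the toy ‘Cache’ and the ‘clear’ method below are hypothetical illustrations of the behavior described above, not the crate's definitions; in particular, this toy cache has no sentinel states, so the re-added state lands at index 0.

```rust
/// Hypothetical three-phase saver.
enum StateSaver {
    /// Nothing to persist.
    None,
    /// A copy of the state to carry across the next clearing.
    ToSave(Vec<u8>),
    /// The new ID of the state after a clearing.
    Saved(u32),
}

/// Toy cache: states are byte vectors, IDs are indices into `states`.
struct Cache {
    states: Vec<Vec<u8>>,
    saver: StateSaver,
}

impl Cache {
    /// Remember the state behind `id` so it survives the next clearing.
    fn save_state(&mut self, id: u32) {
        let state = self.states[id as usize].clone();
        self.saver = StateSaver::ToSave(state);
    }

    /// Drop all states, then re-add the saved one (if any) under a new ID.
    fn clear(&mut self) {
        self.states.clear();
        match std::mem::replace(&mut self.saver, StateSaver::None) {
            StateSaver::None => {}
            StateSaver::Saved(_) => panic!("saved ID was never collected"),
            StateSaver::ToSave(state) => {
                self.states.push(state);
                let new_id = (self.states.len() - 1) as u32;
                self.saver = StateSaver::Saved(new_id);
            }
        }
    }

    /// Only valid right after a clearing that had a state to save.
    fn saved_state_id(&mut self) -> u32 {
        match std::mem::replace(&mut self.saver, StateSaver::None) {
            StateSaver::Saved(id) => id,
            _ => panic!("no saved state ID available"),
        }
    }
}
```

The old ID passed to ‘save_state’ must not be used after the clearing; the caller swaps it for whatever ‘saved_state_id’ returns.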
fn set_all_transitions(&mut self, from: LazyStateID, to: LazyStateID)
Set all transitions on the state ‘from’ to ‘to’.
fn set_transition(&mut self, from: LazyStateID, unit: Unit, to: LazyStateID)
Set the transition on ‘from’ for ‘unit’ to ‘to’.
This panics if either ‘from’ or ‘to’ is invalid.
All unit values are OK.
fn set_start_state(&mut self, anchored: Anchored, start: Start, id: LazyStateID)
Set the start ID for the given pattern ID (if given) and starting configuration to the ID given.
This panics if ‘id’ is not valid or if a pattern ID is given and ‘starts_for_each_pattern’ is not enabled.
fn get_state_builder(&mut self) -> StateBuilderEmpty
Returns a state builder from this DFA that might have existing capacity. This helps avoid allocs in cases where a state is built that turns out to already be cached.
Callers must put the state builder back with ‘put_state_builder’, otherwise the allocation reuse won’t work.
fn put_state_builder(&mut self, builder: StateBuilderNFA)
Puts the given state builder back into this DFA for reuse.
Note that building a ‘State’ from a builder always creates a new alloc, so callers should always put the builder back.
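The get/put pairing is a common scratch-buffer reuse pattern. A minimal sketch, assuming the builder is nothing more than a ‘Vec<u8>’ kept inside a hypothetical ‘Cache’ type (the real builder types carry more structure):

```rust
/// Toy cache owning one reusable scratch buffer.
struct Cache {
    scratch: Vec<u8>,
}

impl Cache {
    /// Hand out the scratch buffer, emptied but with its capacity intact.
    /// `std::mem::take` leaves an empty, allocation-free Vec behind.
    fn get_state_builder(&mut self) -> Vec<u8> {
        let mut builder = std::mem::take(&mut self.scratch);
        builder.clear();
        builder
    }

    /// Return the buffer so that the next `get_state_builder` reuses it.
    fn put_state_builder(&mut self, builder: Vec<u8>) {
        self.scratch = builder;
    }
}

/// Build an owned "state" from the builder. This copies the bytes into a
/// fresh allocation, so the builder itself remains available for reuse.
fn build_state(builder: &[u8]) -> Vec<u8> {
    builder.to_vec()
}
```

If the caller forgets to put the builder back, the next ‘get_state_builder’ call starts from an empty ‘Vec’ with no capacity, and the allocation savings are lost.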