Struct regex_automata::nfa::thompson::range_trie::RangeTrie

source ·
pub struct RangeTrie {
    states: Vec<State>,
    free: Vec<State>,
    iter_stack: RefCell<Vec<NextIter>>,
    iter_ranges: RefCell<Vec<Utf8Range>>,
    dupe_stack: Vec<NextDupe>,
    insert_stack: Vec<NextInsert>,
}
Expand description

A range trie represents an ordered set of sequences of bytes.

A range trie accepts as input a sequence of byte ranges and merges them into the existing set such that the trie can produce a sorted non-overlapping sequence of byte ranges. The sequence emitted corresponds precisely to the sequence of bytes matched by the given keys, although the byte ranges themselves may be split at different boundaries.

The order complexity of this data structure seems difficult to analyze. If the size of a byte is held as a constant, then insertion is clearly O(n) where n is the number of byte ranges in the input key. However, if k=256 is our alphabet size, then insertion could be O(k^2 * n). In particular it seems possible for pathological inputs to cause insertion to do a lot of work. However, for what we use this data structure for, there should be no pathological inputs since the ultimate source is always a sorted set of Unicode scalar value ranges.

Internally, this trie is setup like a finite state machine. Note though that it is acyclic.

Fields§

§states: Vec<State>

The states in this trie. The first is always the shared final state. The second is always the root state. Otherwise, there is no particular order.

§free: Vec<State>

A free-list of states. When a range trie is cleared, all of its states are added to this list. Creating a new state reuses states from this list before allocating a new one.

§iter_stack: RefCell<Vec<NextIter>>

A stack for traversing this trie to yield sequences of byte ranges in lexicographic order.

§iter_ranges: RefCell<Vec<Utf8Range>>

A buffer that stores the current sequence during iteration.

§dupe_stack: Vec<NextDupe>

A stack used for traversing the trie in order to (deeply) duplicate a state. States are recursively duplicated when ranges are split.

§insert_stack: Vec<NextInsert>

A stack used for traversing the trie during insertion of a new sequence of byte ranges.

Implementations§

source§

impl RangeTrie

source

pub fn new() -> RangeTrie

Create a new empty range trie.

source

pub fn clear(&mut self)

Clear this range trie such that it is empty. Clearing a range trie and reusing it can beneficial because this may reuse allocations.

source

pub fn iter<E, F: FnMut(&[Utf8Range]) -> Result<(), E>>( &self, f: F, ) -> Result<(), E>

Iterate over all of the sequences of byte ranges in this trie, and call the provided function for each sequence. Iteration occurs in lexicographic order.

source

pub fn insert(&mut self, ranges: &[Utf8Range])

Inserts a new sequence of ranges into this trie.

The sequence given must be non-empty and must not have a length exceeding 4.

source

pub fn add_empty(&mut self) -> StateID

source

fn duplicate(&mut self, old_id: StateID) -> StateID

Performs a deep clone of the given state and returns the duplicate’s state ID.

A “deep clone” in this context means that the state given along with recursively all states that it points to are copied. Once complete, the given state ID and the returned state ID share nothing.

This is useful during range trie insertion when a new range overlaps with an existing range that is bigger than the new one. The part of the existing range that does not overlap with the new one is duplicated so that adding the new range to the overlap doesn’t disturb the non-overlapping portion.

There’s one exception: if old_id is the final state, then it is not duplicated and the same final state is returned. This is because all final states in this trie are equivalent.

source

fn add_transition( &mut self, from_id: StateID, range: Utf8Range, next_id: StateID, )

Adds the given transition to the given state.

Callers must ensure that all previous transitions in this state are lexicographically smaller than the given range.

source

fn add_transition_at( &mut self, i: usize, from_id: StateID, range: Utf8Range, next_id: StateID, )

Like add_transition, except this inserts the transition just before the ith transition.

source

fn set_transition_at( &mut self, i: usize, from_id: StateID, range: Utf8Range, next_id: StateID, )

Overwrites the transition at position i with the given transition.

source

fn state(&self, id: StateID) -> &State

Return an immutable borrow for the state with the given ID.

source

fn state_mut(&mut self, id: StateID) -> &mut State

Return a mutable borrow for the state with the given ID.

Trait Implementations§

source§

impl Clone for RangeTrie

source§

fn clone(&self) -> RangeTrie

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl Debug for RangeTrie

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> ToOwned for T
where T: Clone,

§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.