struct SlotTable {
table: Vec<Option<NonMaxUsize>>,
slots_per_state: usize,
slots_for_captures: usize,
}
Expand description
A table of slots, where each row represent a state in an NFA. Thus, the table has room for storing slots for every single state in an NFA.
This table is represented with a single contiguous allocation. In general, the notion of “capturing group” doesn’t really exist at this level of abstraction, hence the name “slot” instead. (Indeed, every capturing group maps to a pair of slots, one for the start offset and one for the end offset.) Slots are indexed by the ‘Captures’ NFA state.
N.B. Not every state actually needs a row of slots. Namely, states that only have epsilon transitions currently never have anything written to their rows in this table. Thus, the table is somewhat wasteful in its heap usage. However, it is important to maintain fast random access by state ID, which means one giant table tends to work well. RE2 takes a different approach here and allocates each row as its own reference counted thing. I explored such a strategy at one point here, but couldn’t get it to work well using entirely safe code. (To the ambitious reader: I encourage you to re-litigate that experiment.) I very much wanted to stick to safe code, but could be convinced otherwise if there was a solid argument and the safety was encapsulated well.
Fields§
§table: Vec<Option<NonMaxUsize>>
The actual table of offsets.
slots_per_state: usize
The number of slots per state, i.e., the table’s stride or the length of each row.
slots_for_captures: usize
The number of slots in the caller-provided ‘Captures’ value for the current search. Setting this to ‘slots_per_state’ is always correct, but may be wasteful.
Implementations§
source§impl SlotTable
impl SlotTable
sourcefn new() -> SlotTable
fn new() -> SlotTable
Create a new slot table.
One should call ‘reset’ with the corresponding PikeVM before use.
sourcefn reset(&mut self, re: &PikeVM)
fn reset(&mut self, re: &PikeVM)
Reset this slot table such that it can be used with the given PikeVM (and only that PikeVM).
sourcefn memory_usage(&self) -> usize
fn memory_usage(&self) -> usize
Return the heap memory usage, in bytes, used by this slot table.
This does not include the stack size of this value.
sourcefn setup_search(&mut self, captures_slot_len: usize)
fn setup_search(&mut self, captures_slot_len: usize)
Perform any per-search setup for this slot table.
In particular, this sets the length of the number of slots used in the ‘Captures’ given by the caller (if any at all). This number may be smaller than the total number of slots available, e.g., when the caller is only interested in tracking the overall match and not the spans of every matching capturing group. Only tracking the overall match can save a substantial amount of time copying capturing spans during a search.
sourcefn for_state(&mut self, sid: StateID) -> &mut [Option<NonMaxUsize>]
fn for_state(&mut self, sid: StateID) -> &mut [Option<NonMaxUsize>]
Return a mutable slice of the slots for the given state.
Note that the length of the slice returned may be less than the total number of slots available for this state. In particular, the length always matches the number of slots indicated via ‘setup_search’.
sourcefn all_absent(&mut self) -> &mut [Option<NonMaxUsize>]
fn all_absent(&mut self) -> &mut [Option<NonMaxUsize>]
Return a slice of slots of appropriate length where every slot offset is guaranteed to be absent. This is useful in cases where you need to compute an epsilon closure outside of the user supplied regex, and thus never want it to have any capturing slots set.