trait SearcherT: Debug + Send + Sync + UnwindSafe + RefUnwindSafe + 'static {
// Required method
unsafe fn find(&self, start: *const u8, end: *const u8) -> Option<Match>;
}
Expand description
A trait that provides dynamic dispatch over the different possible Teddy variants on the same algorithm.
On x86_64
for example, it isn’t known until runtime which of 12 possible
variants will be used. One might use one of the four slim 128-bit vector
variants, or one of the four 256-bit vector variants or even one of the
four fat 256-bit vector variants.
Since this choice is generally made when the Teddy searcher is constructed and this choice is based on the patterns given and what the current CPU supports, it follows that there must be some kind of indirection at search time that “selects” the variant chosen at build time.
There are a few different ways to go about this. One approach is to use an enum. It works fine, but in my experiments, this generally results in worse codegen. Another approach, which is what we use here, is dynamic dispatch via a trait object. We basically implement this trait for each possible variant, select the variant we want at build time and convert it to a trait object for use at search time.
Another approach is to use function pointers and stick each of the possible
variants into a union. This is essentially isomorphic to the dynamic
dispatch approach, but doesn’t require any allocations. Since this crate
requires alloc
, there’s no real reason (AFAIK) to go down this path. (The
memchr
crate does this.)
Required Methods§
sourceunsafe fn find(&self, start: *const u8, end: *const u8) -> Option<Match>
unsafe fn find(&self, start: *const u8, end: *const u8) -> Option<Match>
Execute a search on the given haystack (identified by start
and end
raw pointers).
§Safety
Essentially, the start
and end
pointers must be valid and point
to a haystack one can read. As long as you derive them from, for
example, a &[u8]
, they should automatically satisfy all of the safety
obligations:
- Both
start
andend
must be valid for reads. - Both
start
andend
must point to an initialized value. - Both
start
andend
must point to the same allocated object and must either be in bounds or at most one byte past the end of the allocated object. - Both
start
andend
must be derived from a pointer to the same object. - The distance between
start
andend
must not overflowisize
. - The distance being in bounds must not rely on “wrapping around” the address space.
- It must be the case that
start <= end
. end - start
must be greater than the minimum length for this searcher.
Also, it is expected that implementations of this trait will tag this
method with a target_feature
attribute. Callers must ensure that
they are executing this method in an environment where that attribute
is valid.