struct Prefilter {
call: unsafe fn(strat: &Prefilter, haystack: &[u8]) -> Option<usize>,
kind: PrefilterKind,
rarest_byte: u8,
rarest_offset: u8,
}
The implementation of a prefilter.
This type encapsulates dispatch to one of several possible choices for a prefilter. Generally speaking, all prefilters have the same approximate algorithm: they choose a couple of bytes from the needle that are believed to be rare, use a fast vector algorithm to look for those bytes and return positions as candidates for some substring search algorithm (currently only Two-Way) to confirm as a match or not.
The differences between the algorithms are actually at the vector implementation level. Namely, we need different routines based on both which target architecture we’re on and what CPU features are supported.
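For a concrete sense of that selection, here is a minimal, self-contained sketch of runtime dispatch on CPU features. It uses std’s is_x86_feature_detected! for brevity and stand-in names of its own; the crate performs its own detection because it targets core-only environments, and its actual constructors differ.

// Sketch only: pick a routine based on runtime CPU feature detection.
// `Routine` and `select_routine` are stand-ins, not this crate's API.
#[derive(Clone, Copy, Debug)]
enum Routine {
    Avx2,
    Sse2,
    Scalar,
}

#[cfg(target_arch = "x86_64")]
fn select_routine() -> Routine {
    if is_x86_feature_detected!("avx2") {
        Routine::Avx2
    } else if is_x86_feature_detected!("sse2") {
        // SSE2 is guaranteed on x86_64, so this is effectively the floor.
        Routine::Sse2
    } else {
        Routine::Scalar
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn select_routine() -> Routine {
    Routine::Scalar
}

fn main() {
    println!("selected: {:?}", select_routine());
}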
The straightforwardly obvious approach here is to use an enum, and make Prefilter::find do case analysis to determine which algorithm was selected and invoke it. However, I’ve observed that this leads to poor codegen in some cases, especially in latency sensitive benchmarks. That is, this approach comes with overhead that I wasn’t able to eliminate.
The second obvious approach is to use dynamic dispatch with traits. Doing that in this context where Prefilter owns the selection generally requires heap allocation, and this code is designed to run in core-only environments.
So we settle on using a union (that’s PrefilterKind) and a function pointer (that’s PrefilterKindFn). We select the right function pointer based on which field in the union we set, and that function in turn knows which field of the union to access. The downside of this approach is that it forces us to think about safety, but the upside is that there are some nice latency improvements to benchmarks. (Especially the memmem/sliceslice/short benchmark.)
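To make the union + function pointer arrangement concrete, here is a minimal, self-contained sketch of the same pattern with made-up payload types. It is not this crate’s actual layout; it only illustrates how the function pointer and the union field are chosen together.

// Illustrative names only; not this crate's actual layout.
#[derive(Clone, Copy)]
struct OneByte { byte: u8 }

#[derive(Clone, Copy)]
struct TwoBytes { byte1: u8, byte2: u8 }

// Whichever field is initialized is decided at construction time.
union Kind {
    one: OneByte,
    two: TwoBytes,
}

struct Dispatcher {
    // The function pointer is chosen together with the union field, so each
    // function knows which field it may read.
    call: unsafe fn(&Dispatcher, &[u8]) -> Option<usize>,
    kind: Kind,
}

impl Dispatcher {
    fn one(byte: u8) -> Dispatcher {
        Dispatcher { call: find_one, kind: Kind { one: OneByte { byte } } }
    }

    fn two(byte1: u8, byte2: u8) -> Dispatcher {
        Dispatcher { call: find_two, kind: Kind { two: TwoBytes { byte1, byte2 } } }
    }

    fn find(&self, haystack: &[u8]) -> Option<usize> {
        // SAFETY: `call` was installed by the constructor that initialized
        // the union field it reads.
        unsafe { (self.call)(self, haystack) }
    }
}

unsafe fn find_one(d: &Dispatcher, haystack: &[u8]) -> Option<usize> {
    // SAFETY: only ever installed by `Dispatcher::one`.
    let byte = unsafe { d.kind.one.byte };
    haystack.iter().position(|&b| b == byte)
}

unsafe fn find_two(d: &Dispatcher, haystack: &[u8]) -> Option<usize> {
    // SAFETY: only ever installed by `Dispatcher::two`.
    let TwoBytes { byte1, byte2 } = unsafe { d.kind.two };
    haystack.iter().position(|&b| b == byte1 || b == byte2)
}

For example, Dispatcher::two(b'q', b'z').find(b"quartz") returns Some(0), and find_one never touches the two field.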
In cases where we’ve selected a vector algorithm and the haystack given is too short, we fall back to the scalar version of memchr on the rarest_byte. (The scalar version of memchr is still better than a naive byte-at-a-time loop because it will read in usize-sized chunks at a time.)
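A minimal sketch of that guard follows. The threshold and vector routine are passed in as parameters because the real minimum length depends on which routine was selected, and memchr::memchr stands in for the internal scalar single-byte search; none of these names are the crate’s own code.

// Sketch of the short-haystack guard described above.
fn find_candidate(
    rarest_byte: u8,
    min_haystack_len: usize,
    vector_find: impl Fn(&[u8]) -> Option<usize>,
    haystack: &[u8],
) -> Option<usize> {
    if haystack.len() < min_haystack_len {
        // Too short for the vector routine: fall back to searching for the
        // rarest byte, which still reads usize-sized chunks at a time.
        memchr::memchr(rarest_byte, haystack)
    } else {
        vector_find(haystack)
    }
}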
Fields
call: unsafe fn(strat: &Prefilter, haystack: &[u8]) -> Option<usize>
kind: PrefilterKind
rarest_byte: u8
rarest_offset: u8
Implementations
impl Prefilter
fn fallback<R: HeuristicFrequencyRank>(ranker: R, pair: Pair, needle: &[u8]) -> Option<Prefilter>
Return a “fallback” prefilter, but only if it is believed to be effective.
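As one way to picture what the ranker is for, here is a hedged sketch of choosing the two rarest needle positions, assuming a rank function where a lower value means a rarer byte; the actual Pair and HeuristicFrequencyRank machinery differs in its details.

// Sketch only: return the offsets of the two rarest bytes in the needle.
fn rarest_pair(rank: impl Fn(u8) -> u8, needle: &[u8]) -> Option<(usize, usize)> {
    if needle.len() < 2 {
        return None;
    }
    // Track the offsets of the two rarest bytes seen so far.
    let (mut idx1, mut idx2) = (0, 1);
    if rank(needle[idx2]) < rank(needle[idx1]) {
        core::mem::swap(&mut idx1, &mut idx2);
    }
    for (i, &b) in needle.iter().enumerate().skip(2) {
        if rank(b) < rank(needle[idx1]) {
            idx2 = idx1;
            idx1 = i;
        } else if rank(b) < rank(needle[idx2]) {
            idx2 = i;
        }
    }
    // idx1 is the rarest position, idx2 the second rarest.
    Some((idx1, idx2))
}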
fn sse2(finder: Finder, needle: &[u8]) -> Prefilter
Return a prefilter using an x86_64 SSE2 vector algorithm.
fn avx2(finder: Finder, needle: &[u8]) -> Prefilter
Return a prefilter using an x86_64 AVX2 vector algorithm.
fn find(&self, haystack: &[u8]) -> Option<usize>
Return a candidate position for a match.
When this returns an offset, it implies that a match could begin at that offset, but it may not. That is, it is possible for a false positive to be returned.
When None is returned, then it is guaranteed that there are no matches for the needle in the given haystack. That is, it is impossible for a false negative to be returned.
The purpose of this routine is to look for candidate matching positions as quickly as possible before running a (likely) slower confirmation step.
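The following is a minimal sketch of how a caller might drive such a routine, with a closure standing in for Prefilter::find and a plain slice comparison standing in for the Two-Way confirmation; it assumes the prefilter reports the first candidate in the slice it is given.

// Sketch only: `prefilter_find` returns the first candidate offset within
// the slice it is given, or None if no match can exist in that slice.
fn search(
    prefilter_find: impl Fn(&[u8]) -> Option<usize>,
    needle: &[u8],
    haystack: &[u8],
) -> Option<usize> {
    let mut at = 0;
    while at + needle.len() <= haystack.len() {
        // A candidate may be a false positive, but None means no match
        // exists at or after `at`, so we can stop immediately.
        let candidate = at + prefilter_find(&haystack[at..])?;
        if candidate + needle.len() > haystack.len() {
            // No room left for a full match at or beyond the candidate.
            return None;
        }
        if &haystack[candidate..candidate + needle.len()] == needle {
            return Some(candidate);
        }
        // False positive: resume the scan just past the candidate.
        at = candidate + 1;
    }
    None
}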
fn find_simple(&self, haystack: &[u8]) -> Option<usize>
A “simple” prefilter that just looks for the occurrence of the rarest byte from the needle. This is generally only used for very small haystacks.
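Conceptually, this amounts to something like the following hedged sketch, which assumes rarest_offset is the rarest byte’s position within the needle and uses memchr::memchr as a stand-in; the actual implementation differs.

// Sketch only: find the rarest byte, then report a candidate start that is
// rarest_offset bytes earlier, since that is where a match containing the
// byte at that needle position would begin.
fn find_simple_sketch(
    rarest_byte: u8,
    rarest_offset: u8,
    haystack: &[u8],
) -> Option<usize> {
    memchr::memchr(rarest_byte, haystack)
        .map(|i| i.saturating_sub(usize::from(rarest_offset)))
}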