pub(crate) struct Finder<V> {
pair: Pair,
v1: V,
v2: V,
min_haystack_len: usize,
}
Expand description
A generic architecture dependent “packed pair” finder.
This finder picks two bytes that it believes have high predictive power
for indicating an overall match of a needle. Depending on whether
Finder::find
or Finder::find_prefilter
is used, it reports offsets
where the needle matches or could match. In the prefilter case, candidates
are reported whenever the Pair
of bytes given matches.
This is architecture dependent because it uses specific vector operations to look for occurrences of the pair of bytes.
This type is not meant to be exported and is instead meant to be used as
the implementation for architecture specific facades. Why? Because it’s a
bit of a quirky API that requires inline(always)
annotations. And pretty
much everything has safety obligations due (at least) to the caller needing
to inline calls into routines marked with
#[target_feature(enable = "...")]
.
Fields§
§pair: Pair
§v1: V
§v2: V
§min_haystack_len: usize
Implementations§
source§impl<V: Vector> Finder<V>
impl<V: Vector> Finder<V>
sourcepub(crate) unsafe fn new(needle: &[u8], pair: Pair) -> Finder<V>
pub(crate) unsafe fn new(needle: &[u8], pair: Pair) -> Finder<V>
Create a new pair searcher. The searcher returned can either report
exact matches of needle
or act as a prefilter and report candidate
positions of needle
.
§Safety
Callers must ensure that whatever vector type this routine is called with is supported by the current environment.
Callers must also ensure that needle.len() >= 2
.
sourcepub(crate) unsafe fn find(
&self,
haystack: &[u8],
needle: &[u8],
) -> Option<usize>
pub(crate) unsafe fn find( &self, haystack: &[u8], needle: &[u8], ) -> Option<usize>
Searches the given haystack for the given needle. The needle given should be the same as the needle that this finder was initialized with.
§Panics
When haystack.len()
is less than Finder::min_haystack_len
.
§Safety
Since this is meant to be used with vector functions, callers need to
specialize this inside of a function with a target_feature
attribute.
Therefore, callers must ensure that whatever target feature is being
used supports the vector functions that this function is specialized
for. (For the specific vector functions used, see the Vector trait
implementations.)
sourcepub(crate) unsafe fn find_prefilter(&self, haystack: &[u8]) -> Option<usize>
pub(crate) unsafe fn find_prefilter(&self, haystack: &[u8]) -> Option<usize>
Searches the given haystack for offsets that represent candidate
matches of the needle
given to this finder’s constructor. The offsets
returned, if they are a match, correspond to the starting offset of
needle
in the given haystack
.
§Panics
When haystack.len()
is less than Finder::min_haystack_len
.
§Safety
Since this is meant to be used with vector functions, callers need to
specialize this inside of a function with a target_feature
attribute.
Therefore, callers must ensure that whatever target feature is being
used supports the vector functions that this function is specialized
for. (For the specific vector functions used, see the Vector trait
implementations.)
sourceunsafe fn find_in_chunk(
&self,
needle: &[u8],
cur: *const u8,
end: *const u8,
mask: V::Mask,
) -> Option<usize>
unsafe fn find_in_chunk( &self, needle: &[u8], cur: *const u8, end: *const u8, mask: V::Mask, ) -> Option<usize>
Search for an occurrence of our byte pair from the needle in the chunk pointed to by cur, with the end of the haystack pointed to by end. When an occurrence is found, memcmp is run to check if a match occurs at the corresponding position.
mask
should have bits set corresponding the positions in the chunk
in which matches are considered. This is only used for the last vector
load where the beginning of the vector might have overlapped with the
last load in the main loop. The mask lets us avoid visiting positions
that have already been discarded as matches.
§Safety
It must be safe to do an unaligned read of size(V) bytes starting at both (cur + self.index1) and (cur + self.index2). It must also be safe to do unaligned loads on cur up to (end - needle.len()).
sourceunsafe fn find_prefilter_in_chunk(&self, cur: *const u8) -> Option<usize>
unsafe fn find_prefilter_in_chunk(&self, cur: *const u8) -> Option<usize>
Search for an occurrence of our byte pair from the needle in the chunk pointed to by cur, with the end of the haystack pointed to by end. When an occurrence is found, memcmp is run to check if a match occurs at the corresponding position.
§Safety
It must be safe to do an unaligned read of size(V) bytes starting at both (cur + self.index1) and (cur + self.index2). It must also be safe to do unaligned reads on cur up to (end - needle.len()).
sourcepub(crate) fn pair(&self) -> &Pair
pub(crate) fn pair(&self) -> &Pair
Returns the pair of offsets (into the needle) used to check as a predicate before confirming whether a needle exists at a particular position.
sourcepub(crate) fn min_haystack_len(&self) -> usize
pub(crate) fn min_haystack_len(&self) -> usize
Returns the minimum haystack length that this Finder
can search.
Providing a haystack to this Finder
shorter than this length is
guaranteed to result in a panic.