pub struct Finder {
hash: Hash,
hash_2pow: u32,
}
A forward substring searcher using the Rabin-Karp algorithm.
Note that, as a lower level API, a `Finder` does not have access to the needle it was constructed with. For this reason, executing a search with a `Finder` requires passing both the needle and the haystack, where the needle is exactly equivalent to the one given to the `Finder` at construction time. This design was chosen so that callers can have more precise control over where and how many times a needle is stored, for example when Rabin-Karp is just one of several possible substring search algorithms.
Fields
hash: Hash
The actual hash of the needle.
hash_2pow: u32
The factor by which a byte is multiplied before it is subtracted from the hash. It is defined to be 2^(n-1) (using wrapping exponentiation), where n is the length of the needle. This is how we “remove” a byte from the hash once the hash window rolls past it.
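The interplay between the two fields can be illustrated with a minimal rolling-hash sketch. It assumes the hash is updated as "double, then add the byte" with wrapping `u32` arithmetic; the names `RollingHash`, `of`, `add`, `del`, `roll`, and the free function `hash_2pow` are hypothetical and chosen for illustration, not taken from the crate.

```rust
/// Hypothetical rolling hash mirroring the two fields described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RollingHash(u32);

impl RollingHash {
    /// Hash all of `bytes`: each step doubles the hash and adds the byte.
    pub fn of(bytes: &[u8]) -> RollingHash {
        let mut h = RollingHash(0);
        for &b in bytes {
            h.add(b);
        }
        h
    }

    /// Append a byte at the right edge of the window.
    pub fn add(&mut self, byte: u8) {
        self.0 = self.0.wrapping_shl(1).wrapping_add(u32::from(byte));
    }

    /// Remove the leftmost byte, whose weight in the hash is 2^(n-1).
    pub fn del(&mut self, hash_2pow: u32, byte: u8) {
        self.0 = self.0.wrapping_sub(hash_2pow.wrapping_mul(u32::from(byte)));
    }

    /// Slide the window by one byte: drop `old`, then append `new`.
    pub fn roll(&mut self, hash_2pow: u32, old: u8, new: u8) {
        self.del(hash_2pow, old);
        self.add(new);
    }
}

/// Compute 2^(n-1) with wrapping arithmetic, where n is the needle length.
/// Assumption of this sketch: an empty needle (n == 0) yields 1.
pub fn hash_2pow(needle_len: usize) -> u32 {
    let mut p = 1u32;
    for _ in 1..needle_len {
        p = p.wrapping_mul(2);
    }
    p
}
```

With a window of length 3 over `b"abcde"`, rolling the hash of `abc` forward by one byte gives the same value as hashing `bcd` from scratch, which is what lets the search advance in constant time per position.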
Implementations
impl Finder
pub fn new(needle: &[u8]) -> Finder
Create a new Rabin-Karp forward searcher for the given `needle`.
The needle may be empty. The empty needle matches at every byte offset.
Note that callers must pass the same needle to all search calls using this `Finder`.
pub fn find(&self, haystack: &[u8], needle: &[u8]) -> Option<usize>
Return the first occurrence of the `needle` in the `haystack` given. If no such occurrence exists, then `None` is returned.
The `needle` provided must match the needle given to this finder at construction time.
The maximum value this can return is `haystack.len()`, which can only occur when the needle and haystack both have length zero. Otherwise, for non-empty haystacks, the maximum value is `haystack.len() - 1`.
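The return-value bounds above can be checked against a small, self-contained Rabin-Karp search. This is a sketch under stated assumptions, not the crate's implementation: `rk_find` is a hypothetical free function that recomputes the needle hash on every call, and it assumes a "double, then add the byte" wrapping `u32` hash.

```rust
/// Hypothetical stand-in for `Finder::find`, written as a free function.
pub fn rk_find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    /// Wrapping hash: h = 2*h + byte for each byte in order.
    fn hash(bytes: &[u8]) -> u32 {
        bytes
            .iter()
            .fold(0u32, |h, &b| h.wrapping_shl(1).wrapping_add(u32::from(b)))
    }

    let n = needle.len();
    if haystack.len() < n {
        return None;
    }

    // 2^(n-1) with wrapping arithmetic; stays 1 when n is 0 or 1.
    let mut pow = 1u32;
    for _ in 1..n {
        pow = pow.wrapping_shl(1);
    }

    let want = hash(needle);
    let mut have = hash(&haystack[..n]);
    let mut i = 0;
    loop {
        // Equal hashes are only a hint, so confirm with a byte comparison.
        if have == want && &haystack[i..i + n] == needle {
            return Some(i);
        }
        if i + n >= haystack.len() {
            return None;
        }
        // Roll the window: drop haystack[i], then append haystack[i + n].
        have = have
            .wrapping_sub(pow.wrapping_mul(u32::from(haystack[i])))
            .wrapping_shl(1)
            .wrapping_add(u32::from(haystack[i + n]));
        i += 1;
    }
}
```

In this sketch, `rk_find(b"", b"")` returns `Some(0)`, which equals `haystack.len()`, while `rk_find(b"abc", b"c")` returns `Some(2)`, which equals `haystack.len() - 1`.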
pub unsafe fn find_raw(&self, hstart: *const u8, hend: *const u8, nstart: *const u8, nend: *const u8) -> Option<*const u8>
Like `find`, but accepts and returns raw pointers.
When a match is found, the pointer returned is guaranteed to be `>= start` and `<= end`. The pointer returned is only ever equivalent to `end` when both the needle and haystack are empty. (That is, the empty string matches the empty string.)
This routine is useful if you’re already using raw pointers and would like to avoid converting back to a slice before executing a search.
Safety
Note that `start` and `end` below refer to both pairs of pointers given to this routine. That is, the conditions apply to both `hstart`/`hend` and `nstart`/`nend`.
- Both `start` and `end` must be valid for reads.
- Both `start` and `end` must point to an initialized value.
- Both `start` and `end` must point to the same allocated object and must either be in bounds or at most one byte past the end of the allocated object.
- Both `start` and `end` must be derived from a pointer to the same object.
- The distance between `start` and `end` must not overflow `isize`.
- The distance being in bounds must not rely on “wrapping around” the address space.
- It must be the case that `start <= end`.
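One straightforward way to satisfy all of these conditions is to derive each pointer pair from a live slice. The sketch below illustrates that pattern and how to turn the returned pointer back into a byte offset. Both `find_raw_naive` and `find_via_raw` are hypothetical names; `find_raw_naive` is a naive byte-by-byte search (not Rabin-Karp) that only imitates the pointer-based shape of `find_raw` so that the example is self-contained.

```rust
/// Naive stand-in with the same pointer-based shape as `find_raw`.
///
/// # Safety
///
/// Each of the pairs `hstart`/`hend` and `nstart`/`nend` must satisfy the
/// conditions listed above.
pub unsafe fn find_raw_naive(
    hstart: *const u8,
    hend: *const u8,
    nstart: *const u8,
    nend: *const u8,
) -> Option<*const u8> {
    // SAFETY: the caller guarantees that each pair delimits one initialized,
    // in-bounds range with start <= end, so the distances are non-negative
    // and the ranges can be viewed as slices.
    let (haystack, needle) = unsafe {
        let hlen = hend.offset_from(hstart) as usize;
        let nlen = nend.offset_from(nstart) as usize;
        (
            std::slice::from_raw_parts(hstart, hlen),
            std::slice::from_raw_parts(nstart, nlen),
        )
    };
    if needle.len() > haystack.len() {
        return None;
    }
    for i in 0..=(haystack.len() - needle.len()) {
        if &haystack[i..i + needle.len()] == needle {
            // SAFETY: i <= haystack.len(), so the result stays within the
            // haystack or lands exactly one byte past its end.
            return Some(unsafe { hstart.add(i) });
        }
    }
    None
}

/// Safe wrapper: derive both pointer pairs from slices, then convert the
/// returned pointer back into an offset into `haystack`.
pub fn find_via_raw(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    let h = haystack.as_ptr_range();
    let n = needle.as_ptr_range();
    // SAFETY: each range comes from a single live slice, so start and end
    // share an allocation, are readable and initialized, and start <= end.
    let found = unsafe { find_raw_naive(h.start, h.end, n.start, n.end) }?;
    Some((found as usize) - (h.start as usize))
}
```

Deriving the pointers with `as_ptr_range` keeps the safety argument local to the wrapper: the slice type already guarantees every condition in the list.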