Expand description
An implementation of the Two-Way substring search algorithm.
Finder
can be built for forward searches, while FinderRev
can be built
for reverse searches.
Two-Way makes for a nice general purpose substring search algorithm because of
its time and space complexity properties. It also performs well in practice.
Namely, with m = len(needle)
and n = len(haystack)
, Two-Way takes O(m)
time to create a finder, O(1)
space and O(n)
search time. In other words,
the preprocessing step is quick, doesnβt require any heap memory and the worst
case search time is guaranteed to be linear in the haystack regardless of the
size of the needle.
While vector algorithms will usually beat Two-Way handedly, vector algorithms also usually have pathological or edge cases that are better handled by Two-Way. Moreover, not all targets support vector algorithms or implementations for them simply may not exist yet.
Two-Way can be found in the memmem
implementations in at least GNU libc and
musl.
Structs§
- Approximate
Byte πSet A bitset used to track whether a particular byte exists in a needle or not. - A forward substring searcher that uses the Two-Way algorithm.
- A reverse substring searcher that uses the Two-Way algorithm.
- Suffix πA suffix extracted from a needle along with its period.
- TwoWay πAn implementation of the TwoWay substring search algorithm.
Enums§
- Shift πA representation of the amount weβre allowed to shift by during Two-Way search.
- Suffix
Kind πThe kind of suffix to extract. - Suffix
Ordering πThe result of comparing corresponding bytes between two suffixes.