enum Shift {
Small {
period: usize,
},
Large {
shift: usize,
},
}
Expand description
A representation of the amount we’re allowed to shift by during Two-Way search.
When computing a critical factorization of the needle, we find the position of the critical factorization by finding the needle’s maximal (or minimal) suffix, along with the period of that suffix. It turns out that the period of that suffix is a lower bound on the period of the needle itself.
This lower bound is equivalent to the actual period of the needle in
some cases. To describe that case, we denote the needle as x
where
x = uv
and v
is the lexicographic maximal suffix of v
. The lower
bound given here is always the period of v
, which is <= period(x)
. The
case where period(v) == period(x)
occurs when len(u) < (len(x) / 2)
and
where u
is a suffix of v[0..period(v)]
.
This case is important because the search algorithm for when the
periods are equivalent is slightly different than the search algorithm
for when the periods are not equivalent. In particular, when they aren’t
equivalent, we know that the period of the needle is no less than half its
length. In this case, we shift by an amount less than or equal to the
period of the needle (determined by the maximum length of the components
of the critical factorization of x
, i.e., max(len(u), len(v))
)..
The above two cases are represented by the variants below. Each entails a different instantiation of the Two-Way search algorithm.
N.B. If we could find a way to compute the exact period in all cases, then we could collapse this case analysis and simplify the algorithm. The Two-Way paper suggests this is possible, but more reading is required to grok why the authors didn’t pursue that path.
Variants§
Implementations§
source§impl Shift
impl Shift
sourcefn forward(
needle: &[u8],
period_lower_bound: usize,
critical_pos: usize,
) -> Shift
fn forward( needle: &[u8], period_lower_bound: usize, critical_pos: usize, ) -> Shift
Compute the shift for a given needle in the forward direction.
This requires a lower bound on the period and a critical position. These can be computed by extracting both the minimal and maximal lexicographic suffixes, and choosing the right-most starting position. The lower bound on the period is then the period of the chosen suffix.
sourcefn reverse(
needle: &[u8],
period_lower_bound: usize,
critical_pos: usize,
) -> Shift
fn reverse( needle: &[u8], period_lower_bound: usize, critical_pos: usize, ) -> Shift
Compute the shift for a given needle in the reverse direction.
This requires a lower bound on the period and a critical position. These can be computed by extracting both the minimal and maximal lexicographic suffixes, and choosing the left-most starting position. The lower bound on the period is then the period of the chosen suffix.