Struct aho_corasick::util::buffer::Buffer
pub(crate) struct Buffer {
    buf: Vec<u8>,
    min: usize,
    end: usize,
}
A fairly simple roll buffer for supporting stream searches.
This buffer acts as a temporary place to store a fixed amount of data when
reading from a stream. Its central purpose is to allow “rolling” some
suffix of the data to the beginning of the buffer before refilling it with
more data from the stream. For example, let’s say we are trying to match
“foobar” on a stream. When we report the match, we’d like to not only
report the correct offsets at which the match occurs, but also the matching
bytes themselves. So let’s say our stream is a file with the following
contents: “test test foobar test test”. Now assume that we happen to read
the aforementioned file in two chunks: “test test foo” and “bar test test”.
Naively, it would not be possible to report a single contiguous “foobar”
match, but this roll buffer allows us to do that. Namely, after the second
read, the contents of the buffer should be “st foobar test test”, where the
search should ultimately resume immediately after “foo”. (The prefix “st ”
is included because the roll buffer saves N bytes at the end of the buffer,
where N is the maximum possible length of a match.)
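The two-chunk scenario above can be traced with a small sketch. This is not the crate's implementation: `roll_then_append` and `replay_example` are hypothetical helpers, the buffer is modeled as a growable `Vec<u8>` rather than fixed-size storage, and N is assumed to be 6, the length of “foobar”.

```rust
/// Hypothetical helper: keep only the last `min` bytes of `buf` (the "roll"),
/// then append the next chunk read from the stream.
fn roll_then_append(buf: &mut Vec<u8>, min: usize, next_chunk: &[u8]) {
    // Drop everything except the trailing `min` bytes.
    let keep_from = buf.len().saturating_sub(min);
    buf.drain(..keep_from);
    // Refill with the next chunk from the stream.
    buf.extend_from_slice(next_chunk);
}

/// Replays the two reads from the example and returns the buffer contents
/// after the second read.
fn replay_example() -> Vec<u8> {
    let min = "foobar".len(); // assumed maximum match length (N = 6)
    let mut buf = b"test test foo".to_vec(); // first read
    roll_then_append(&mut buf, min, b"bar test test"); // second read
    buf
}
```

Under these assumptions, `replay_example()` yields the bytes of “st foobar test test”, so “foobar” is contiguous in the buffer even though it straddled two reads.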
A lot of the logic for dealing with this is unfortunately split out between
this roll buffer and the StreamChunkIter.
Note also that this buffer is not actually required just to report matches,
because a Match is just some offsets. But it is required for supporting
things like try_stream_replace_all, because that needs some mechanism for
knowing which bytes in the stream correspond to a match and which don’t. So
when a match occurs across two read calls, something needs to retain
the bytes from the previous read call, because you don’t know before the
second read call whether a match exists or not.
Fields
buf: Vec<u8>
The raw buffer contents. This has a fixed size and never increases.
min: usize
The minimum size of the buffer, which is equivalent to the maximum possible length of a match. This corresponds to the number of bytes that we roll.
end: usize
The end of the contents of this buffer.
Implementations
impl Buffer
pub(crate) fn new(min_buffer_len: usize) -> Buffer
Create a new buffer for stream searching. The minimum buffer length given should be the maximum possible match length.
pub(crate) fn min_buffer_len(&self) -> usize
Return the minimum size of the buffer. The only way a buffer may be smaller than this is if the stream itself contains fewer bytes than the minimum buffer amount.
fn free_buffer(&mut self) -> &mut [u8]
Return all free capacity in this buffer.
pub(crate) fn fill<R: Read>(&mut self, rdr: R) -> Result<bool>
Refill the contents of this buffer by reading as much as possible into this buffer’s free capacity. If no more bytes could be read, then this returns false. Otherwise, this reads until it has filled the buffer past the minimum amount.
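The read loop described here might look roughly like the following. It is a minimal sketch written from the description above, not the crate's source: `FillSketch` is a hypothetical stand-in type, and its capacity of `8 * min` is an arbitrary choice for illustration.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for the buffer described above (not the crate's type).
struct FillSketch {
    buf: Vec<u8>, // fixed-size storage
    min: usize,   // minimum number of bytes we want buffered
    end: usize,   // end of the valid contents in `buf`
}

impl FillSketch {
    /// The capacity (8 * min, with min at least 1) is an assumption of this sketch.
    fn new(min: usize) -> FillSketch {
        let min = min.max(1);
        FillSketch { buf: vec![0; 8 * min], min, end: 0 }
    }

    /// Read into the free capacity until at least `min` bytes are buffered
    /// or the reader is exhausted. Returns `Ok(false)` only if no bytes at
    /// all could be read by this call.
    fn fill<R: Read>(&mut self, mut rdr: R) -> io::Result<bool> {
        let mut read_any = false;
        loop {
            // A single `read` may return fewer bytes than requested,
            // which is why this is a loop.
            let n = rdr.read(&mut self.buf[self.end..])?;
            if n == 0 {
                return Ok(read_any);
            }
            read_any = true;
            self.end += n;
            if self.end >= self.min {
                return Ok(true);
            }
        }
    }
}
```

With `min = 3` and a reader that yields “ab” and then “cdef”, the first `read` leaves the buffer below the minimum, so the loop reads again before returning `true`. Note that in this sketch a completely full buffer also produces `false`, so the caller is expected to roll before refilling.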
pub(crate) fn roll(&mut self)
Roll the contents of the buffer so that the suffix of this buffer is moved to the front and all other contents are dropped. The size of the suffix corresponds precisely to the minimum buffer length.
This should only be called when the entire contents of this buffer have been searched.
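For fixed-size storage, the roll can be expressed as an in-place copy. The following is a sketch under the assumption that at least `min` bytes are buffered when it is called. `RollSketch` and its `contents` accessor are hypothetical names for illustration and are not the crate's API.

```rust
/// Hypothetical stand-in for the buffer described above (not the crate's type).
struct RollSketch {
    buf: Vec<u8>, // fixed-size storage
    min: usize,   // number of trailing bytes preserved by a roll
    end: usize,   // end of the valid contents in `buf`
}

impl RollSketch {
    /// Move the last `min` bytes of the contents to the front of the buffer
    /// and treat everything else as free capacity.
    fn roll(&mut self) {
        // Assumption of this sketch: at least `min` bytes are buffered.
        let roll_start = self
            .end
            .checked_sub(self.min)
            .expect("fewer than `min` bytes buffered");
        // `copy_within` handles overlapping source and destination ranges.
        self.buf.copy_within(roll_start..self.end, 0);
        self.end = self.min;
    }

    /// The valid (already filled) portion of the buffer.
    fn contents(&self) -> &[u8] {
        &self.buf[..self.end]
    }
}
```

For a buffer holding “test test foo” with `min = 6`, rolling leaves “st foo” at the front, and the storage length is unchanged.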