Struct aho_corasick::automaton::StreamChunkIter

struct StreamChunkIter<'a, A, R> {
    aut: &'a A,
    rdr: R,
    buf: Buffer,
    start: StateID,
    sid: StateID,
    absolute_pos: usize,
    buffer_pos: usize,
    buffer_reported_pos: usize,
}

An iterator that reports matches in a stream.

(This type doesn’t implement the Iterator trait, because each item it returns borrows from a buffer that the iterator itself owns. That’s OK: it still has a next method, so it can be driven in a loop much like an iterator.)
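To illustrate why a type like this cannot implement Iterator, here is a minimal sketch of the same "lending" shape. ChunkedBytes, its fields, and collect_chunks are hypothetical names invented for this example and are not part of aho_corasick; the point is only that next returns a slice borrowed from a buffer the iterator owns.

```rust
/// Hypothetical lending-style iterator: each item borrows from a buffer that
/// the iterator itself owns, which `Iterator::next` cannot express.
struct ChunkedBytes {
    buf: Vec<u8>,
    pos: usize,
    chunk_len: usize,
}

impl ChunkedBytes {
    fn new(buf: Vec<u8>, chunk_len: usize) -> ChunkedBytes {
        assert!(chunk_len > 0, "chunk_len must be non-zero");
        ChunkedBytes { buf, pos: 0, chunk_len }
    }

    /// The returned slice is tied to the `&mut self` borrow, so the caller
    /// must be finished with it before calling `next` again.
    fn next(&mut self) -> Option<&[u8]> {
        if self.pos >= self.buf.len() {
            return None;
        }
        let start = self.pos;
        let end = (start + self.chunk_len).min(self.buf.len());
        self.pos = end;
        Some(&self.buf[start..end])
    }
}

/// Drives the iterator with `while let`, since `for` requires `Iterator`.
/// Each chunk is copied out because it cannot outlive the next call.
fn collect_chunks(mut it: ChunkedBytes) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    while let Some(chunk) = it.next() {
        out.push(chunk.to_vec());
    }
    out
}
```

A caller that wants to keep an item past the following call to next has to copy it, as collect_chunks does here.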

This iterator yields elements of type io::Result<StreamChunk>, where an error is reported if there was a problem reading from the underlying stream. The iterator terminates only when the underlying stream reaches EOF.

The idea here is that each chunk represents either a match or a non-match, and if you concatenated all of the chunks together, you’d reproduce the entire contents of the stream, byte-for-byte.

This chunk machinery is a bit complicated and it isn’t strictly required for a stream searcher that just reports matches. But we do need something like this to deal with the “replacement” API, which needs to know which chunks it can copy and which it needs to replace.
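The chunk idea can be sketched on an in-memory haystack. Everything below is an assumption made for illustration: Chunk is a hypothetical stand-in for StreamChunk, and split_chunks uses a naive scan for a single literal rather than an automaton. It shows the two properties described above: concatenating all chunks reproduces the input, and a replacement routine only needs to copy non-match chunks and substitute match chunks.

```rust
/// Hypothetical stand-in for the chunk type described above.
#[derive(Debug, PartialEq)]
enum Chunk<'a> {
    NonMatch(&'a [u8]),
    Match(&'a [u8]),
}

/// Splits `haystack` into non-match and match chunks for one literal
/// `needle`, using a naive scan (not an Aho-Corasick automaton).
fn split_chunks<'a>(haystack: &'a [u8], needle: &[u8]) -> Vec<Chunk<'a>> {
    assert!(!needle.is_empty(), "needle must be non-empty");
    let mut chunks = Vec::new();
    let mut reported = 0; // end of the bytes already emitted as chunks
    let mut i = 0;
    while i + needle.len() <= haystack.len() {
        if &haystack[i..i + needle.len()] == needle {
            // Emit the gap before the match, if there is one.
            if reported < i {
                chunks.push(Chunk::NonMatch(&haystack[reported..i]));
            }
            chunks.push(Chunk::Match(&haystack[i..i + needle.len()]));
            i += needle.len();
            reported = i;
        } else {
            i += 1;
        }
    }
    // Whatever is left after the last match is a trailing non-match.
    if reported < haystack.len() {
        chunks.push(Chunk::NonMatch(&haystack[reported..]));
    }
    chunks
}

/// Concatenates every chunk, which reproduces the input byte-for-byte.
fn concat(chunks: &[Chunk<'_>]) -> Vec<u8> {
    let mut out = Vec::new();
    for chunk in chunks {
        match chunk {
            Chunk::NonMatch(bytes) | Chunk::Match(bytes) => out.extend_from_slice(bytes),
        }
    }
    out
}

/// Copies non-match chunks verbatim and writes `replacement` for each match.
fn replace_all(chunks: &[Chunk<'_>], replacement: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for chunk in chunks {
        match chunk {
            Chunk::NonMatch(bytes) => out.extend_from_slice(bytes),
            Chunk::Match(_) => out.extend_from_slice(replacement),
        }
    }
    out
}
```

The streaming version is harder only because the haystack is never fully in memory, which is what the buffer fields below are for.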

Fields§

§aut: &'a A

The underlying automaton to do the search.

§rdr: R

The source of bytes we read from.

§buf: Buffer

A roll buffer for managing bytes from rdr. It handles the case of a match that is split across two different calls to rdr.read(). This wouldn’t be strictly necessary if we only reported matches, but here we report chunks of both non-matches and matches, so we cannot treat the stream as non-overlapping blocks of bytes. We need to permit some overlap by retaining bytes from a previous read call in memory.
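As a rough sketch of what "rolling" means here, the following hypothetical RollBuffer keeps the last min bytes of the previous read at the front of the buffer before reading more. The type, its fields, and its method names are assumptions made for this example; the crate’s actual Buffer type is not shown on this page and may differ.

```rust
use std::io::{self, Read};

/// Hypothetical roll buffer: `min` is how many trailing bytes survive a roll.
struct RollBuffer {
    buf: Vec<u8>,
    min: usize,
    end: usize, // number of valid bytes at the front of `buf`
}

impl RollBuffer {
    fn new(min: usize, capacity: usize) -> RollBuffer {
        assert!(min > 0 && capacity > min, "capacity must exceed min");
        RollBuffer { buf: vec![0; capacity], min, end: 0 }
    }

    /// The bytes currently held in the buffer.
    fn buffer(&self) -> &[u8] {
        &self.buf[..self.end]
    }

    /// Moves the last `min` valid bytes to the front and discards the rest.
    /// Returns how many bytes were dropped from the front.
    fn roll(&mut self) -> usize {
        let keep_from = self.end.saturating_sub(self.min);
        self.buf.copy_within(keep_from..self.end, 0);
        self.end -= keep_from;
        keep_from
    }

    /// Performs one read into the free space after the valid bytes.
    /// Returns `Ok(false)` when the reader reports EOF.
    fn fill<R: Read>(&mut self, mut rdr: R) -> io::Result<bool> {
        // A full buffer would make a zero-length read look like EOF.
        assert!(self.end < self.buf.len(), "roll before filling a full buffer");
        let n = rdr.read(&mut self.buf[self.end..])?;
        self.end += n;
        Ok(n > 0)
    }
}
```

With min at least as large as the longest possible match minus one, a match that straddles two reads still sits contiguously in the buffer after the roll.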

§start: StateID

The unanchored starting state of this automaton.

§sid: StateID

The current state of the automaton.

§absolute_pos: usize

The absolute position over the entire stream.

§buffer_pos: usize

The position we’re currently at within buf.

§buffer_reported_pos: usize

The buffer position of the end of the bytes that we last returned to the caller. Whenever we find a match, we compare the position where the match started with this position. If there is a gap between them, we first need to return a ‘NonMatch’ chunk covering it.

Implementations§

impl<'a, A: Automaton, R: Read> StreamChunkIter<'a, A, R>

fn new(aut: &'a A, rdr: R) -> Result<StreamChunkIter<'a, A, R>, MatchError>

fn next(&mut self) -> Option<Result<StreamChunk<'_>>>

fn get_match_chunk(&self, mat: Match) -> Range<usize>

Return a match chunk for the given match. It is assumed that the match ends at the current buffer_pos.

fn get_non_match_chunk(&self, mat: Match) -> Option<Range<usize>>

Return a non-match chunk, if necessary, just before reporting a match. This returns None if there is nothing to report. Otherwise, this assumes that the given match ends at the current buffer_pos.
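The arithmetic behind these two helpers can be sketched as free functions. This is a guess at the logic based only on the descriptions above, not the crate’s source; the function names and the choice to pass the match length as a plain usize are assumptions made for the example.

```rust
use std::ops::Range;

/// Buffer range of a match of length `mat_len`, assuming the match ends at
/// the current `buffer_pos`.
fn match_chunk(buffer_pos: usize, mat_len: usize) -> Range<usize> {
    (buffer_pos - mat_len)..buffer_pos
}

/// Buffer range of the unreported bytes that precede a match ending at
/// `buffer_pos`, or `None` if the match starts right where the last
/// reported chunk ended.
fn non_match_chunk(
    buffer_reported_pos: usize,
    buffer_pos: usize,
    mat_len: usize,
) -> Option<Range<usize>> {
    let mat_start = buffer_pos - mat_len;
    if mat_start > buffer_reported_pos {
        Some(buffer_reported_pos..mat_start)
    } else {
        None
    }
}
```

For example, with 2 bytes already reported and a 3-byte match ending at position 10, the non-match chunk is 2..7 and the match chunk is 7..10.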

fn get_pre_roll_non_match_chunk(&self) -> Option<Range<usize>>

Look for any bytes that should be reported as a non-match just before rolling the buffer.

Note that this only reports bytes up to buffer.len() - min_buffer_len, as it’s not possible to know whether the bytes following that will participate in a match or not.

fn get_eof_non_match_chunk(&self) -> Option<Range<usize>>

Return any unreported bytes as a non-match up to the end of the buffer.

This should only be called when the entire contents of the buffer have been searched and EOF has been hit when trying to fill the buffer.
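The pre-roll and EOF cases differ only in where the reportable region ends. The sketch below is an assumption based on the descriptions above rather than the crate’s source, with hypothetical function names and plain usize parameters standing in for the iterator’s fields.

```rust
use std::ops::Range;

/// Before a roll, only bytes up to `buffer_len - min_buffer_len` can be
/// reported, because the trailing `min_buffer_len` bytes may still turn out
/// to be part of a match.
fn pre_roll_non_match_chunk(
    buffer_reported_pos: usize,
    buffer_len: usize,
    min_buffer_len: usize,
) -> Option<Range<usize>> {
    let end = buffer_len.saturating_sub(min_buffer_len);
    if buffer_reported_pos < end {
        Some(buffer_reported_pos..end)
    } else {
        None
    }
}

/// At EOF nothing more can arrive, so every unreported byte up to the end of
/// the buffer is a non-match.
fn eof_non_match_chunk(
    buffer_reported_pos: usize,
    buffer_len: usize,
) -> Option<Range<usize>> {
    if buffer_reported_pos < buffer_len {
        Some(buffer_reported_pos..buffer_len)
    } else {
        None
    }
}
```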

fn get_match(&self) -> Match

Return the match at the current position for the current state.

This panics if self.aut.is_match(self.sid) isn’t true.

Trait Implementations§

impl<'a, A: Debug, R: Debug> Debug for StreamChunkIter<'a, A, R>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<'a, A, R> Freeze for StreamChunkIter<'a, A, R>
where R: Freeze,

impl<'a, A, R> RefUnwindSafe for StreamChunkIter<'a, A, R>

impl<'a, A, R> Send for StreamChunkIter<'a, A, R>
where R: Send, A: Sync,

impl<'a, A, R> Sync for StreamChunkIter<'a, A, R>
where R: Sync, A: Sync,

impl<'a, A, R> Unpin for StreamChunkIter<'a, A, R>
where R: Unpin,

impl<'a, A, R> UnwindSafe for StreamChunkIter<'a, A, R>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.