Struct regex_automata::util::look::LookMatcher
pub struct LookMatcher {
    lineterm: DebugByte,
}
A matcher for look-around assertions.
This matcher permits configuring aspects of how look-around assertions are matched.
§Example
A LookMatcher can change the line terminator used for matching multi-line anchors such as (?m:^) and (?m:$).
use regex_automata::{
    nfa::thompson::{self, pikevm::PikeVM},
    util::look::LookMatcher,
    Match, Input,
};

let mut lookm = LookMatcher::new();
lookm.set_line_terminator(b'\x00');

let re = PikeVM::builder()
    .thompson(thompson::Config::new().look_matcher(lookm))
    .build(r"(?m)^[a-z]+$")?;
let mut cache = re.create_cache();

// Multi-line assertions now use NUL as a terminator.
assert_eq!(
    Some(Match::must(0, 1..4)),
    re.find(&mut cache, b"\x00abc\x00"),
);
// ... and \n is no longer recognized as a terminator.
assert_eq!(
    None,
    re.find(&mut cache, b"\nabc\n"),
);
Fields§
§lineterm: DebugByte
Implementations§
impl LookMatcher
pub fn new() -> LookMatcher
Creates a new default matcher for look-around assertions.
pub fn set_line_terminator(&mut self, byte: u8) -> &mut LookMatcher
Sets the line terminator for use with (?m:^) and (?m:$).
Namely, instead of ^ matching immediately after \n and $ matching immediately before \n, this causes them to match immediately after and immediately before the given byte.
It can occasionally be useful to use this to configure the line terminator to the NUL byte when searching binary data.
Note that this does not apply to CRLF-aware line anchors such as (?Rm:^) and (?Rm:$). CRLF-aware line anchors are hard-coded to use \r and \n.
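To make the effect of a custom terminator concrete, the sketch below splits a haystack into the spans that (?m:^) and (?m:$) would anchor to. It is a std-only model of the rule described above, not code from regex-automata; line_spans is a hypothetical helper name.

```rust
/// Returns the half-open byte ranges of the "lines" in `haystack` when
/// `term` is treated as the line terminator.
///
/// Hypothetical helper for illustration; not part of regex-automata.
fn line_spans(haystack: &[u8], term: u8) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    let mut start = 0;
    for (i, &b) in haystack.iter().enumerate() {
        if b == term {
            // A line ends immediately before the terminator...
            spans.push((start, i));
            // ...and the next line starts immediately after it.
            start = i + 1;
        }
    }
    // The end of the haystack always closes the final line.
    spans.push((start, haystack.len()));
    spans
}
```

With a NUL terminator, b"\x00abc\x00" yields the spans 0..0, 1..4 and 5..5, which lines up with the 1..4 match in the example at the top of this page. With the same terminator, b"\nabc\n" is a single span 0..5 that still contains both \n bytes, so ^[a-z]+$ has nothing to match.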
pub fn get_line_terminator(&self) -> u8
Returns the line terminator that was configured for this matcher.
If no line terminator was configured, then this returns \n.
Note that the line terminator should only be used for matching (?m:^) and (?m:$) assertions. It specifically should not be used for matching the CRLF-aware assertions (?Rm:^) and (?Rm:$).
pub fn matches(&self, look: Look, haystack: &[u8], at: usize) -> bool
Returns true when the position at in haystack satisfies the given look-around assertion.
§Panics
This panics when testing a Unicode word boundary assertion while the Unicode word data is not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
Since it’s generally expected that this routine is called inside of a matching engine, callers should check the error condition when building the matching engine. If there is a Unicode word boundary in the matcher and the data isn’t available, then the matcher should fail to build.
Callers can check the error condition with LookSet::available.
This also may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
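Callers that cannot otherwise guarantee a valid position may want to validate at before asking about an assertion. The following is a minimal std-only sketch of such a guard; check_at and at_end are hypothetical names, and the closure stands in for any assertion predicate rather than for the crate's own methods.

```rust
/// Runs `check` only when `at` is a valid position in `haystack`.
///
/// Hypothetical wrapper for illustration; `check` stands in for any
/// position predicate.
fn check_at<F>(haystack: &[u8], at: usize, check: F) -> Option<bool>
where
    F: FnOnce(&[u8], usize) -> bool,
{
    // `at == haystack.len()` is a legal position: the end of the haystack.
    if at > haystack.len() {
        return None;
    }
    Some(check(haystack, at))
}

/// Stand-in predicate: true only at the very end of the haystack.
fn at_end(haystack: &[u8], at: usize) -> bool {
    at == haystack.len()
}
```

Here check_at(b"abc", 3, at_end) evaluates the predicate because 3 is the end position, while check_at(b"abc", 4, at_end) returns None instead of reaching code that might index out of bounds.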
pub(crate) fn matches_inline(&self, look: Look, haystack: &[u8], at: usize) -> bool
Like matches, but forcefully inlined.
§Panics
This panics when testing a Unicode word boundary assertion while the Unicode word data is not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
Since it’s generally expected that this routine is called inside of a matching engine, callers should check the error condition when building the matching engine. If there is a Unicode word boundary in the matcher and the data isn’t available, then the matcher should fail to build.
Callers can check the error condition with LookSet::available.
This also may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn matches_set(&self, set: LookSet, haystack: &[u8], at: usize) -> bool
Returns true when all of the assertions in the given set match at the given position in the haystack.
§Panics
This panics when testing any Unicode word boundary assertion in this set while the Unicode word data is not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
Since it’s generally expected that this routine is called inside of a matching engine, callers should check the error condition when building the matching engine. If there is a Unicode word boundary in the matcher and the data isn’t available, then the matcher should fail to build.
Callers can check the error condition with LookSet::available.
This also may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
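The "all of the assertions" rule can be pictured as a conjunction over a list of predicates. The sketch below is a std-only model under that assumption; Assertion, holds_at_start, holds_at_end and all_match are hypothetical names, and a slice of function pointers stands in for LookSet.

```rust
/// A predicate over a haystack position. Hypothetical stand-in for a
/// single look-around assertion.
type Assertion = fn(&[u8], usize) -> bool;

/// Stand-in for a start-of-haystack assertion.
fn holds_at_start(_haystack: &[u8], at: usize) -> bool {
    at == 0
}

/// Stand-in for an end-of-haystack assertion.
fn holds_at_end(haystack: &[u8], at: usize) -> bool {
    at == haystack.len()
}

/// True only when every assertion in `set` holds at position `at`.
fn all_match(set: &[Assertion], haystack: &[u8], at: usize) -> bool {
    set.iter().all(|assertion| assertion(haystack, at))
}
```

In this model, a set holding both predicates is satisfied at position 0 of an empty haystack but not at position 0 of b"abc", and an empty set is satisfied everywhere because there is nothing left to fail.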
pub(crate) fn matches_set_inline(&self, set: LookSet, haystack: &[u8], at: usize) -> bool
Like matches_set, but forcefully inlined for performance.
pub(crate) fn add_to_byteset(&self, look: Look, set: &mut ByteClassSet)
Split up the given byte classes into equivalence classes in a way that is consistent with this look-around assertion.
pub fn is_start(&self, _haystack: &[u8], at: usize) -> bool
Returns true when Look::Start is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn is_start_lf(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::StartLF is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn is_end_lf(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::EndLF is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
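As a rough model of the two line-feed assertions, the predicates below take the terminator as an explicit parameter rather than reading it from a matcher. This is a sketch of the documented rule using std only; start_lf and end_lf are hypothetical names, not the crate's functions.

```rust
/// Sketch of the start-of-line rule: true at position 0, or immediately
/// after a terminator byte.
fn start_lf(haystack: &[u8], at: usize, term: u8) -> bool {
    at == 0 || haystack[at - 1] == term
}

/// Sketch of the end-of-line rule: true at the end of the haystack, or
/// immediately before a terminator byte.
fn end_lf(haystack: &[u8], at: usize, term: u8) -> bool {
    at == haystack.len() || haystack[at] == term
}
```

For b"ab\ncd" with \n as the terminator, start_lf holds at positions 0 and 3 and end_lf holds at positions 2 and 5. Like the methods above, both sketches index the haystack directly, so a position past haystack.len() panics.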
pub fn is_start_crlf(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::StartCRLF is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn is_end_crlf(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::EndCRLF is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
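The CRLF-aware assertions differ from the line-feed ones in that \r\n counts as a single terminator, so neither assertion holds between the \r and the \n. The sketch below models that rule with std only; start_crlf and end_crlf are hypothetical names, and the treatment of a lone \r or \n as a terminator is an assumption about the behaviour of Look::StartCRLF and Look::EndCRLF.

```rust
/// Sketch of the CRLF-aware start-of-line rule.
fn start_crlf(haystack: &[u8], at: usize) -> bool {
    if at == 0 {
        return true;
    }
    match haystack[at - 1] {
        b'\n' => true,
        // After a \r, but never between the \r and \n of a \r\n pair.
        b'\r' => at == haystack.len() || haystack[at] != b'\n',
        _ => false,
    }
}

/// Sketch of the CRLF-aware end-of-line rule.
fn end_crlf(haystack: &[u8], at: usize) -> bool {
    if at == haystack.len() {
        return true;
    }
    match haystack[at] {
        b'\r' => true,
        // Before a \n, but never between the \r and \n of a \r\n pair.
        b'\n' => at == 0 || haystack[at - 1] != b'\r',
        _ => false,
    }
}
```

For b"a\r\nb", the line ends at position 1 (before \r) and the next line starts at position 3 (after \n); position 2, which sits inside the \r\n pair, satisfies neither predicate.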
pub fn is_word_ascii(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::WordAscii is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn is_word_ascii_negate(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::WordAsciiNegate is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
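An ASCII word boundary can be modeled as "exactly one side of the position is a word byte", with the negated form being its complement. The following std-only sketch assumes a word byte is one of [0-9A-Za-z_]; is_word_byte, word_ascii and word_ascii_negate are hypothetical names rather than the crate's functions.

```rust
/// Assumed definition of an ASCII word byte: [0-9A-Za-z_].
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Sketch of an ASCII word boundary: the bytes on either side of `at`
/// disagree about being word bytes. The edges of the haystack count as
/// non-word.
fn word_ascii(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    before != after
}

/// Sketch of the negated boundary: both sides agree.
fn word_ascii_negate(haystack: &[u8], at: usize) -> bool {
    !word_ascii(haystack, at)
}
```

For b"ab cd", the boundary predicate holds at positions 0, 2, 3 and 5, and the negated predicate holds at positions 1 and 4. In an empty haystack only the negated form holds, since both sides of position 0 are non-word.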
pub fn is_word_unicode(&self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>
Returns true when Look::WordUnicode is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
§Errors
This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
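Because the Unicode variants return a Result instead of panicking, a caller can surface the missing-data condition once, while constructing a matcher, rather than during a search. The sketch below models that pattern with std only; WordDataUnavailable, word_unicode_check and validate_at_build are hypothetical names, and the boolean data_available stands in for whether the unicode-word-boundary feature is enabled.

```rust
/// Hypothetical stand-in for `UnicodeWordBoundaryError`.
#[derive(Debug, PartialEq)]
struct WordDataUnavailable;

/// Stand-in for a fallible check: it can only answer when the word
/// boundary data is available.
fn word_unicode_check(
    data_available: bool,
    answer: bool,
) -> Result<bool, WordDataUnavailable> {
    if data_available {
        Ok(answer)
    } else {
        Err(WordDataUnavailable)
    }
}

/// Validates once, up front, so later searches never hit the error path.
fn validate_at_build(
    needs_unicode_word: bool,
    data_available: bool,
) -> Result<(), WordDataUnavailable> {
    if needs_unicode_word {
        // Probe the fallible check once; `?` forwards the error to the
        // caller that is building the matcher.
        word_unicode_check(data_available, false)?;
    }
    Ok(())
}
```

A pattern that never uses a Unicode word boundary passes validation regardless of data_available, which mirrors the idea that only matchers containing such an assertion should fail to build.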
pub fn is_word_unicode_negate(&self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>
Returns true when Look::WordUnicodeNegate is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
§Errors
This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
pub fn is_word_start_ascii(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::WordStartAscii is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn is_word_end_ascii(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::WordEndAscii is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
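The start and end variants split a word boundary by direction: a word starts where a non-word side is followed by a word byte, and ends where a word byte is followed by a non-word side. The sketch below is a std-only model under the assumption that a word byte is one of [0-9A-Za-z_]; all three function names are hypothetical.

```rust
/// Assumed definition of an ASCII word byte: [0-9A-Za-z_].
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Sketch of an ASCII word start: non-word (or haystack start) before,
/// word byte after.
fn word_start_ascii(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    !before && after
}

/// Sketch of an ASCII word end: word byte before, non-word (or haystack
/// end) after.
fn word_end_ascii(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    before && !after
}
```

For b"ab cd", words start at positions 0 and 3 and end at positions 2 and 5. In this model, every position where either predicate holds is also an ordinary word boundary, and no position satisfies both.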
pub fn is_word_start_unicode(&self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>
Returns true when Look::WordStartUnicode is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
§Errors
This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
pub fn is_word_end_unicode(&self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>
Returns true when Look::WordEndUnicode is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
§Errors
This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
pub fn is_word_start_half_ascii(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::WordStartHalfAscii is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
pub fn is_word_end_half_ascii(&self, haystack: &[u8], at: usize) -> bool
Returns true when Look::WordEndHalfAscii is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
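The "half" variants inspect only one side of the position: the start half requires that the preceding side is not a word byte, and the end half requires that the following side is not a word byte. The sketch below models this with std only, again assuming a word byte is one of [0-9A-Za-z_]; the function names are hypothetical.

```rust
/// Assumed definition of an ASCII word byte: [0-9A-Za-z_].
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Sketch of the start half: the haystack start, or a non-word byte,
/// precedes `at`. The byte at `at` itself is not inspected.
fn word_start_half_ascii(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    !before
}

/// Sketch of the end half: the haystack end, or a non-word byte,
/// follows `at`. The byte before `at` is not inspected.
fn word_end_half_ascii(haystack: &[u8], at: usize) -> bool {
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    !after
}
```

Because each predicate looks at one side only, both can hold at a position with no word nearby: in a haystack of two spaces, position 1 satisfies the start half and the end half at once, even though it is not a word boundary.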
pub fn is_word_start_half_unicode(&self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>
Returns true when Look::WordStartHalfUnicode is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
§Errors
This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
pub fn is_word_end_half_unicode(&self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>
Returns true when Look::WordEndHalfUnicode is satisfied at the given position in haystack.
§Panics
This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
§Errors
This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.
Trait Implementations§
impl Clone for LookMatcher
fn clone(&self) -> LookMatcher
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.