Struct Input

pub struct Input<'h> {
    haystack: &'h [u8],
    span: Span,
    anchored: Anchored,
    earliest: bool,
}

The parameters for a regex search including the haystack to search.

Regex searches have a few parameters, and those parameters have defaults that work in the vast majority of cases. This Input type exists to make that common case seamless while also providing an avenue for changing the parameters of a search. In particular, this type enables doing so without a combinatorial explosion of different methods and/or superfluous parameters in the common cases.

An Input permits configuring the following things:

Search only a substring of a haystack, while taking the broader context into account for resolving look-around assertions.
Whether to search for all patterns in a regex, or to only search for one pattern in particular.
Whether to perform an anchored or unanchored search.
Whether to report a match as early as possible.

All of these parameters, except for the haystack, have sensible default values. This means that the minimal search configuration is simply a call to Input::new with your haystack. Setting any other parameter is optional.

Moreover, for any H that implements AsRef<[u8]>, there exists a From<&H> for Input implementation. This is useful because many of the search APIs in this crate accept an Into<Input>. This means you can provide strings or byte strings to these routines directly, and they’ll automatically get converted into an Input for you.
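The conversion described above can be sketched with the standard library alone. The block below is an illustration, not this crate's code: `MiniInput` and `count_byte` are hypothetical names, and only the haystack and its span are modeled. It shows how a routine that accepts an `Into` parameter can be called with a string, a byte string, or a reference to an owned `String`.

```rust
/// Hypothetical stand-in for `Input` (an assumption for illustration);
/// only the haystack and the span bounds are modeled.
#[derive(Clone, Debug)]
pub struct MiniInput<'h> {
    haystack: &'h [u8],
    start: usize,
    end: usize,
}

impl<'h> MiniInput<'h> {
    /// Only the haystack is required. The span defaults to the whole haystack.
    pub fn new<H: ?Sized + AsRef<[u8]>>(haystack: &'h H) -> MiniInput<'h> {
        let haystack = haystack.as_ref();
        MiniInput { haystack, start: 0, end: haystack.len() }
    }
}

/// Mirrors the shape of the `From` impl listed on this page.
impl<'h, H: ?Sized + AsRef<[u8]>> From<&'h H> for MiniInput<'h> {
    fn from(haystack: &'h H) -> MiniInput<'h> {
        MiniInput::new(haystack)
    }
}

/// Hypothetical search routine: counts occurrences of `needle` in the span.
/// Accepting `Into<MiniInput>` lets callers pass a haystack directly.
pub fn count_byte<'h, I: Into<MiniInput<'h>>>(input: I, needle: u8) -> usize {
    let input = input.into();
    input.haystack[input.start..input.end]
        .iter()
        .filter(|&&b| b == needle)
        .count()
}
```

With this sketch, `count_byte("banana", b'a')` and `count_byte(b"banana", b'n')` both compile without the caller constructing a `MiniInput` by hand.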

The lifetime parameter 'h refers to the lifetime of the haystack.

§Organization

The API of Input is split into a few different parts:

A builder-like API that transforms an Input by value. Examples: Input::span and Input::anchored.
A setter API that permits mutating parameters in place. Examples: Input::set_span and Input::set_anchored.
A getter API that permits retrieving any of the search parameters. Examples: Input::get_span and Input::get_anchored.
A few convenience getter routines that don’t conform to the above naming pattern due to how common they are. Examples: Input::haystack, Input::start and Input::end.
Miscellaneous predicates and other helper routines that are useful in some contexts. Examples: Input::is_char_boundary.

An Input exposes so much because it is meant to be used by both callers of regex engines and implementors of regex engines. A constraining factor is that regex engines should accept a &Input as their lowest level API, which means that implementors should only use the “getter” APIs of an Input.
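The split between builder, setter and getter methods can be sketched as follows. This is a simplified model using only the standard library: `ToyInput`, `Mode` and `describe` are hypothetical names chosen for illustration, and the real type carries more parameters than the two shown here.

```rust
/// Hypothetical stand-in for the anchored mode (an assumption for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    No,
    Yes,
}

/// Hypothetical stand-in for `Input` with two configurable parameters.
#[derive(Clone, Debug)]
pub struct ToyInput<'h> {
    haystack: &'h [u8],
    anchored: Mode,
    earliest: bool,
}

impl<'h> ToyInput<'h> {
    pub fn new(haystack: &'h [u8]) -> ToyInput<'h> {
        ToyInput { haystack, anchored: Mode::No, earliest: false }
    }

    // Builder-like API: consumes the value and returns the updated value.
    pub fn anchored(mut self, mode: Mode) -> ToyInput<'h> {
        self.set_anchored(mode);
        self
    }
    pub fn earliest(mut self, yes: bool) -> ToyInput<'h> {
        self.set_earliest(yes);
        self
    }

    // Setter API: mutates a parameter in place.
    pub fn set_anchored(&mut self, mode: Mode) {
        self.anchored = mode;
    }
    pub fn set_earliest(&mut self, yes: bool) {
        self.earliest = yes;
    }

    // Getter API: the only methods usable through a shared reference.
    pub fn get_anchored(&self) -> Mode {
        self.anchored
    }
    pub fn get_earliest(&self) -> bool {
        self.earliest
    }
    pub fn haystack(&self) -> &'h [u8] {
        self.haystack
    }
}

/// Hypothetical engine entry point. It receives `&ToyInput`, so it can
/// only read parameters through the getters.
pub fn describe(input: &ToyInput<'_>) -> String {
    format!(
        "{} bytes, anchored={:?}, earliest={}",
        input.haystack().len(),
        input.get_anchored(),
        input.get_earliest()
    )
}
```

A caller might chain the builder methods once, as in `ToyInput::new(b"abc").anchored(Mode::Yes)`, and later adjust the same value in a loop with `set_earliest`, while the engine side only ever reads.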

§Valid bounds and search termination

An Input permits setting the bounds of a search via either Input::span or Input::range. The bounds set must be valid, or else a panic will occur. Bounds are valid if and only if:

The bounds represent a valid range into the input’s haystack.
or the end bound is a valid ending bound for the haystack and the start bound is exactly one greater than the end bound.

In the latter case, Input::is_done will return true, which indicates that any search receiving such an input should immediately return with no match.

Note that while Input is used for reverse searches in this crate, the Input::is_done predicate assumes a forward search. Because unsigned offsets are used internally, there is no way to tell from only the offsets whether a reverse search is done or not.
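The bounds rule and the forward-search termination check can be written out as plain functions over offsets. This is a sketch under the assumption that "done" means the start bound exceeds the end bound, as the text above implies. The function names are hypothetical and the code does not use this crate.

```rust
/// Returns true if `start..end` satisfies the rule above for a haystack of
/// `len` bytes: either an ordinary range into the haystack, or a valid end
/// bound with `start` exactly one greater than `end`.
pub fn bounds_are_valid(len: usize, start: usize, end: usize) -> bool {
    // When `start > end`, `start` is at least 1, so `start - 1` cannot underflow.
    end <= len && (start <= end || start - 1 == end)
}

/// Forward-search termination check: the search is done once `start`
/// has moved past `end`.
pub fn is_done(start: usize, end: usize) -> bool {
    start > end
}

/// Collects every offset visited by a forward scan that advances one byte
/// at a time over a haystack of `len` bytes.
pub fn visit_offsets(len: usize) -> Vec<usize> {
    let (mut start, end) = (0, len);
    let mut seen = Vec::new();
    while !is_done(start, end) {
        seen.push(start);
        // Stepping past `end` by one is still a valid bound; it is the
        // state that makes `is_done` return true.
        start += 1;
        assert!(bounds_are_valid(len, start, end));
    }
    seen
}
```

The loop in `visit_offsets` shows why the second case is permitted: a scan that can match the empty string at the very end of the haystack needs a way to advance one position further without tripping a bounds panic.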

§Regex engine support

Any regex engine accepting an Input must support at least the following things:

Searching a &[u8] for matches.
Searching a substring of &[u8] for a match, such that any match reported must appear entirely within that substring.
For a forwards search, a match should never be reported when Input::is_done returns true. (For reverse searches, termination should be handled outside of Input.)

Supporting other aspects of an Input is optional, but regex engines should handle aspects they don’t support gracefully. How this is done is generally up to the regex engine. For example, this crate generally treats an unsupported anchored mode as an error to report, but for simplicity, in the meta regex engine, trying to search with an invalid pattern ID just results in no match being reported.
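As a rough illustration of the minimum requirements listed above, here is a toy engine that searches for a literal byte string. `SpanInput` and `find_literal` are hypothetical names and the code relies only on the standard library. It is a sketch of the contract, not how any engine in this crate is implemented.

```rust
/// Hypothetical stand-in for `Input`: a haystack plus span bounds.
/// The bounds are assumed to have been validated when they were set.
pub struct SpanInput<'h> {
    pub haystack: &'h [u8],
    pub start: usize,
    pub end: usize,
}

/// Toy forward search for the leftmost occurrence of `needle`.
/// Returns the match as `(start, end)` offsets into the full haystack.
pub fn find_literal(input: &SpanInput<'_>, needle: &[u8]) -> Option<(usize, usize)> {
    // Requirement 3: never report a match for a "done" input.
    if input.start > input.end {
        return None;
    }
    // Requirement 2: only look inside the span, so any reported match
    // lies entirely within it.
    let window = &input.haystack[input.start..input.end];
    if needle.is_empty() {
        // The empty needle matches at the start of the span.
        return Some((input.start, input.start));
    }
    window
        .windows(needle.len())
        .position(|w| w == needle)
        // Translate the offset within the window back to the haystack.
        .map(|i| (input.start + i, input.start + i + needle.len()))
}
```

Note that the reported offsets are relative to the whole haystack, not to the span. Narrowing the span to `1..10` of `"foo bar foo"` yields no match, because neither occurrence of `foo` fits entirely inside it.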

Fields§

§haystack: &'h [u8]

§span: Span

§anchored: Anchored

§earliest: bool

Implementations§

impl<'h> Input<'h>

pub fn new<H: ?Sized + AsRef<[u8]>>(haystack: &'h H) -> Input<'h>

pub fn span<S: Into<Span>>(self, span: S) -> Input<'h>

§Panics

§Example

pub fn range<R: RangeBounds<usize>>(self, range: R) -> Input<'h>

§Panics

§Example

pub fn anchored(self, mode: Anchored) -> Input<'h>

§Example

pub fn earliest(self, yes: bool) -> Input<'h>

§Example

pub fn set_span<S: Into<Span>>(&mut self, span: S)

§Panics

§Example

pub fn set_range<R: RangeBounds<usize>>(&mut self, range: R)

§Panics

§Example

pub fn set_start(&mut self, start: usize)

§Panics

§Example

pub fn set_end(&mut self, end: usize)

§Panics

§Example

pub fn set_anchored(&mut self, mode: Anchored)

§Example

pub fn set_earliest(&mut self, yes: bool)

§Example

pub fn haystack(&self) -> &'h [u8]

§Example

pub fn start(&self) -> usize

§Example

pub fn end(&self) -> usize

§Example

pub fn get_span(&self) -> Span

§Example

pub fn get_range(&self) -> Range<usize>

§Example

pub fn get_anchored(&self) -> Anchored

§Example

pub fn get_earliest(&self) -> bool

§Example

pub fn is_done(&self) -> bool

§Example

pub fn is_char_boundary(&self, offset: usize) -> bool

§Example

Trait Implementations§

impl<'h> Clone for Input<'h>

fn clone(&self) -> Input<'h>

fn clone_from(&mut self, source: &Self)

impl<'h> Debug for Input<'h>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl<'h, H: ?Sized + AsRef<[u8]>> From<&'h H> for Input<'h>

fn from(haystack: &'h H) -> Input<'h>

Auto Trait Implementations§

impl<'h> Freeze for Input<'h>

impl<'h> RefUnwindSafe for Input<'h>

impl<'h> Send for Input<'h>

impl<'h> Sync for Input<'h>

impl<'h> Unpin for Input<'h>

impl<'h> UnsafeUnpin for Input<'h>

impl<'h> UnwindSafe for Input<'h>

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

impl<T> ToOwned for T
where T: Clone,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,