Struct regex_syntax::utf8::Utf8Sequences
source · pub struct Utf8Sequences {
range_stack: Vec<ScalarRange>,
}
Expand description
An iterator over ranges of matching UTF-8 byte sequences.
The iteration represents an alternation of comprehensive byte sequences that match precisely the set of UTF-8 encoded scalar values.
A byte sequence corresponds to one of the scalar values in the range given if and only if it completely matches exactly one of the sequences of byte ranges produced by this iterator.
Each sequence of byte ranges matches a unique set of bytes. That is, no two sequences will match the same bytes.
§Example
This shows how to match an arbitrary byte sequence against a range of scalar values.
use regex_syntax::utf8::{Utf8Sequences, Utf8Sequence};
fn matches(seqs: &[Utf8Sequence], bytes: &[u8]) -> bool {
for range in seqs {
if range.matches(bytes) {
return true;
}
}
false
}
// Test the basic multilingual plane.
let seqs: Vec<_> = Utf8Sequences::new('\u{0}', '\u{FFFF}').collect();
// UTF-8 encoding of 'a'.
assert!(matches(&seqs, &[0x61]));
// UTF-8 encoding of '☃' (`\u{2603}`).
assert!(matches(&seqs, &[0xE2, 0x98, 0x83]));
// UTF-8 encoding of `\u{10348}` (outside the BMP).
assert!(!matches(&seqs, &[0xF0, 0x90, 0x8D, 0x88]));
// Tries to match against a UTF-8 encoding of a surrogate codepoint,
// which is invalid UTF-8, and therefore fails, despite the fact that
// the corresponding codepoint (0xD800) falls in the range given.
assert!(!matches(&seqs, &[0xED, 0xA0, 0x80]));
// And fails against plain old invalid UTF-8.
assert!(!matches(&seqs, &[0xFF, 0xFF]));
If this example seems circuitous, that’s because it is! It’s meant to be illustrative. In practice, you could just try to decode your byte sequence and compare it with the scalar value range directly. However, this is not always possible (for example, in a byte based automaton).
Fields§
§range_stack: Vec<ScalarRange>
Implementations§
Trait Implementations§
source§impl Debug for Utf8Sequences
impl Debug for Utf8Sequences
source§impl Iterator for Utf8Sequences
impl Iterator for Utf8Sequences
§type Item = Utf8Sequence
type Item = Utf8Sequence
source§fn next(&mut self) -> Option<Self::Item>
fn next(&mut self) -> Option<Self::Item>
source§fn next_chunk<const N: usize>(
&mut self,
) -> Result<[Self::Item; N], IntoIter<Self::Item, N>>where
Self: Sized,
fn next_chunk<const N: usize>(
&mut self,
) -> Result<[Self::Item; N], IntoIter<Self::Item, N>>where
Self: Sized,
iter_next_chunk
)N
values. Read more1.0.0 · source§fn size_hint(&self) -> (usize, Option<usize>)
fn size_hint(&self) -> (usize, Option<usize>)
1.0.0 · source§fn count(self) -> usizewhere
Self: Sized,
fn count(self) -> usizewhere
Self: Sized,
1.0.0 · source§fn last(self) -> Option<Self::Item>where
Self: Sized,
fn last(self) -> Option<Self::Item>where
Self: Sized,
source§fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>
fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>
iter_advance_by
)n
elements. Read more1.0.0 · source§fn nth(&mut self, n: usize) -> Option<Self::Item>
fn nth(&mut self, n: usize) -> Option<Self::Item>
n
th element of the iterator. Read more1.28.0 · source§fn step_by(self, step: usize) -> StepBy<Self>where
Self: Sized,
fn step_by(self, step: usize) -> StepBy<Self>where
Self: Sized,
1.0.0 · source§fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
1.0.0 · source§fn zip<U>(self, other: U) -> Zip<Self, <U as IntoIterator>::IntoIter>where
Self: Sized,
U: IntoIterator,
fn zip<U>(self, other: U) -> Zip<Self, <U as IntoIterator>::IntoIter>where
Self: Sized,
U: IntoIterator,
source§fn intersperse(self, separator: Self::Item) -> Intersperse<Self>
fn intersperse(self, separator: Self::Item) -> Intersperse<Self>
iter_intersperse
)separator
between adjacent
items of the original iterator. Read moresource§fn intersperse_with<G>(self, separator: G) -> IntersperseWith<Self, G>
fn intersperse_with<G>(self, separator: G) -> IntersperseWith<Self, G>
iter_intersperse
)separator
between adjacent items of the original iterator. Read more1.0.0 · source§fn map<B, F>(self, f: F) -> Map<Self, F>
fn map<B, F>(self, f: F) -> Map<Self, F>
1.0.0 · source§fn filter<P>(self, predicate: P) -> Filter<Self, P>
fn filter<P>(self, predicate: P) -> Filter<Self, P>
1.0.0 · source§fn filter_map<B, F>(self, f: F) -> FilterMap<Self, F>
fn filter_map<B, F>(self, f: F) -> FilterMap<Self, F>
1.0.0 · source§fn enumerate(self) -> Enumerate<Self>where
Self: Sized,
fn enumerate(self) -> Enumerate<Self>where
Self: Sized,
1.0.0 · source§fn skip_while<P>(self, predicate: P) -> SkipWhile<Self, P>
fn skip_while<P>(self, predicate: P) -> SkipWhile<Self, P>
1.0.0 · source§fn take_while<P>(self, predicate: P) -> TakeWhile<Self, P>
fn take_while<P>(self, predicate: P) -> TakeWhile<Self, P>
1.57.0 · source§fn map_while<B, P>(self, predicate: P) -> MapWhile<Self, P>
fn map_while<B, P>(self, predicate: P) -> MapWhile<Self, P>
1.0.0 · source§fn skip(self, n: usize) -> Skip<Self>where
Self: Sized,
fn skip(self, n: usize) -> Skip<Self>where
Self: Sized,
n
elements. Read more1.0.0 · source§fn take(self, n: usize) -> Take<Self>where
Self: Sized,
fn take(self, n: usize) -> Take<Self>where
Self: Sized,
n
elements, or fewer
if the underlying iterator ends sooner. Read more1.0.0 · source§fn flat_map<U, F>(self, f: F) -> FlatMap<Self, U, F>
fn flat_map<U, F>(self, f: F) -> FlatMap<Self, U, F>
source§fn map_windows<F, R, const N: usize>(self, f: F) -> MapWindows<Self, F, N>
fn map_windows<F, R, const N: usize>(self, f: F) -> MapWindows<Self, F, N>
iter_map_windows
)f
for each contiguous window of size N
over
self
and returns an iterator over the outputs of f
. Like slice::windows()
,
the windows during mapping overlap as well. Read more1.0.0 · source§fn inspect<F>(self, f: F) -> Inspect<Self, F>
fn inspect<F>(self, f: F) -> Inspect<Self, F>
1.0.0 · source§fn by_ref(&mut self) -> &mut Selfwhere
Self: Sized,
fn by_ref(&mut self) -> &mut Selfwhere
Self: Sized,
source§fn collect_into<E>(self, collection: &mut E) -> &mut E
fn collect_into<E>(self, collection: &mut E) -> &mut E
iter_collect_into
)1.0.0 · source§fn partition<B, F>(self, f: F) -> (B, B)
fn partition<B, F>(self, f: F) -> (B, B)
source§fn is_partitioned<P>(self, predicate: P) -> bool
fn is_partitioned<P>(self, predicate: P) -> bool
iter_is_partitioned
)true
precede all those that return false
. Read more1.27.0 · source§fn try_fold<B, F, R>(&mut self, init: B, f: F) -> R
fn try_fold<B, F, R>(&mut self, init: B, f: F) -> R
1.27.0 · source§fn try_for_each<F, R>(&mut self, f: F) -> R
fn try_for_each<F, R>(&mut self, f: F) -> R
1.0.0 · source§fn fold<B, F>(self, init: B, f: F) -> B
fn fold<B, F>(self, init: B, f: F) -> B
1.51.0 · source§fn reduce<F>(self, f: F) -> Option<Self::Item>
fn reduce<F>(self, f: F) -> Option<Self::Item>
source§fn try_reduce<F, R>(
&mut self,
f: F,
) -> <<R as Try>::Residual as Residual<Option<<R as Try>::Output>>>::TryType
fn try_reduce<F, R>( &mut self, f: F, ) -> <<R as Try>::Residual as Residual<Option<<R as Try>::Output>>>::TryType
iterator_try_reduce
)1.0.0 · source§fn all<F>(&mut self, f: F) -> bool
fn all<F>(&mut self, f: F) -> bool
1.0.0 · source§fn any<F>(&mut self, f: F) -> bool
fn any<F>(&mut self, f: F) -> bool
1.0.0 · source§fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
1.30.0 · source§fn find_map<B, F>(&mut self, f: F) -> Option<B>
fn find_map<B, F>(&mut self, f: F) -> Option<B>
source§fn try_find<F, R>(
&mut self,
f: F,
) -> <<R as Try>::Residual as Residual<Option<Self::Item>>>::TryType
fn try_find<F, R>( &mut self, f: F, ) -> <<R as Try>::Residual as Residual<Option<Self::Item>>>::TryType
try_find
)1.0.0 · source§fn position<P>(&mut self, predicate: P) -> Option<usize>
fn position<P>(&mut self, predicate: P) -> Option<usize>
1.0.0 · source§fn max(self) -> Option<Self::Item>
fn max(self) -> Option<Self::Item>
1.0.0 · source§fn min(self) -> Option<Self::Item>
fn min(self) -> Option<Self::Item>
1.6.0 · source§fn max_by_key<B, F>(self, f: F) -> Option<Self::Item>
fn max_by_key<B, F>(self, f: F) -> Option<Self::Item>
1.15.0 · source§fn max_by<F>(self, compare: F) -> Option<Self::Item>
fn max_by<F>(self, compare: F) -> Option<Self::Item>
1.6.0 · source§fn min_by_key<B, F>(self, f: F) -> Option<Self::Item>
fn min_by_key<B, F>(self, f: F) -> Option<Self::Item>
1.15.0 · source§fn min_by<F>(self, compare: F) -> Option<Self::Item>
fn min_by<F>(self, compare: F) -> Option<Self::Item>
1.0.0 · source§fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB)
fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB)
1.36.0 · source§fn copied<'a, T>(self) -> Copied<Self>
fn copied<'a, T>(self) -> Copied<Self>
source§fn array_chunks<const N: usize>(self) -> ArrayChunks<Self, N>where
Self: Sized,
fn array_chunks<const N: usize>(self) -> ArrayChunks<Self, N>where
Self: Sized,
iter_array_chunks
)N
elements of the iterator at a time. Read more1.11.0 · source§fn product<P>(self) -> P
fn product<P>(self) -> P
source§fn cmp_by<I, F>(self, other: I, cmp: F) -> Ordering
fn cmp_by<I, F>(self, other: I, cmp: F) -> Ordering
iter_order_by
)Iterator
with those
of another with respect to the specified comparison function. Read more1.5.0 · source§fn partial_cmp<I>(self, other: I) -> Option<Ordering>
fn partial_cmp<I>(self, other: I) -> Option<Ordering>
PartialOrd
elements of
this Iterator
with those of another. The comparison works like short-circuit
evaluation, returning a result without comparing the remaining elements.
As soon as an order can be determined, the evaluation stops and a result is returned. Read moresource§fn partial_cmp_by<I, F>(self, other: I, partial_cmp: F) -> Option<Ordering>where
Self: Sized,
I: IntoIterator,
F: FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>,
fn partial_cmp_by<I, F>(self, other: I, partial_cmp: F) -> Option<Ordering>where
Self: Sized,
I: IntoIterator,
F: FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>,
iter_order_by
)Iterator
with those
of another with respect to the specified comparison function. Read moresource§fn eq_by<I, F>(self, other: I, eq: F) -> bool
fn eq_by<I, F>(self, other: I, eq: F) -> bool
iter_order_by
)1.5.0 · source§fn lt<I>(self, other: I) -> bool
fn lt<I>(self, other: I) -> bool
Iterator
are lexicographically
less than those of another. Read more1.5.0 · source§fn le<I>(self, other: I) -> bool
fn le<I>(self, other: I) -> bool
Iterator
are lexicographically
less or equal to those of another. Read more1.5.0 · source§fn gt<I>(self, other: I) -> bool
fn gt<I>(self, other: I) -> bool
Iterator
are lexicographically
greater than those of another. Read more1.5.0 · source§fn ge<I>(self, other: I) -> bool
fn ge<I>(self, other: I) -> bool
Iterator
are lexicographically
greater than or equal to those of another. Read moresource§fn is_sorted(self) -> bool
fn is_sorted(self) -> bool
is_sorted
)source§fn is_sorted_by<F>(self, compare: F) -> bool
fn is_sorted_by<F>(self, compare: F) -> bool
is_sorted
)source§fn is_sorted_by_key<F, K>(self, f: F) -> bool
fn is_sorted_by_key<F, K>(self, f: F) -> bool
is_sorted
)