Struct icu_collator::elements::CollationElements

pub(crate) struct CollationElements<'data, I>
where I: Iterator<Item = char>,
{
    iter: I,
    pending: SmallVec<[CollationElement; 6]>,
    pending_pos: usize,
    prefix: [char; 2],
    upcoming: SmallVec<[CharacterAndClassAndTrieValue; 10]>,
    root: &'data CollationDataV1<'data>,
    tailoring: &'data CollationDataV1<'data>,
    jamo: &'data [<u32 as AsULE>::ULE; 256],
    diacritics: &'data ZeroSlice<u16>,
    trie: &'data CodePointTrie<'data, u32>,
    scalars16: &'data ZeroSlice<u16>,
    scalars32: &'data ZeroSlice<char>,
    numeric_primary: Option<u8>,
    lithuanian_dot_above: bool,
    iter_exhausted: bool,
}

Adapter that transforms an iterator over char into a sequence of CollationElement values, applying a tailoring. Not a real Rust iterator: instead of returning None, next() returns NO_CE to indicate the end of iteration, which optimizes comparison.
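The sentinel convention described above can be sketched as follows. This is a minimal illustration, not the crate's code: `Ce`, the value chosen for `NO_CE`, `SentinelElements`, and `streams_equal` are all hypothetical stand-ins introduced here. It shows why a sentinel can simplify a comparison loop: both sides are compared as plain values, with no `Option` unwrapping per step.

```rust
/// Hypothetical stand-in for a collation element; not the crate's real type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Ce(pub u64);

/// Hypothetical sentinel that plays the role of `NO_CE` (value is an assumption).
pub const NO_CE: Ce = Ce(u64::MAX);

/// Wraps any iterator and yields the sentinel once the input runs out.
pub struct SentinelElements<I: Iterator<Item = Ce>> {
    inner: I,
}

impl<I: Iterator<Item = Ce>> SentinelElements<I> {
    pub fn new(inner: I) -> Self {
        Self { inner }
    }

    /// Returns the next element, or `NO_CE` at the end (never `None`).
    pub fn next(&mut self) -> Ce {
        self.inner.next().unwrap_or(NO_CE)
    }
}

/// Compares two streams element by element. Reaching the sentinel on
/// both sides at the same step means the streams are equal.
pub fn streams_equal<A, B>(
    mut left: SentinelElements<A>,
    mut right: SentinelElements<B>,
) -> bool
where
    A: Iterator<Item = Ce>,
    B: Iterator<Item = Ce>,
{
    loop {
        let (l, r) = (left.next(), right.next());
        if l != r {
            return false;
        }
        if l == NO_CE {
            return true;
        }
    }
}
```

In this sketch a shorter stream compares unequal to a longer one because the sentinel differs from every real element.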

Fields§

§iter: I

§pending: SmallVec<[CollationElement; 6]>

Already computed but not yet returned CollationElements.

§pending_pos: usize

The index of the next item to be returned from pending. The purpose of this index is to avoid moving the rest of the items.
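The read-cursor idea can be sketched like this. The `Pending` type below is hypothetical and uses a plain `Vec` rather than `SmallVec`; the reset-on-drain step is also an assumption made for the sketch, not something the documentation states.

```rust
/// Hypothetical buffer of already computed items with a read cursor.
pub struct Pending<T> {
    items: Vec<T>,
    pos: usize,
}

impl<T: Copy> Pending<T> {
    pub fn new() -> Self {
        Self { items: Vec::new(), pos: 0 }
    }

    /// Queues an item behind the ones that are still waiting.
    pub fn push(&mut self, item: T) {
        self.items.push(item);
    }

    /// Returns the next queued item by advancing the cursor instead of
    /// removing from the front, which would shift every later item.
    pub fn pop_front(&mut self) -> Option<T> {
        let item = self.items.get(self.pos).copied()?;
        self.pos += 1;
        if self.pos == self.items.len() {
            // Everything was handed out: reset cheaply for reuse.
            self.items.clear();
            self.pos = 0;
        }
        Some(item)
    }
}
```

Advancing an index costs a constant amount of work per item, whereas removing from the front of a contiguous buffer moves all remaining items each time.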

§prefix: [char; 2]

The most recently seen characters (or never-matching placeholders).

CLDR, as of version 40, has two kinds of prefixes: prefixes that contain a single starter, and prefixes that contain a starter followed by either U+3099 or U+309A.

The last-pushed character is at index 0 and the previously pushed character is at index 1.
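The index convention for the two-character prefix window can be sketched as below. The `Prefix` type and the `UNMATCHABLE` placeholder value are hypothetical; which placeholder the crate actually uses is not stated here.

```rust
/// Hypothetical placeholder that never matches a real prefix character.
/// U+FFFF is a noncharacter; the crate's actual placeholder is not shown here.
pub const UNMATCHABLE: char = '\u{FFFF}';

/// Hypothetical two-character window over the most recently seen characters.
pub struct Prefix {
    chars: [char; 2],
}

impl Prefix {
    pub fn new() -> Self {
        Self { chars: [UNMATCHABLE; 2] }
    }

    /// Pushes a character: the old index 0 moves to index 1 and the
    /// new character takes index 0.
    pub fn push(&mut self, c: char) {
        self.chars[1] = self.chars[0];
        self.chars[0] = c;
    }

    /// The last-pushed character (index 0).
    pub fn last(&self) -> char {
        self.chars[0]
    }

    /// The character pushed before the last one (index 1).
    pub fn previous(&self) -> char {
        self.chars[1]
    }
}
```

Two slots are enough for the longest prefix shape mentioned above: a starter followed by U+3099 or U+309A.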

§upcoming: SmallVec<[CharacterAndClassAndTrieValue; 10]>

upcoming holds the characters that have already been read from iter but haven’t yet been mapped to CollationElements.

Typically, upcoming holds one character and corresponds semantically to pending_unnormalized_starter in icu::normalizer::Decomposition. This is why there isn’t a move avoidance optimization similar to pending_pos above for this buffer. A complex decomposition, a Hangul syllable followed by a non-starter, or lookahead can cause upcoming to hold more than one char.

Invariant: upcoming is allowed to become empty only after iter has been exhausted.

Invariant (checked by debug_assert!): at the start of each next() call, unless upcoming is empty (which is possible only once iter has been exhausted), the first char in upcoming must have a decomposition that starts with a starter.

§root: &'data CollationDataV1<'data>

The root collation data.

§tailoring: &'data CollationDataV1<'data>

Tailoring if applicable.

§jamo: &'data [<u32 as AsULE>::ULE; 256]

The CollationElement32 mapping for the Hangul Jamo block.

Note: in ICU4C the jamo table contains only modern jamo. Here, the jamo table contains the whole Unicode block.
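Because the table has 256 entries and the Hangul Jamo block spans U+1100 through U+11FF, a table covering the whole block can be indexed by the offset from the block start. The sketch below illustrates that arithmetic only; `jamo_index` and the constants are hypothetical names, and how the crate performs the lookup is not shown here.

```rust
/// First code point of the Unicode Hangul Jamo block.
const HANGUL_JAMO_START: u32 = 0x1100;
/// Number of code points in the block (U+1100..=U+11FF).
const HANGUL_JAMO_LEN: u32 = 256;

/// Hypothetical helper: maps a character to its index in a 256-entry
/// table covering the whole block, or `None` if it lies outside the block.
pub fn jamo_index(c: char) -> Option<usize> {
    let offset = u32::from(c).checked_sub(HANGUL_JAMO_START)?;
    if offset < HANGUL_JAMO_LEN {
        Some(offset as usize)
    } else {
        None
    }
}
```

Covering the whole block means the index computation needs only a range check and a subtraction, with no separate handling for archaic jamo.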

§diacritics: &'data ZeroSlice<u16>

The CollationElement32 mapping for the Combining Diacritical Marks block.

§trie: &'data CodePointTrie<'data, u32>

NFD main trie.

§scalars16: &'data ZeroSlice<u16>

NFD complex decompositions on the BMP.

§scalars32: &'data ZeroSlice<char>

NFD complex decompositions on supplementary planes.

§numeric_primary: Option<u8>

If numeric mode is enabled, the 8 high bits of the numeric primary. None if disabled.

§lithuanian_dot_above: bool

Whether the Lithuanian combining dot above handling is enabled.

§iter_exhausted: bool

Whether iter has been exhausted.

Implementations§

impl<'data, I> CollationElements<'data, I>
where I: Iterator<Item = char>,

pub fn new(
    delegate: I,
    root: &'data CollationDataV1<'_>,
    tailoring: &'data CollationDataV1<'_>,
    jamo: &'data [<u32 as AsULE>::ULE; 256],
    diacritics: &'data ZeroSlice<u16>,
    decompositions: &'data DecompositionDataV1<'_>,
    tables: &'data DecompositionTablesV1<'_>,
    numeric_primary: Option<u8>,
    lithuanian_dot_above: bool,
) -> Self

fn iter_next(&mut self) -> Option<CharacterAndClassAndTrieValue>

fn next_internal(&mut self) -> Option<CharacterAndClassAndTrieValue>

fn maybe_gather_combining(&mut self)

fn push_decomposed_combining(
    &mut self,
    c: CharacterAndClassAndTrieValue,
) -> usize

fn push_decomposed_and_gather_combining(
    &mut self,
    c: CharacterAndClassAndTrieValue,
)

fn look_ahead(&mut self, pos: usize) -> Option<CharacterAndClassAndTrieValue>

fn is_next_decomposition_starts_with_starter(&self) -> bool

fn prepend_and_sort_non_starter_prefix_of_suffix(
    &mut self,
    c: CharacterAndClassAndTrieValue,
)

fn prefix_push(&mut self, c: char)

fn mark_prefix_unmatchable(&mut self)

Micro-optimization: performs a simpler write when we know that the most recent character was a non-starter that is not a kana voicing mark.

pub fn next(&mut self) -> CollationElement

fn collect_combining(
    &mut self,
    combining_characters: &mut SmallVec<[CharacterAndClass; 7]>,
)

Auto Trait Implementations§

impl<'data, I> Freeze for CollationElements<'data, I>
where I: Freeze,

impl<'data, I> RefUnwindSafe for CollationElements<'data, I>
where I: RefUnwindSafe,

impl<'data, I> Send for CollationElements<'data, I>
where I: Send,

impl<'data, I> Sync for CollationElements<'data, I>
where I: Sync,

impl<'data, I> Unpin for CollationElements<'data, I>
where I: Unpin,

impl<'data, I> UnwindSafe for CollationElements<'data, I>
where I: UnwindSafe,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> MaybeSendSync for T