Enum icu_collator::elements::Tag

#[repr(u8)]
pub(crate) enum Tag {
    Fallback = 0,
    LongPrimary = 1,
    LongSecondary = 2,
    Reserved3 = 3,
    LatinExpansion = 4,
    Expansion32 = 5,
    Expansion = 6,
    BuilderData = 7,
    Prefix = 8,
    Contraction = 9,
    Digit = 10,
    U0000 = 11,
    Hangul = 12,
    LeadSurrogate = 13,
    Offset = 14,
    Implicit = 15,
}
Special-CE32 tags, from bits 3..0 of a special 32-bit CE. Bits 31..8 are available for tag-specific data. Bits 5..4: Reserved. May be used in the future to indicate lccc!=0 and tccc!=0.
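The bit layout described above can be sketched as two small helpers. The names `tag_bits` and `tag_data` are hypothetical and not part of `icu_collator`; the sketch also assumes the caller has already established that the CE32 is a special one.

```rust
/// Hypothetical helper: the tag value stored in bits 3..0 of a special CE32.
fn tag_bits(ce32: u32) -> u8 {
    (ce32 & 0xF) as u8
}

/// Hypothetical helper: the tag-specific data stored in bits 31..8,
/// shifted down so that bit 8 becomes bit 0.
fn tag_data(ce32: u32) -> u32 {
    ce32 >> 8
}
```

For example, a CE32 with all bits set yields tag value 15, which corresponds to `Implicit`.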

Variants

Fallback = 0

Fall back to the base collator. This is the tag value in SPECIAL_CE32_LOW_BYTE and FALLBACK_CE32. Bits 31..8: Unused, 0.

LongPrimary = 1

Long-primary CE with COMMON_SEC_AND_TER_CE. Bits 31..8: Three-byte primary.
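As a rough sketch, a long-primary CE32 could be expanded into a 64-bit CE as follows. Two things are assumptions here: the 64-bit CE keeps the primary weight in bits 63..32, and `COMMON_SEC_AND_TER_CE` has the value `0x05000500` (common secondary 05 and common tertiary 05, as in ICU4C). The function name is hypothetical.

```rust
/// Assumed value: common secondary weight 0x0500 in bits 31..16 and
/// common tertiary weight 0x0500 in bits 15..0 of a 64-bit CE.
const COMMON_SEC_AND_TER_CE: u64 = 0x0500_0500;

/// Hypothetical helper: build a 64-bit CE from a long-primary CE32.
/// Bits 31..8 of the CE32 hold the three-byte primary; the tag byte is masked off.
fn ce_from_long_primary(ce32: u32) -> u64 {
    (u64::from(ce32 & 0xFFFF_FF00) << 32) | COMMON_SEC_AND_TER_CE
}
```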

LongSecondary = 2

Long-secondary CE with zero primary. Bits 31..16: Secondary weight. Bits 15.. 8: Tertiary weight.
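The two fields of a long-secondary CE32 can be pulled apart with shifts. This is a minimal sketch with a hypothetical function name, not the crate's own code.

```rust
/// Hypothetical helper: split a long-secondary CE32 into
/// (secondary weight from bits 31..16, tertiary weight from bits 15..8).
fn long_secondary_weights(ce32: u32) -> (u16, u8) {
    let secondary = (ce32 >> 16) as u16;
    // The cast to u8 keeps only the low eight bits after the shift.
    let tertiary = (ce32 >> 8) as u8;
    (secondary, tertiary)
}
```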

Reserved3 = 3

Unused. May be used in the future for single-byte secondary CEs (SHORT_SECONDARY_TAG), storing the secondary in bits 31..24, the ccc in bits 23..16, and the tertiary in bits 15..8.

LatinExpansion = 4

Latin mini expansions of two simple CEs [pp, 05, tt] [00, ss, 05]. Bits 31..24: Single-byte primary weight pp of the first CE. Bits 23..16: Tertiary weight tt of the first CE. Bits 15.. 8: Secondary weight ss of the second CE. Unused by ICU4X; may get repurposed for jamo expansions in Korean search.

Expansion32 = 5

Points to one or more simple/long-primary/long-secondary 32-bit CE32s. Bits 31..13: Index into uint32_t table. Bits 12.. 8: Length=1..31.

Expansion = 6

Points to one or more 64-bit CEs. Bits 31..13: Index into CE table. Bits 12.. 8: Length=1..31.
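`Expansion32` and `Expansion` share the same index/length layout and differ only in the table they point into. The sketch below decodes both fields and takes the matching slice from a caller-supplied table; the function names and the table parameter are hypothetical.

```rust
/// Hypothetical helper: (index from bits 31..13, length from bits 12..8).
fn expansion_index_and_length(ce32: u32) -> (usize, usize) {
    let index = (ce32 >> 13) as usize;
    let length = ((ce32 >> 8) & 0x1F) as usize;
    (index, length)
}

/// Hypothetical helper: the run of table entries an expansion CE32 refers to.
/// Works for a u32 table (Expansion32) or a u64 table (Expansion).
/// Returns None if the run would fall outside the table.
fn expansion_slice<T>(table: &[T], ce32: u32) -> Option<&[T]> {
    let (index, length) = expansion_index_and_length(ce32);
    let end = index.checked_add(length)?;
    table.get(index..end)
}
```

Returning `Option` rather than indexing directly keeps malformed data from causing a panic; whether the real implementation handles this the same way is not stated here.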

BuilderData = 7

Builder data, used only in the CollationDataBuilder, not in runtime data.

If bit 8 is 0: Builder context, points to a list of context-sensitive mappings. Bits 31..13: Index to the builder’s list of ConditionalCE32 for this character. Bits 12.. 9: Unused, 0.

If bit 8 is 1 (IS_BUILDER_JAMO_CE32): Builder-only jamoCE32 value. The builder fetches the Jamo CE32 from the trie. Bits 31..13: Jamo code point. Bits 12.. 9: Unused, 0.

Prefix = 8

Points to prefix trie. Bits 31..13: Index into prefix/contraction data. Bits 12.. 8: Unused, 0.

Contraction = 9

Points to contraction data. Bits 31..13: Index into prefix/contraction data. Bits 12..11: Unused, 0. Bit 10: CONTRACT_TRAILING_CCC flag. Bit 9: CONTRACT_NEXT_CCC flag. Bit 8: CONTRACT_SINGLE_CP_NO_MATCH flag.
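A minimal sketch of decoding the contraction fields follows. The struct and function names are hypothetical; only the bit positions come from the description above.

```rust
/// Hypothetical container for the fields of a Contraction CE32.
#[derive(Debug, PartialEq, Eq)]
struct ContractionFields {
    /// Bits 31..13: index into prefix/contraction data.
    index: usize,
    /// Bit 10: CONTRACT_TRAILING_CCC flag.
    trailing_ccc: bool,
    /// Bit 9: CONTRACT_NEXT_CCC flag.
    next_ccc: bool,
    /// Bit 8: CONTRACT_SINGLE_CP_NO_MATCH flag.
    single_cp_no_match: bool,
}

/// Hypothetical helper: decode the index and the three flags.
fn contraction_fields(ce32: u32) -> ContractionFields {
    ContractionFields {
        index: (ce32 >> 13) as usize,
        trailing_ccc: (ce32 & (1 << 10)) != 0,
        next_ccc: (ce32 & (1 << 9)) != 0,
        single_cp_no_match: (ce32 & (1 << 8)) != 0,
    }
}
```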

Digit = 10

Decimal digit. Bits 31..13: Index into uint32_t table for non-numeric-collation CE32. Bit 12: Unused, 0. Bits 11.. 8: Digit value 0..9.
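The digit layout can be decoded the same way. The helper name below is hypothetical, and the sketch does not check that the digit value is in the documented range 0..9.

```rust
/// Hypothetical helper: (index from bits 31..13 into the u32 table that holds
/// the non-numeric-collation CE32, digit value from bits 11..8).
fn digit_fields(ce32: u32) -> (usize, u8) {
    let index = (ce32 >> 13) as usize;
    let digit = ((ce32 >> 8) & 0xF) as u8;
    (index, digit)
}
```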

U0000 = 11

Tag for U+0000, for moving the NUL-termination handling from the regular fastpath into specials-handling code. Bits 31..8: Unused, 0. Not used by ICU4X.

Hangul = 12

Tag for a Hangul syllable. Bits 31..9: Unused, 0. Bit 8: HANGUL_NO_SPECIAL_JAMO flag. Not used by ICU4X, may get reused for compressing Hanja expansions.

LeadSurrogate = 13

Tag for a lead surrogate code unit. Optional optimization for UTF-16 string processing.
Bits 31..10: Unused, 0.
Bits  9.. 8:
  =0: All associated supplementary code points are unassigned-implicit.
  =1: All associated supplementary code points fall back to the base data.
  else: (Normally 2) Look up the data for the supplementary code point.
Not used by ICU4X.
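Although ICU4X does not use this tag, the two-bit state field is easy to illustrate. The enum and function below are hypothetical names for this sketch only.

```rust
/// Hypothetical enum for the three cases encoded in bits 9..8.
#[derive(Debug, PartialEq, Eq)]
enum LeadSurrogateState {
    /// =0: all associated supplementary code points are unassigned-implicit.
    AllUnassigned,
    /// =1: all associated supplementary code points fall back to the base data.
    AllFallback,
    /// Any other value (normally 2): look up the supplementary code point.
    Lookup,
}

/// Hypothetical helper: classify a LeadSurrogate CE32 by bits 9..8.
fn lead_surrogate_state(ce32: u32) -> LeadSurrogateState {
    match (ce32 >> 8) & 0x3 {
        0 => LeadSurrogateState::AllUnassigned,
        1 => LeadSurrogateState::AllFallback,
        _ => LeadSurrogateState::Lookup,
    }
}
```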

Offset = 14

Tag for CEs with primary weights in code point order. Bits 31..13: Index into CE table, for one data “CE”. Bits 12.. 8: Unused, 0.

This data “CE” has the following bit fields:
Bits 63..32: Three-byte primary pppppp00.
Bits 31.. 8: Start/base code point of the in-order range.
Bit       7: Flag isCompressible primary.
Bits  6.. 0: Per-code point primary-weight increment.
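The bit fields of the data “CE” can be unpacked as in the sketch below. The struct and function names are hypothetical, and the sketch stops at decoding: how the primary for a given code point is then computed from these fields is not described here.

```rust
/// Hypothetical container for the fields of the 64-bit data "CE"
/// that an Offset CE32 points to.
#[derive(Debug, PartialEq, Eq)]
struct OffsetData {
    /// Bits 63..32: three-byte primary, pppppp00.
    primary: u32,
    /// Bits 31..8: start/base code point of the in-order range.
    base_code_point: u32,
    /// Bit 7: whether the primary is compressible.
    is_compressible: bool,
    /// Bits 6..0: per-code point primary-weight increment.
    increment: u8,
}

/// Hypothetical helper: unpack the data "CE".
fn offset_data(data_ce: u64) -> OffsetData {
    OffsetData {
        primary: (data_ce >> 32) as u32,
        // Truncate to the low 32 bits first, then drop the low byte.
        base_code_point: (data_ce as u32) >> 8,
        is_compressible: (data_ce & 0x80) != 0,
        increment: (data_ce & 0x7F) as u8,
    }
}
```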

Implicit = 15

Implicit CE tag. Compute an unassigned-implicit CE. All bits are set (UNASSIGNED_CE32=0xffffffff).

Trait Implementations

impl Debug for Tag

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl PartialEq for Tag

fn eq(&self, other: &Tag) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Eq for Tag

impl StructuralPartialEq for Tag

Auto Trait Implementations

impl Freeze for Tag

impl RefUnwindSafe for Tag

impl Send for Tag

impl Sync for Tag

impl Unpin for Tag

impl UnwindSafe for Tag

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> MaybeSendSync for T