Module icu_collator::elements


This module holds the 64-bit CollationElement struct used for the actual comparison, the 32-bit CollationElement32 struct used for storage (strictly speaking, the storage is RawBytesULE<4>), and the CollationElements iterator adapter that turns an iterator over char into an iterator over CollationElement. (To match the structure of ICU4C, this isn’t a real Rust Iterator. Instead of signaling the end by returning None, it signals the end by returning NO_CE.)

This module also declares various constants that the comparison module uses as well.
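
The sentinel-based end-of-iteration convention described above can be illustrated with a minimal sketch. Everything in it is hypothetical: `SENTINEL` stands in for NO_CE (whose actual value is not given here), `ToyElements` stands in for CollationElements, and the "weight" of a character is simply its code point plus one rather than a real collation element. The point is only that a comparison loop can compare plain integers without unwrapping an Option at every step.

```rust
use std::cmp::Ordering;

/// Hypothetical end marker standing in for the module's NO_CE.
/// The toy weights below are never 0, so 0 is free to mean "end".
const SENTINEL: u64 = 0;

/// Hypothetical adapter over an iterator of chars. It is deliberately
/// not a Rust `Iterator`: the end is signaled by `SENTINEL`, not `None`.
struct ToyElements<I: Iterator<Item = char>> {
    chars: I,
}

impl<I: Iterator<Item = char>> ToyElements<I> {
    fn new(chars: I) -> Self {
        Self { chars }
    }

    /// Returns the next toy weight (code point + 1), or `SENTINEL`
    /// once the underlying iterator is exhausted.
    fn next_element(&mut self) -> u64 {
        match self.chars.next() {
            Some(c) => u64::from(u32::from(c)) + 1,
            None => SENTINEL,
        }
    }
}

/// Compares two strings by toy weight. Because the sentinel is smaller
/// than every real weight, a proper prefix sorts before the longer string.
fn toy_compare(left: &str, right: &str) -> Ordering {
    let mut l = ToyElements::new(left.chars());
    let mut r = ToyElements::new(right.chars());
    loop {
        let a = l.next_element();
        let b = r.next_element();
        if a != b {
            return a.cmp(&b);
        }
        if a == SENTINEL {
            // Both sides ended at the same position.
            return Ordering::Equal;
        }
    }
}
```

With these toy weights, `toy_compare("ab", "abc")` orders the shorter string first, since the sentinel compares below the weight produced for `'c'`.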

Structs§

  • Packs a char and a CanonicalCombiningClass in 32 bits (the former in the lower 24 bits and the latter in the high 8 bits). The latter can be initialized to 0xFF upon creation, in which case the actual value can be set later by calling set_ccc_from_trie_if_not_already_set. This is a micro-optimization to avoid the Canonical Combining Class trie lookup when there is only one combining character in a sequence. This type is intentionally non-Copy to get compiler help in making sure that the class is set on the instance on which it is intended to be set and not on a temporary copy.
  • This struct makes the handling of the upcoming buffer easy so that trie lookups are done at most once. However, when upcoming[0] is an undecomposed starter, we don’t need the ccc yet, and when lookahead has already done the trie lookups, we don’t need trie_value, as it is implied by ccc.
  • A collation element is a 64-bit value.
  • A compressed form of a collation element as stored in the collation data.
  • Iterator that transforms an iterator over char into an iterator over CollationElement with a tailoring. Not a real Rust iterator: instead of None, it uses NO_CE to indicate the end of iteration, which optimizes comparison.
  • NonPrimary 🔒
    The purpose of grouping the non-primary bits into a struct is to allow for a future optimization that specializes code over whether storage for primary weights is needed or not. (I.e. whether to specialize on CollationElement or NonPrimary.)
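
The 32-bit packing of a char with its combining class, as described in the first item above, might look roughly like the following sketch. The type name `PackedCharAndClass`, the constant `CLASS_NOT_SET`, and the closure-based `set_ccc_if_not_already_set` are assumptions made for illustration; the real type takes its class from a trie via set_ccc_from_trie_if_not_already_set, whose signature is not shown here. The class is modeled as a plain `u8` rather than CanonicalCombiningClass.

```rust
/// Placeholder class value meaning "not looked up yet", per the 0xFF
/// convention described above.
const CLASS_NOT_SET: u8 = 0xFF;

/// Hypothetical stand-in for the packed type: char in the lower 24 bits,
/// combining class in the high 8 bits. Deliberately neither `Clone` nor
/// `Copy`, so the class cannot be set on an accidental temporary copy.
#[derive(Debug)]
struct PackedCharAndClass(u32);

impl PackedCharAndClass {
    /// Packs a char together with an already known class.
    fn new(c: char, ccc: u8) -> Self {
        // A char is at most 0x10FFFF, so it always fits in 24 bits.
        Self(u32::from(c) | (u32::from(ccc) << 24))
    }

    /// Packs a char with the placeholder class, deferring the lookup.
    fn new_with_placeholder(c: char) -> Self {
        Self::new(c, CLASS_NOT_SET)
    }

    fn character(&self) -> char {
        // The low 24 bits came from a valid char, so this cannot fail;
        // the fallback only avoids a panic path.
        char::from_u32(self.0 & 0x00FF_FFFF).unwrap_or('\u{FFFD}')
    }

    fn ccc(&self) -> u8 {
        (self.0 >> 24) as u8
    }

    /// Runs `lookup` only if the class is still the placeholder.
    /// `lookup` is a hypothetical substitute for the trie lookup.
    fn set_ccc_if_not_already_set(&mut self, lookup: impl FnOnce(char) -> u8) {
        if self.ccc() == CLASS_NOT_SET {
            let c = self.character();
            *self = Self::new(c, lookup(c));
        }
    }
}
```

Because the setter takes `&mut self` and the type cannot be copied, a caller has to mutate the value that actually lives in the buffer, which is the compiler help the description refers to.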

Enums§

  • Tag 🔒
    Special-CE32 tags, from bits 3..0 of a special 32-bit CE. Bits 31..8 are available for tag-specific data. Bits 5..4: Reserved. May be used in the future to indicate lccc!=0 and tccc!=0.
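
The bit layout described for Tag can be read with plain shifts and masks. The sketch below is only an illustration of that layout: the function names are hypothetical, the input is an arbitrary example value rather than real collation data, and bits 7..6, which the description above does not cover, are left uninterpreted.

```rust
/// Bits 3..0: the tag value of a special 32-bit CE.
fn tag_bits(ce32: u32) -> u8 {
    (ce32 & 0xF) as u8
}

/// Bits 5..4: reserved according to the description above.
fn reserved_bits(ce32: u32) -> u8 {
    ((ce32 >> 4) & 0x3) as u8
}

/// Bits 31..8: the tag-specific data.
fn tag_data(ce32: u32) -> u32 {
    ce32 >> 8
}
```

For the arbitrary value `0x1234_56C9`, these helpers yield a tag of 9, reserved bits of 0, and tag-specific data of `0x12_3456`.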

Constants§

Functions§