Module regex_syntax::unicode

Structs§

  • An error that occurs when Unicode-aware simple case folding fails.
  • A state-oriented traverser of the simple case folding table.
  • An error that occurs when the Unicode-aware \w class is unavailable.

Enums§

  • Like ClassQuery, but its parameters have been canonicalized. This also differentiates binary properties from flattened general categories and scripts.
  • A query for finding a character class defined by Unicode. This supports either use of a property name directly, or lookup by property value. The former generally refers to Binary properties (see UTS#44, Table 8), but as a special exception (see UTS#18, Section 1.2) both general categories (an enumeration) and scripts (a catalog) are supported as if each of their possible values were a binary property.
  • An error that occurs when dealing with Unicode.
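The two query forms described above (a bare property name versus a lookup by property value) can be sketched as follows. This is only an illustration: the `Query` enum and the `parse_query` helper are hypothetical names invented for this example, not the crate's own types, and the `name=value` syntax is assumed to be the text found between the braces of a `\p{...}` class.

```rust
/// Hypothetical stand-in for the query type described above.
#[derive(Debug, PartialEq)]
enum Query<'a> {
    /// A bare name such as `Alphabetic` or `Greek`, treated as if it
    /// were a binary property.
    Binary(&'a str),
    /// An explicit lookup by property value, such as `sc=Greek`.
    ByValue {
        property_name: &'a str,
        property_value: &'a str,
    },
}

/// Classify the text found between the braces of `\p{...}`.
/// Assumption: a single `=` separates a property name from its value.
fn parse_query(body: &str) -> Query<'_> {
    match body.split_once('=') {
        Some((name, value)) => Query::ByValue {
            property_name: name.trim(),
            property_value: value.trim(),
        },
        None => Query::Binary(body.trim()),
    }
}
```

Under these assumptions, `parse_query("Greek")` yields `Query::Binary("Greek")`, while `parse_query("sc=Greek")` yields a `Query::ByValue` with name `sc` and value `Greek`. Canonicalization (resolving `sc` to a full property name, and deciding whether `Greek` is a script or a general category) would be a separate step.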

Functions§

  • ages 🔒
    Returns an iterator over Unicode Age sets. Each item corresponds to a set of codepoints that were added in a particular revision of Unicode. The iterator yields items in chronological order.
  • bool_property 🔒
    Returns the Unicode HIR class corresponding to the given Unicode boolean property.
  • canonical_gencat 🔒
  • canonical_prop 🔒
    Find the canonical property name for the given normalized property name.
  • canonical_script 🔒
  • canonical_value 🔒
    Find the canonical property value for the given normalized property value.
  • Looks up a Unicode class given a query. If one doesn’t exist, then None is returned.
  • gcb 🔒
    Returns the Unicode HIR class corresponding to the given grapheme cluster break property.
  • gencat 🔒
    Returns the Unicode HIR class corresponding to the given general category.
  • Build a Unicode HIR class from a sequence of Unicode scalar value ranges.
  • Returns true only if the given codepoint is in the \w character class.
  • Returns a Unicode-aware class for \d.
  • Returns a Unicode-aware class for \s.
  • Returns a Unicode-aware class for \w.
  • property_set 🔒
  • property_values 🔒
    Return the table of property values for the given property name.
  • sb 🔒
    Returns the Unicode HIR class corresponding to the given sentence break property.
  • script 🔒
    Returns the Unicode HIR class corresponding to the given script.
  • script_extension 🔒
    Returns the Unicode HIR class corresponding to the given script extension.
  • Like symbolic_name_normalize_bytes, but operates on a string.
  • Normalize the given symbolic name in place according to UAX44-LM3.
  • wb 🔒
    Returns the Unicode HIR class corresponding to the given word break property.
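The symbolic name normalization mentioned in the list above follows UAX44-LM3, which says to ignore case, whitespace, underscores, hyphens, and an initial "is" prefix when comparing property names. A minimal sketch of that rule is shown here. The function name `normalize_symbolic_name` is hypothetical, it allocates a new `String` rather than working in place, and it deliberately ignores edge cases (for example, a name that consists solely of "is", or abbreviations that legitimately begin with "is").

```rust
/// Simplified sketch of UAX44-LM3 loose matching (not the crate's code).
/// Assumption: ASCII case folding is sufficient for property names.
fn normalize_symbolic_name(name: &str) -> String {
    // Lowercase first so the "is" prefix check is case-insensitive.
    let lowered = name.to_ascii_lowercase();
    // Drop a leading "is" if present; otherwise keep the whole name.
    let stripped = lowered.strip_prefix("is").unwrap_or(lowered.as_str());
    // Remove whitespace, underscores, and hyphens.
    stripped
        .chars()
        .filter(|c| !(c.is_whitespace() || *c == '_' || *c == '-'))
        .collect()
}
```

With this sketch, `Is_Greek`, `greek`, and `G r-e_e k` all normalize to the same string, which is what lets a lookup table be keyed on a single spelling of each name.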

Type Aliases§

  • PropertyValues 🔒
    A mapping of property values for a specific property.
  • Range 🔒
    An inclusive range of codepoints from a generated file (hence the static lifetime).
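To illustrate how a static table of inclusive codepoint ranges can back a membership test such as the \w check listed under Functions, here is a small sketch. It assumes the alias is a static slice of `(start, end)` pairs kept sorted and non-overlapping. The `ASCII_WORD` table is a hand-written, ASCII-only placeholder (the real tables are generated and far larger), and `contains` is a hypothetical helper.

```rust
use std::cmp::Ordering;

/// Assumed shape of the alias: sorted, non-overlapping inclusive ranges.
type Range = &'static [(char, char)];

/// Hypothetical hand-written table covering ASCII word characters only.
const ASCII_WORD: Range = &[('0', '9'), ('A', 'Z'), ('_', '_'), ('a', 'z')];

/// Returns true if `c` falls inside any range in `table`.
/// Relies on the table being sorted so that binary search is valid.
fn contains(table: Range, c: char) -> bool {
    table
        .binary_search_by(|&(start, end)| {
            if end < c {
                // The whole range lies before `c`.
                Ordering::Less
            } else if start > c {
                // The whole range lies after `c`.
                Ordering::Greater
            } else {
                // start <= c <= end
                Ordering::Equal
            }
        })
        .is_ok()
}
```

Because the table is sorted, each lookup takes a logarithmic number of comparisons in the number of ranges, and the `'static` lifetime means no allocation or initialization is needed at run time.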