regex_syntax

Module unicode

Structs§

  • An error that occurs when Unicode-aware simple case folding fails.
  • A state-oriented traverser of the simple case folding table.
  • An error that occurs when the Unicode-aware \w class is unavailable.

Enums§

  • Like ClassQuery, but its parameters have been canonicalized. This also differentiates binary properties from flattened general categories and scripts.
  • A query for finding a character class defined by Unicode. This supports either use of a property name directly, or lookup by property value. The former generally refers to Binary properties (see UTS#44, Table 8), but as a special exception (see UTS#18, Section 1.2) both general categories (an enumeration) and scripts (a catalog) are supported as if each of their possible values were a binary property.
  • An error that occurs when dealing with Unicode.
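
The two query forms described above (a bare property name, or a property name paired with a value) can be pictured with a small sketch. Everything here is hypothetical and for illustration only: the `Query` enum, its variants, and `parse_query` are stand-ins and are not taken from this module's source.

```rust
/// Hypothetical, simplified stand-in for a class query (an assumption,
/// not the module's actual type).
#[derive(Debug, PartialEq)]
enum Query<'a> {
    /// A bare name such as `Alphabetic`. Per the special exception noted
    /// above, a general category or script value such as `Greek` may also
    /// appear here as if it were a binary property.
    Binary(&'a str),
    /// An explicit `name=value` pair such as `Script=Greek`.
    ByValue { property_name: &'a str, property_value: &'a str },
}

/// Split a textual spec on the first `=`; with no `=`, treat the whole
/// spec as a bare (binary-style) name.
fn parse_query(spec: &str) -> Query<'_> {
    match spec.split_once('=') {
        Some((name, value)) => Query::ByValue {
            property_name: name.trim(),
            property_value: value.trim(),
        },
        None => Query::Binary(spec.trim()),
    }
}
```

Under this sketch, `parse_query("Script=Greek")` yields the by-value form, while `parse_query("Greek")` yields the bare form, which a later canonicalization step would have to resolve to a script.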

Functions§

  • ages 🔒
    Returns an iterator over Unicode Age sets. Each item corresponds to a set of codepoints that were added in a particular revision of Unicode. The iterator yields items in chronological order.
  • Returns the Unicode HIR class corresponding to the given Unicode boolean property.
  • Find the canonical property name for the given normalized property name.
  • Find the canonical property value for the given normalized property value.
  • Looks up a Unicode class given a query. If one doesn’t exist, then None is returned.
  • gcb 🔒
    Returns the Unicode HIR class corresponding to the given grapheme cluster break property.
  • gencat 🔒
    Returns the Unicode HIR class corresponding to the given general category.
  • Build a Unicode HIR class from a sequence of Unicode scalar value ranges.
  • Returns true only if the given codepoint is in the \w character class.
  • Returns a Unicode-aware class for \d.
  • Returns a Unicode-aware class for \s.
  • Returns a Unicode-aware class for \w.
  • Return the table of property values for the given property name.
  • sb 🔒
    Returns the Unicode HIR class corresponding to the given sentence break property.
  • script 🔒
    Returns the Unicode HIR class corresponding to the given script.
  • Returns the Unicode HIR class corresponding to the given script extension.
  • Like symbolic_name_normalize_bytes, but operates on a string.
  • Normalize the given symbolic name in place according to UAX44-LM3.
  • wb 🔒
    Returns the Unicode HIR class corresponding to the given word break property.
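
As a rough illustration of the UAX44-LM3 loose matching mentioned in the list above, the following sketch normalizes a symbolic name by ignoring case, whitespace, underscores, hyphens, and an initial "is" prefix. The function name `normalize_symbolic_name` is hypothetical, the sketch allocates a new string rather than working in place, and it does not attempt any special cases the real implementation may handle.

```rust
/// Minimal sketch of UAX44-LM3 loose matching for symbolic names.
/// Hypothetical helper, not this module's function.
fn normalize_symbolic_name(name: &str) -> String {
    // Drop whitespace, underscores and hyphens; lowercase ASCII letters.
    let out: String = name
        .chars()
        .filter(|c| !c.is_whitespace() && *c != '_' && *c != '-')
        .map(|c| c.to_ascii_lowercase())
        .collect();
    // Ignore an initial "is" prefix, but only if something remains after it.
    if out.len() > 2 && out.starts_with("is") {
        out[2..].to_string()
    } else {
        out
    }
}
```

With this sketch, `Is_Greek`, `greek` and `GREEK` all normalize to `greek`, so a lookup table only needs to store one normalized spelling per name.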

Type Aliases§

  • A mapping of property values for a specific property.
  • Range 🔒
    An inclusive range of codepoints from a generated file (hence the static lifetime).
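
A table of inclusive codepoint ranges lends itself to membership tests by binary search. The sketch below assumes the table is a static slice of `(start, end)` pairs that is sorted and non-overlapping; the alias shape, the `ASCII_WORD` table, and the `contains` helper are illustrative assumptions rather than this module's definitions.

```rust
use std::cmp::Ordering;

/// Assumed shape of a generated table: sorted, non-overlapping,
/// inclusive (start, end) ranges with a static lifetime.
type Range = &'static [(char, char)];

/// Hypothetical tiny table standing in for generated Unicode data.
const ASCII_WORD: Range = &[('0', '9'), ('A', 'Z'), ('_', '_'), ('a', 'z')];

/// Returns true if `c` falls inside any range of `table`.
fn contains(table: Range, c: char) -> bool {
    table
        .binary_search_by(|&(start, end)| {
            if c < start {
                // This range lies entirely above `c`.
                Ordering::Greater
            } else if c > end {
                // This range lies entirely below `c`.
                Ordering::Less
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}
```

The search takes time logarithmic in the number of ranges, which matters for real Unicode tables with hundreds of entries.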