Module unicode


Structs

CaseFoldError
An error that occurs when Unicode-aware simple case folding fails.
SimpleCaseFolder
A state-oriented traverser of the simple case folding table.
UnicodeWordError
An error that occurs when the Unicode-aware \w class is unavailable.

Enums

CanonicalClassQuery 🔒
Like ClassQuery, but its parameters have been canonicalized. This also differentiates binary properties from flattened general categories and scripts.
ClassQuery
A query for finding a character class defined by Unicode. This supports either use of a property name directly, or lookup by property value. The former generally refers to Binary properties (see UTS#44, Table 8), but as a special exception (see UTS#18, Section 1.2) both general categories (an enumeration) and scripts (a catalog) are supported as if each of their possible values were a binary property.
Error
An error that occurs when dealing with Unicode.
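The two query styles described for ClassQuery (a bare property name versus an explicit property/value pair) can be illustrated with a minimal sketch. The names `QuerySketch` and `parse_query_sketch` are hypothetical and are not part of this module; the sketch only assumes that the text between the braces of `\p{...}` uses `=` to separate a property name from its value.

```rust
/// Hypothetical stand-in for the two query styles; NOT the module's actual type.
#[derive(Debug, PartialEq, Eq)]
enum QuerySketch<'a> {
    /// A bare name such as `Alphabetic`, `Greek`, or `Lu`. General categories
    /// and scripts are accepted here as if each value were a binary property.
    ByName(&'a str),
    /// An explicit `property=value` pair such as `Script=Greek`.
    ByValue { property: &'a str, value: &'a str },
}

/// Splits the text between the braces of `\p{...}` into one of the two styles.
/// Assumption: `=` is the only separator handled by this sketch.
fn parse_query_sketch(body: &str) -> QuerySketch<'_> {
    match body.split_once('=') {
        Some((property, value)) => QuerySketch::ByValue {
            property: property.trim(),
            value: value.trim(),
        },
        None => QuerySketch::ByName(body.trim()),
    }
}
```

Under these assumptions, `parse_query_sketch("Greek")` yields a by-name query, while `parse_query_sketch("Script=Greek")` yields a by-value query with property `Script` and value `Greek`. Resolving either form to an actual set of codepoints is what the lookup functions below are for.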

Functions

ages 🔒
Returns an iterator over Unicode Age sets. Each item corresponds to a set of codepoints that were added in a particular revision of Unicode. The iterator yields items in chronological order.
bool_property 🔒
Returns the Unicode HIR class corresponding to the given Unicode boolean property.
canonical_gencat 🔒
canonical_prop 🔒
Find the canonical property name for the given normalized property name.
canonical_script 🔒
canonical_value 🔒
Find the canonical property value for the given normalized property value.
class
Looks up a Unicode class given a query. If one doesn't exist, then None is returned.
gcb 🔒
Returns the Unicode HIR class corresponding to the given grapheme cluster break property.
gencat 🔒
Returns the Unicode HIR class corresponding to the given general category.
hir_class
Build a Unicode HIR class from a sequence of Unicode scalar value ranges.
is_word_character
Returns true only if the given codepoint is in the \w character class.
perl_digit
Returns a Unicode-aware class for \d.
perl_space
Returns a Unicode-aware class for \s.
perl_word
Returns a Unicode-aware class for \w.
property_set 🔒
property_values 🔒
Return the table of property values for the given property name.
sb 🔒
Returns the Unicode HIR class corresponding to the given sentence break property.
script 🔒
Returns the Unicode HIR class corresponding to the given script.
script_extension 🔒
Returns the Unicode HIR class corresponding to the given script extension.
symbolic_name_normalize 🔒
Like symbolic_name_normalize_bytes, but operates on a string.
symbolic_name_normalize_bytes 🔒
Normalize the given symbolic name in place according to UAX44-LM3.
wb 🔒
Returns the Unicode HIR class corresponding to the given word break property.
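The symbolic name normalization listed above follows UAX44-LM3, which says to ignore case, whitespace, underscores, hyphens, and an initial "is" prefix when comparing names. The following is a simplified sketch of that rule, not this module's implementation: `normalize_sketch` is a hypothetical name, it allocates a new string rather than working in place on bytes, it folds case for ASCII only, and it ignores any special cases the real function may handle.

```rust
/// Simplified loose matching in the spirit of UAX44-LM3.
/// Hypothetical helper for illustration; NOT the module's function.
fn normalize_sketch(name: &str) -> String {
    // Drop whitespace, underscores, and hyphens, and fold ASCII case.
    let folded: String = name
        .chars()
        .filter(|c| !c.is_whitespace() && *c != '_' && *c != '-')
        .map(|c| c.to_ascii_lowercase())
        .collect();
    // Drop an initial "is" prefix, unless that would leave nothing behind.
    match folded.strip_prefix("is") {
        Some(rest) if !rest.is_empty() => rest.to_string(),
        _ => folded,
    }
}
```

With this sketch, `General_Category`, `general category`, and `GENERAL-CATEGORY` all normalize to `generalcategory`, so a single table lookup can serve every accepted spelling of a property name or value.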

Type Aliases

PropertyValues 🔒
A mapping of property values for a specific property.
Range 🔒
An inclusive range of codepoints from a generated file (hence the static lifetime).
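A table of inclusive codepoint ranges lends itself to membership tests by binary search, which is one plausible way a check such as is_word_character could be answered. The sketch below assumes the table is sorted and non-overlapping; `RangeSketch`, `ASCII_WORD_SKETCH`, and `contains_sketch` are hypothetical names, and the tiny ASCII-only table stands in for the much larger generated Unicode tables.

```rust
use std::cmp::Ordering;

/// Hypothetical alias for an inclusive `(start, end)` codepoint range.
type RangeSketch = (char, char);

/// Hand-written, ASCII-only stand-in for a generated table.
/// Assumption: ranges are sorted by start and do not overlap.
static ASCII_WORD_SKETCH: &[RangeSketch] =
    &[('0', '9'), ('A', 'Z'), ('_', '_'), ('a', 'z')];

/// Returns true if `c` falls inside any range of the sorted table.
fn contains_sketch(table: &[RangeSketch], c: char) -> bool {
    table
        .binary_search_by(|&(start, end)| {
            if c < start {
                // This range lies entirely above `c`; search to the left.
                Ordering::Greater
            } else if c > end {
                // This range lies entirely below `c`; search to the right.
                Ordering::Less
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}
```

For example, `contains_sketch(ASCII_WORD_SKETCH, 'q')` is true and `contains_sketch(ASCII_WORD_SKETCH, '-')` is false. Because both endpoints are inclusive, a single codepoint such as `_` is stored as a range whose start and end are equal.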