This module provides the lower-level API for UTS 46. Uts46::process is the core that the other convenience methods build on.
UTS 46 flags map to this API as follows:
- CheckHyphens - true: Hyphens::Check, false: Hyphens::Allow; the WHATWG URL Standard sets this to false for normal (non-conformance-checker) user agents.
- CheckBidi - Always true; cannot be configured, since this flag is true even when WHATWG URL Standard beStrict is false.
- CheckJoiners - Always true; cannot be configured, since this flag is true even when WHATWG URL Standard beStrict is false.
- UseSTD3ASCIIRules - true: AsciiDenyList::STD3, false: AsciiDenyList::EMPTY; however, the check the WHATWG URL Standard performs right after the UTS 46 invocation corresponds to AsciiDenyList::URL.
- Transitional_Processing - Always false, but could be implemented as a preprocessing step. This flag is deprecated, and for Web purposes the transition is over in the sense that Firefox, Safari, and Chrome all set this flag to false.
- VerifyDnsLength - true: DnsLength::Verify, false: DnsLength::Ignore; the WHATWG URL Standard sets this to false for normal (non-conformance-checker) user agents.
- IgnoreInvalidPunycode - Always false; cannot be configured. (Not yet covered by the WHATWG URL Standard, but 2 out of 3 major browsers clearly behave as if this were false.)
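As a rough illustration of the mapping above, the following sketch turns the three configurable UTS 46 flags into enum-style values. The enums, the Uts46Flags struct, and map_flags are local stand-ins invented for this example; they are not this module's types, and only the true/false correspondence is taken from the list.

```rust
// Local stand-ins for illustration only; these are NOT the crate's types.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Hyphens {
    Check,
    Allow,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum DenyList {
    Std3,
    Empty,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum DnsLength {
    Verify,
    Ignore,
}

/// Hypothetical bundle of the three configurable UTS 46 flags.
#[derive(Debug, Clone, Copy)]
struct Uts46Flags {
    check_hyphens: bool,
    use_std3_ascii_rules: bool,
    verify_dns_length: bool,
}

/// Translates boolean UTS 46 flags into enum-style arguments,
/// following the true/false correspondence in the list above.
fn map_flags(flags: Uts46Flags) -> (Hyphens, DenyList, DnsLength) {
    let hyphens = if flags.check_hyphens {
        Hyphens::Check
    } else {
        Hyphens::Allow
    };
    let deny_list = if flags.use_std3_ascii_rules {
        DenyList::Std3
    } else {
        DenyList::Empty
    };
    let dns_length = if flags.verify_dns_length {
        DnsLength::Verify
    } else {
        DnsLength::Ignore
    };
    (hyphens, deny_list, dns_length)
}

fn main() {
    // CheckHyphens and VerifyDnsLength are false for normal user agents
    // per the list above; UseSTD3ASCIIRules = false is an assumption here.
    let lenient = Uts46Flags {
        check_hyphens: false,
        use_std3_ascii_rules: false,
        verify_dns_length: false,
    };
    println!("{:?}", map_flags(lenient));
}
```

CheckBidi, CheckJoiners, Transitional_Processing, and IgnoreInvalidPunycode are deliberately absent from the sketch, since the list describes them as fixed rather than configurable.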
Structs
- AsciiDenyList: The ASCII deny list to be applied.
- Uts46: An implementation of UTS #46.
Enums
- AlreadyAsciiLabel 🔒
- DnsLength: The UTS 46 VerifyDNSLength flag.
- ErrorPolicy: Policy for customizing behavior in case of an error.
- Hyphens: The CheckHyphens mode.
- ProcessingError: The failure outcome of Uts46::process
- ProcessingSuccess: The success outcome of Uts46::process
- RtlNumeralState 🔒: For keeping track of what kind of numerals have been seen in an RTL label.
Constants
- DOT_MASK 🔒: The mask for the ASCII dot.
- GLYPHLESS_MASK 🔒: Bit set for glyphless ASCII.
- ICU4C-compatible constraint. https://unicode-org.atlassian.net/browse/ICU-13727
- ICU4C-compatible constraint. (Note: ICU4C measures UTF-16 and we measure UTF-32. This means that we allow longer non-BMP inputs. For this implementation, the denial-of-service scaling does not depend on BMP vs. non-BMP: only the scalar values matter.)
- PUNYCODE_PREFIX 🔒
- PUNYCODE_PREFIX_MASK 🔒
- UPPER_CASE_MASK 🔒: Bit set for upper-case ASCII.
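The mask constants above are described as bit sets over ASCII. Here is a minimal sketch of that idea, assuming a 128-bit integer in which bit n stands for ASCII code point n. The representation, the helper names, and the constant values are assumptions made for this example and are not copied from the crate.

```rust
/// Builds a 128-bit set with one bit per ASCII byte in `bytes`.
/// Non-ASCII bytes are skipped, since they have no bit in the mask.
const fn ascii_mask(bytes: &[u8]) -> u128 {
    let mut mask = 0u128;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] < 128 {
            mask |= 1u128 << (bytes[i] as u32);
        }
        i += 1;
    }
    mask
}

/// Hypothetical mask with the bits for 'A'..='Z' set.
const UPPER_CASE_MASK_SKETCH: u128 = ascii_mask(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ");

/// Hypothetical mask with only the bit for '.' set.
const DOT_MASK_SKETCH: u128 = ascii_mask(b".");

/// Returns true if `byte` is ASCII and its bit is set in `mask`.
fn in_mask(mask: u128, byte: u8) -> bool {
    byte < 128 && ((mask >> (byte as u32)) & 1) == 1
}

fn main() {
    // Masks can be combined with bitwise OR to test several classes at once.
    let combined = UPPER_CASE_MASK_SKETCH | DOT_MASK_SKETCH;
    for &b in b"Example.COM" {
        println!("{:?} in combined mask: {}", b as char, in_mask(combined, b));
    }
}
```

With this layout, testing whether a byte belongs to a class is a shift and a bitwise AND, and several classes can be merged into one deny list with a single OR.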
Functions
- check_hyphens 🔒
- glyphless_mask 🔒: Computes the mask for glyphless ASCII.
- has_punycode_prefix 🔒
- in_inclusive_range8 🔒
- is_ascii 🔒
- ldh_mask 🔒: Computes the ASCII deny list for STD3 ASCII rules.
- upper_case_mask 🔒: Computes the mask for upper-case ASCII.
- verify_dns_length: Performs the VerifyDNSLength check on the output of the ToASCII operation.
- write_punycode_label 🔒
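To make the VerifyDNSLength check listed above more concrete, here is a sketch of the length rules as UTS 46 states them: each label is 1 to 63 bytes long, and the whole name, excluding the root label and its dot, is 1 to 253 bytes long. The function name dns_length_ok and its allow_trailing_dot parameter are assumptions for this example; this is not the crate's implementation, and it expects input that is already ASCII (the output of ToASCII).

```rust
/// Sketch of the DNS length rules on an already-ASCII domain name.
/// `allow_trailing_dot` decides whether a final "." (the root label)
/// is tolerated; this parameter is an assumption of the sketch.
fn dns_length_ok(ascii_domain: &str, allow_trailing_dot: bool) -> bool {
    // Strip the root label's dot, or reject it if it is not allowed.
    let name = match ascii_domain.strip_suffix('.') {
        Some(rest) if allow_trailing_dot => rest,
        Some(_) => return false,
        None => ascii_domain,
    };
    // Total length excluding the root label: 1 to 253 bytes.
    if name.is_empty() || name.len() > 253 {
        return false;
    }
    // Every label: 1 to 63 bytes. An empty label (e.g. "a..b") fails.
    name.split('.').all(|label| (1..=63).contains(&label.len()))
}

fn main() {
    for candidate in ["example.com", "example.com.", "a..b", ""] {
        println!(
            "{:?}: strict = {}, trailing dot allowed = {}",
            candidate,
            dns_length_ok(candidate, false),
            dns_length_ok(candidate, true)
        );
    }
}
```

Byte length equals character count here only because the input is assumed to be ASCII; running the check on Unicode output would need a different measure.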