encoding_c_mem

Function encoding_mem_is_utf16_code_unit_bidi

#[no_mangle]
pub unsafe extern "C" fn encoding_mem_is_utf16_code_unit_bidi(
    u: u16,
) -> bool

Checks whether a UTF-16 code unit triggers right-to-left processing.

The check is done on a Unicode block basis without regard to assigned vs. unassigned code points in the block. Hebrew presentation forms in the Alphabetic Presentation Forms block are treated as if they formed a block of their own (i.e. they are treated as right-to-left). Additionally, the four RIGHT-TO-LEFT FOO controls in General Punctuation are checked for. Control characters that are technically bidi controls but do not cause right-to-left behavior without the presence of right-to-left characters or right-to-left controls are not checked for. As a special case, U+FEFF is excluded from Arabic Presentation Forms-B.
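The block-based check described above can be pictured roughly as follows. This is a minimal sketch, not the crate's implementation: the function name `is_bidi_code_unit_sketch` is hypothetical, and the exact ranges are assumptions derived from Unicode block boundaries (Hebrew through Arabic Extended-A, the presentation-form blocks, and the controls U+200F, U+202B, U+202E and U+2067).

```rust
/// Hypothetical sketch of a block-based right-to-left check on one
/// UTF-16 code unit. The ranges are assumptions taken from Unicode
/// block boundaries, not copied from the crate's source.
pub fn is_bidi_code_unit_sketch(u: u16) -> bool {
    match u {
        // Hebrew (U+0590) through Arabic Extended-A (U+08FF):
        // contiguous right-to-left blocks, assigned or not.
        0x0590..=0x08FF => true,
        // The four RIGHT-TO-LEFT controls in General Punctuation:
        // RLM, RLE, RLO and RLI.
        0x200F | 0x202B | 0x202E | 0x2067 => true,
        // High surrogates covering U+10800..=U+10FFF and
        // U+1E800..=U+1EFFF (supplementary-plane right-to-left ranges).
        0xD802 | 0xD803 | 0xD83A | 0xD83B => true,
        // Hebrew presentation forms (from U+FB1D) followed by
        // Arabic Presentation Forms-A (through U+FDFF).
        0xFB1D..=0xFDFF => true,
        // Arabic Presentation Forms-B, stopping before U+FEFF.
        0xFE70..=0xFEFE => true,
        _ => false,
    }
}
```

Under these assumptions, U+05D0 (HEBREW LETTER ALEF) and the high surrogate 0xD802 are reported as right-to-left, while U+0041, U+200E (LEFT-TO-RIGHT MARK) and U+FEFF are not.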

Since supplementary-plane right-to-left blocks are identifiable from the high surrogate without examining the low surrogate, this function returns true for such high surrogates, making it suitable for handling supplementary-plane text without decoding surrogate pairs to scalar values. As a consequence, such high surrogates are reported as right-to-left even if they are actually unpaired.
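To see why the high surrogate alone is enough, note that each high surrogate covers an aligned run of 1024 supplementary-plane code points. The sketch below uses a hypothetical helper, `high_surrogate`, and assumes that the right-to-left supplementary ranges of interest are U+10800..U+10FFF and U+1E800..U+1EFFF, which is an assumption not stated in the text above.

```rust
/// Hypothetical helper: returns the UTF-16 high surrogate for a
/// supplementary-plane code point, or `None` for anything outside
/// U+10000..=U+10FFFF.
pub fn high_surrogate(cp: u32) -> Option<u16> {
    if !(0x10000..=0x10FFFF).contains(&cp) {
        return None;
    }
    // The top 10 bits of the 20-bit offset select the high surrogate,
    // so every run of 1024 code points shares one high surrogate.
    Some(0xD800 + ((cp - 0x10000) >> 10) as u16)
}
```

With this helper, U+10800 and U+10FFF map to 0xD802 and 0xD803, and U+1E800 and U+1EFFF map to 0xD83A and 0xD83B. Under the stated assumption, four high-surrogate values are therefore enough to identify those ranges without looking at the low surrogate.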