Crate unicode_bidi
This crate implements the Unicode Bidirectional Algorithm for display of mixed right-to-left and left-to-right text. It is written in safe Rust, compatible with the current stable release.
§Example
use unicode_bidi::BidiInfo;
// This example text is defined using `concat!` because some browsers
// and text editors have trouble displaying bidi strings.
let text = concat![
"א",
"ב",
"ג",
"a",
"b",
"c",
];
// Resolve embedding levels within the text. Pass `None` to detect the
// paragraph level automatically.
let bidi_info = BidiInfo::new(&text, None);
// This paragraph has embedding level 1 because its first strong character is RTL.
assert_eq!(bidi_info.paragraphs.len(), 1);
let para = &bidi_info.paragraphs[0];
assert_eq!(para.level.number(), 1);
assert_eq!(para.level.is_rtl(), true);
// Re-ordering is done after wrapping each paragraph into a sequence of
// lines. For this example, I'll just use a single line that spans the
// entire paragraph.
let line = para.range.clone();
let display = bidi_info.reorder_line(para, line);
assert_eq!(display, concat![
"a",
"b",
"c",
"ג",
"ב",
"א",
]);
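The comment in the example notes that the paragraph gets embedding level 1 because its first strong character is RTL. The following is a minimal, crate-independent sketch of that idea. The names `Strong`, `rough_strong_class`, and `rough_paragraph_level` are hypothetical and not part of unicode_bidi, and the classification is a deliberately crude stand-in that only recognizes ASCII letters, Hebrew letters, and basic Arabic letters as strong. The crate itself consults the full Bidi_Class data and also accounts for isolates, which this sketch ignores.

```rust
/// Strong directionality of a character in this simplified model.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Strong {
    Ltr,
    Rtl,
}

/// Crude stand-in for a Bidi_Class lookup (assumption: only ASCII letters,
/// Hebrew letters, and basic Arabic letters count as strong).
fn rough_strong_class(c: char) -> Option<Strong> {
    match c {
        'a'..='z' | 'A'..='Z' => Some(Strong::Ltr),
        '\u{05D0}'..='\u{05EA}' | '\u{0620}'..='\u{064A}' => Some(Strong::Rtl),
        _ => None,
    }
}

/// Paragraph level from the first strong character: 1 for RTL, 0 otherwise
/// (including text with no strong character at all).
fn rough_paragraph_level(text: &str) -> u8 {
    match text.chars().find_map(rough_strong_class) {
        Some(Strong::Rtl) => 1,
        _ => 0,
    }
}
```

Calling `rough_paragraph_level` on the example text (Hebrew letters followed by `abc`) yields 1, which agrees with the `para.level.number()` assertion above.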
§Features
- std: Enabled by default, but can be disabled to make unicode_bidi #![no_std] + alloc compatible.
- hardcoded-data: Enabled by default. Includes hardcoded Unicode bidi data and more convenient APIs.
- serde: Adds serde::Serialize and serde::Deserialize implementations to relevant types.
Re-exports§
pub use crate::data_source::BidiDataSource;
pub use crate::level::Level;
pub use crate::level::LTR_LEVEL;
pub use crate::level::RTL_LEVEL;
Modules§
- Accessor for Bidi_Class property from Unicode Character Database (UCD)
- This module holds deprecated assets only.
- explicit 🔒 3.3.2 Explicit Levels and Directions
- Directional Formatting Characters
- implicit 🔒 3.3.4 - 3.3.6. Resolve implicit levels and types.
- Bidi Embedding Level
- prepare 🔒 3.3.3 Preparations for Implicit Processing
- private 🔒
Structs§
- Bidi information of the text.
- Hardcoded Bidi data that ships with the unicode-bidi crate.
- Initial bidi information of the text.
- Extended version of InitialInfo (not public API).
- Contains a reference to a BidiInfo and one of its paragraphs, and supports all Paragraph operations that also need the BidiInfo, such as direction.
- Bidi information of text treated as a single paragraph.
- Bidi information about a single paragraph.
- Iterator over (UTF-8) string slices returning (index, char_len) tuple.
Enums§
- Represents values of the Unicode character property Bidi_Class, also known as the bidirectional character type.
Constants§
- The Unicode version of the bundled data.
Traits§
- Trait that abstracts over a text source for use by the bidi algorithms. We implement this for str (UTF-8) and for u16 (UTF-16, native-endian). (For internal unicode-bidi use; API may be unstable.) This trait is sealed and cannot be implemented for types outside this crate.
Functions§
- Assign levels to characters removed by rule X9.
- Find the BidiClass of a single char.
- The core of BidiInfo initialization, factored out into a function that both the utf-8 and utf-16 versions of BidiInfo can use.
- Implementation of initial-info computation for both BidiInfo and ParagraphBidiInfo. To treat the text as (potentially) multiple paragraphs, the caller should pass the pair of optional outparam arrays to receive the ParagraphInfo and pure-ltr flags for each paragraph. Passing None for split_paragraphs will ignore any paragraph-separator characters in the text, treating it just as a single paragraph. Returns the array of BidiClass values for each code unit of the text, along with the embedding level and pure-ltr flag for the last (or only) paragraph.
- Get the base direction of the text provided according to the Unicode Bidirectional Algorithm.
- Get the base direction of the text provided according to the Unicode Bidirectional Algorithm, considering the full text if the first paragraph is all-neutral.
- Return the directionality of the paragraph (Left, Right or Mixed) from its levels.
- Produce the levels for this paragraph as needed for reordering, one level per code unit in the paragraph. The returned vector includes code units that are not included in the line, but will not adjust them.
- Return a line of the text in display order based on resolved levels.
- Reorders pre-calculated levels of a sequence of characters.
- Find the level runs within a line and return them in visual order.
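Several of the functions above turn resolved levels into a visual order. As a rough illustration of the underlying idea (rule L2 of the Unicode Bidirectional Algorithm), the sketch below reverses runs of items from the highest level down to the lowest odd level. The function name `visual_order` is hypothetical, it is not the crate's implementation, and it assumes one level per character rather than one level per code unit as the crate uses.

```rust
/// Returns indices into `levels` in visual order, following the idea of
/// rule L2: for each level from the highest down to the lowest odd level,
/// reverse every contiguous run of items at that level or higher.
fn visual_order(levels: &[u8]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..levels.len()).collect();
    let max = match levels.iter().copied().max() {
        Some(m) => m,
        None => return order, // empty input
    };
    let min_odd = match levels.iter().copied().filter(|l| l % 2 == 1).min() {
        Some(m) => m,
        None => return order, // no RTL content: nothing to reverse
    };
    for level in (min_odd..=max).rev() {
        let mut i = 0;
        while i < order.len() {
            if levels[order[i]] >= level {
                // Find the end of this run and reverse it in place.
                let start = i;
                while i < order.len() && levels[order[i]] >= level {
                    i += 1;
                }
                order[start..i].reverse();
            } else {
                i += 1;
            }
        }
    }
    order
}
```

For the levels `[1, 1, 1, 2, 2, 2]` (three RTL letters followed by three LTR letters in an RTL paragraph), `visual_order` returns `[3, 4, 5, 2, 1, 0]`: the LTR letters come first in their original order, followed by the RTL letters reversed.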
Type Aliases§
- A maximal substring of characters with the same embedding level.
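To make the notion of a level run concrete, here is a small sketch that splits a slice of levels into maximal ranges of equal level, in logical order. The function name `level_runs` is hypothetical, and representing a run as a `Range<usize>` over a plain `u8` slice is an assumption made for this illustration, not a statement about the crate's internals.

```rust
use std::ops::Range;

/// Splits `levels` into maximal ranges of equal level, in logical order.
fn level_runs(levels: &[u8]) -> Vec<Range<usize>> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=levels.len() {
        // Close the current run at the end of input or when the level changes.
        if i == levels.len() || levels[i] != levels[start] {
            runs.push(start..i);
            start = i;
        }
    }
    runs
}
```

With the levels `[1, 1, 1, 2, 2, 2]`, this produces the two runs `0..3` and `3..6`; an empty slice produces no runs.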