This crate implements the Unicode Bidirectional Algorithm for display of mixed right-to-left and left-to-right text. It is written in safe Rust, compatible with the current stable release.
§Example
use unicode_bidi::BidiInfo;
// This example text is defined using `concat!` because some browsers
// and text editors have trouble displaying bidi strings.
let text = concat![
"א",
"ב",
"ג",
"a",
"b",
"c",
];
// Resolve embedding levels within the text. Pass `None` to detect the
// paragraph level automatically.
let bidi_info = BidiInfo::new(&text, None);
// This paragraph has embedding level 1 because its first strong character is RTL.
assert_eq!(bidi_info.paragraphs.len(), 1);
let para = &bidi_info.paragraphs[0];
assert_eq!(para.level.number(), 1);
assert_eq!(para.level.is_rtl(), true);
// Re-ordering is done after wrapping each paragraph into a sequence of
// lines. For this example, I'll just use a single line that spans the
// entire paragraph.
let line = para.range.clone();
let display = bidi_info.reorder_line(para, line);
assert_eq!(display, concat![
"a",
"b",
"c",
"ג",
"ב",
"א",
]);
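The display order asserted above can be reproduced by hand from the resolved levels. The sketch below is not the crate's implementation. It applies rule L2 of the Unicode Bidirectional Algorithm (reverse every run at or above each level, from the highest level down to the lowest odd one) to plain `u8` levels, one per character rather than one per code unit. The function name `visual_order` is hypothetical, and the levels are assumed to be already resolved.

```rust
/// Hypothetical helper (not part of unicode_bidi): given one resolved
/// embedding level per character, return the logical index shown at each
/// visual position, following rule L2 of UAX #9.
fn visual_order(levels: &[u8]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..levels.len()).collect();

    // Highest level present; an empty input needs no reordering.
    let max = match levels.iter().copied().max() {
        Some(m) => m,
        None => return order,
    };
    // Lowest odd level present; without one the text is purely LTR.
    let min_odd = match levels.iter().copied().filter(|l| l % 2 == 1).min() {
        Some(m) => m,
        None => return order,
    };

    let mut level = max;
    while level >= min_odd {
        let mut i = 0;
        while i < order.len() {
            if levels[order[i]] >= level {
                // Reverse the maximal run whose levels are all >= `level`.
                let start = i;
                while i < order.len() && levels[order[i]] >= level {
                    i += 1;
                }
                order[start..i].reverse();
            } else {
                i += 1;
            }
        }
        // `min_odd` is at least 1, so this never underflows.
        level -= 1;
    }
    order
}
```

For the example text, the Hebrew letters sit at level 1 and the Latin letters at level 2, so `visual_order(&[1, 1, 1, 2, 2, 2])` yields `[3, 4, 5, 2, 1, 0]`: the Latin run first, then the Hebrew letters in reverse, which matches the asserted display string.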
§Features
- std: Enabled by default, but can be disabled to make unicode_bidi #![no_std] + alloc compatible.
- hardcoded-data: Enabled by default. Includes hardcoded Unicode bidi data and more convenient APIs.
- serde: Adds serde::Serialize and serde::Deserialize implementations to relevant types.
Re-exports§
pub use crate::data_source::BidiDataSource;
pub use crate::level::Level;
pub use crate::level::LTR_LEVEL;
pub use crate::level::RTL_LEVEL;
Modules§
- char_data 🔒: Accessor for Bidi_Class property from Unicode Character Database (UCD)
- data_source
- deprecated: This module holds deprecated assets only.
- explicit 🔒: 3.3.2 Explicit Levels and Directions
- format_chars: Directional Formatting Characters
- implicit 🔒: 3.3.4 - 3.3.6. Resolve implicit levels and types.
- level: Bidi Embedding Level
- prepare 🔒: 3.3.3 Preparations for Implicit Processing
- private 🔒
- utf16
Structs§
- BidiInfo: Bidi information of the text.
- HardcodedBidiData: Hardcoded Bidi data that ships with the unicode-bidi crate.
- InitialInfo: Initial bidi information of the text.
- InitialInfoExt 🔒: Extended version of InitialInfo (not public API).
- Paragraph: Contains a reference to a BidiInfo and one of its paragraphs. It supports the Paragraph operations that also need the BidiInfo, such as direction.
- ParagraphBidiInfo: Bidi information of text treated as a single paragraph.
- ParagraphInfo: Bidi information about a single paragraph.
- ParagraphInfoFlags 🔒
- Utf8IndexLenIter: Iterator over (UTF-8) string slices returning (index, char_len) tuples.
Enums§
- BidiClass: Represents values of the Unicode character property Bidi_Class, also known as the bidirectional character type.
- Direction
Constants§
- UNICODE_VERSION: The Unicode version of data
Traits§
- TextSource: Trait that abstracts over a text source for use by the bidi algorithms. We implement this for str (UTF-8) and for u16 (UTF-16, native-endian). (For internal unicode-bidi use; API may be unstable.) This trait is sealed and cannot be implemented for types outside this crate.
Functions§
- assign_levels_to_removed_chars 🔒: Assign levels to characters removed by rule X9.
- bidi_class: Find the BidiClass of a single char.
- compute_bidi_info_for_para 🔒: The core of BidiInfo initialization, factored out into a function that both the utf-8 and utf-16 versions of BidiInfo can use.
- compute_initial_info 🔒: Implementation of initial-info computation for both BidiInfo and ParagraphBidiInfo. To treat the text as (potentially) multiple paragraphs, the caller should pass the pair of optional outparam arrays to receive the ParagraphInfo and pure-ltr flags for each paragraph. Passing None for split_paragraphs will ignore any paragraph-separator characters in the text, treating it just as a single paragraph. Returns the array of BidiClass values for each code unit of the text, along with the embedding level and pure-ltr flag for the last (or only) paragraph.
- get_base_direction: Get the base direction of the text provided according to the Unicode Bidirectional Algorithm.
- get_base_direction_full: Get the base direction of the text provided according to the Unicode Bidirectional Algorithm, considering the full text if the first paragraph is all-neutral.
- get_base_direction_full_with_data_source
- get_base_direction_impl 🔒
- get_base_direction_with_data_source
- para_direction 🔒: Return the directionality of the paragraph (Left, Right or Mixed) from its levels.
- reorder_levels 🔒: Produce the levels for this paragraph as needed for reordering, one level per code unit in the paragraph. The returned vector includes code units that are not included in the line, but will not adjust them.
- reorder_line 🔒: Return a line of the text in display order based on resolved levels.
- reorder_visual 🔒: Reorders pre-calculated levels of a sequence of characters.
- visual_runs_for_line 🔒: Find the level runs within a line and return them in visual order.
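As a rough illustration of how a directionality can be read off a slice of resolved levels (the idea behind para_direction), here is a minimal sketch. It relies only on the fact that even embedding levels are left-to-right and odd levels are right-to-left. The enum below is a local stand-in, the name `direction_of` is hypothetical, and reporting `Mixed` for an empty slice is an arbitrary choice of this sketch rather than documented crate behaviour.

```rust
/// Local stand-in for a three-way direction; not the crate's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    Ltr,
    Rtl,
    Mixed,
}

/// Hypothetical helper: classify a slice of resolved embedding levels.
/// Even levels are left-to-right, odd levels are right-to-left.
fn direction_of(levels: &[u8]) -> Direction {
    let any_ltr = levels.iter().any(|l| l % 2 == 0);
    let any_rtl = levels.iter().any(|l| l % 2 == 1);
    match (any_ltr, any_rtl) {
        (true, false) => Direction::Ltr,
        (false, true) => Direction::Rtl,
        // Both kinds present, or no levels at all (sketch-specific choice).
        _ => Direction::Mixed,
    }
}
```

With the levels from the crate example, `[1, 1, 1, 2, 2, 2]`, this sketch reports `Mixed`, because the Latin letters end up on an even level inside the right-to-left paragraph.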
Type Aliases§
- LevelRun: A maximal substring of characters with the same embedding level.
- LevelRunVec
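To make "a maximal substring of characters with the same embedding level" concrete, the following sketch splits a slice of levels into such runs. It models each run as a `Range<usize>` of indices and uses plain `u8` levels. Both are assumptions of the sketch, and `level_runs` is a hypothetical name rather than a function exported by the crate.

```rust
use std::ops::Range;

/// Hypothetical helper: split `levels` into maximal runs of equal level,
/// returned as half-open index ranges in logical order.
fn level_runs(levels: &[u8]) -> Vec<Range<usize>> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=levels.len() {
        // Close the current run at the end of input or when the level changes.
        if i == levels.len() || levels[i] != levels[start] {
            runs.push(start..i);
            start = i;
        }
    }
    runs
}
```

For levels `[1, 1, 1, 2, 2, 2]` this produces the two runs `0..3` and `3..6`. A line-level reordering step would then arrange whole runs in visual order instead of moving individual characters.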