Crate unicode_bidi

This crate implements the Unicode Bidirectional Algorithm for display of mixed right-to-left and left-to-right text. It is written in safe Rust, compatible with the current stable release.

Example

use unicode_bidi::BidiInfo;

// This example text is defined using `concat!` because some browsers
// and text editors have trouble displaying bidi strings.
let text = concat![
  "א",
  "ב",
  "ג",
  "a",
  "b",
  "c",
];

// Resolve embedding levels within the text.  Pass `None` to detect the
// paragraph level automatically.
let bidi_info = BidiInfo::new(&text, None);

// This paragraph has embedding level 1 because its first strong character is RTL.
assert_eq!(bidi_info.paragraphs.len(), 1);
let para = &bidi_info.paragraphs[0];
assert_eq!(para.level.number(), 1);
assert!(para.level.is_rtl());

// Re-ordering is done after wrapping each paragraph into a sequence of
// lines. For this example, I'll just use a single line that spans the
// entire paragraph.
let line = para.range.clone();

let display = bidi_info.reorder_line(para, line);
assert_eq!(display, concat![
  "a",
  "b",
  "c",
  "ג",
  "ב",
  "א",
]);

Features

  • std: Enabled by default, but can be disabled to make unicode_bidi #![no_std] + alloc compatible.
  • hardcoded-data: Enabled by default. Includes hardcoded Unicode bidi data and more convenient APIs.
  • serde: Adds serde::Serialize and serde::Deserialize implementations to relevant types.

Re-exports

Modules

Structs

  • Bidi information of the text.
  • Hardcoded Bidi data that ships with the unicode-bidi crate.
  • Initial bidi information of the text.
  • Extended version of InitialInfo (not public API).
  • Holds a reference to a BidiInfo and one of its paragraphs. It supports every Paragraph operation that also needs the BidiInfo, such as direction.
  • Bidi information of text treated as a single paragraph.
  • Bidi information about a single paragraph.
  • Iterator over (UTF-8) string slices returning (index, char_len) tuples.
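
The (index, char_len) pairs mentioned in the last entry can be illustrated with the standard library alone. The following is a minimal sketch, not the crate's iterator: `char_spans` is a hypothetical helper name, and it assumes that `index` means the byte offset of a character and `char_len` its encoded length in UTF-8 bytes.

```rust
/// Hypothetical helper (not part of unicode_bidi): for each character of
/// `text`, return its byte offset and its UTF-8 encoded length in bytes.
fn char_spans(text: &str) -> Vec<(usize, usize)> {
    text.char_indices()
        .map(|(index, ch)| (index, ch.len_utf8()))
        .collect()
}
```

For a string made of `a`, the Hebrew letter alef and `b`, this yields offsets 0, 1 and 3, because alef occupies two bytes in UTF-8.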

Enums

Constants

Traits

  • Trait that abstracts over a text source for use by the bidi algorithms. We implement this for str (UTF-8) and for [u16] (UTF-16, native-endian). (For internal unicode-bidi use; API may be unstable.) This trait is sealed and cannot be implemented for types outside this crate.

Functions

  • Assign levels to characters removed by rule X9.
  • Find the BidiClass of a single char.
  • The core of BidiInfo initialization, factored out into a function that both the utf-8 and utf-16 versions of BidiInfo can use.
  • Implementation of initial-info computation for both BidiInfo and ParagraphBidiInfo. To treat the text as (potentially) multiple paragraphs, the caller should pass the pair of optional outparam arrays to receive the ParagraphInfo and pure-ltr flags for each paragraph. Passing None for split_paragraphs will ignore any paragraph-separator characters in the text, treating it just as a single paragraph. Returns the array of BidiClass values for each code unit of the text, along with the embedding level and pure-ltr flag for the last (or only) paragraph.
  • Get the base direction of the text provided according to the Unicode Bidirectional Algorithm.
  • Get the base direction of the text provided according to the Unicode Bidirectional Algorithm, considering the full text if the first paragraph is all-neutral.
  • Return the directionality of the paragraph (Left, Right or Mixed) from its levels.
  • Produce the levels for this paragraph as needed for reordering, one level per code unit in the paragraph. The returned vector includes code units that are not included in the line, but will not adjust them.
  • Return a line of the text in display order based on resolved levels.
  • Reorders pre-calculated levels of a sequence of characters.
  • Find the level runs within a line and return them in visual order.
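
The reordering entries above rely on rule L2 of the Unicode Bidirectional Algorithm: from the highest level on the line down to the lowest odd level, reverse every contiguous sequence of characters at that level or higher. The following is a minimal sketch of that rule only, not the crate's implementation. `visual_order` is a hypothetical name, levels are plain `u8` values rather than the crate's level type, and the sketch assumes the levels are already fully resolved for a single line.

```rust
/// Hypothetical helper (not part of unicode_bidi): apply rule L2 to a line.
/// `levels[i]` is the resolved embedding level of the i-th character in
/// logical order. The result lists logical indices in visual order.
fn visual_order(levels: &[u8]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..levels.len()).collect();

    // Highest level on the line; an empty line needs no reordering.
    let max_level = match levels.iter().copied().max() {
        Some(level) => level,
        None => return order,
    };
    // Lowest odd level; if every level is even, the line is already in
    // left-to-right visual order.
    let min_odd = match levels.iter().copied().filter(|l| l % 2 == 1).min() {
        Some(level) => level,
        None => return order,
    };

    let mut level = max_level;
    while level >= min_odd {
        let mut i = 0;
        while i < order.len() {
            if levels[order[i]] >= level {
                // Reverse the maximal sequence at `level` or higher.
                let start = i;
                while i < order.len() && levels[order[i]] >= level {
                    i += 1;
                }
                order[start..i].reverse();
            } else {
                i += 1;
            }
        }
        level -= 1; // cannot underflow: min_odd is at least 1
    }
    order
}
```

Assuming the six characters of the crate example resolve to levels 1, 1, 1, 2, 2, 2 in an RTL paragraph, the sketch returns the indices 3, 4, 5, 2, 1, 0, which matches the display order shown in that example.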

Type Aliases

  • A maximal substring of characters with the same embedding level.
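
To make the idea of a maximal run concrete, here is a small sketch that splits a slice of levels into runs of equal level. It is an illustration under assumptions, not the crate's code: `Run` and `level_runs` are hypothetical names, levels are plain `u8` values, and a run is represented as a range of indices into the slice.

```rust
use std::ops::Range;

/// Hypothetical alias for this sketch: a run is a range of indices
/// into the slice of levels.
type Run = Range<usize>;

/// Hypothetical helper (not part of unicode_bidi): split `levels` into
/// maximal runs of equal level, returned in logical order.
fn level_runs(levels: &[u8]) -> Vec<Run> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=levels.len() {
        // Close the current run at the end of the slice or when the level changes.
        if i == levels.len() || levels[i] != levels[start] {
            runs.push(start..i);
            start = i;
        }
    }
    runs
}
```

With the levels 1, 1, 1, 2, 2, 2 this produces the two runs 0..3 and 3..6. Putting such runs into visual order is a separate step.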