Struct icu_casemap::titlecase::TitlecaseMapper
pub struct TitlecaseMapper<CM> {
cm: CM,
gc: CodePointMapData<GeneralCategory>,
}
A wrapper around CaseMapper that computes titlecase mappings and loads the additional data needed to support the non-legacy “leading adjustment” behavior.
By default, Self::titlecase_segment() and Self::titlecase_segment_to_string() perform “leading adjustment”: they wait until the first relevant character to begin titlecasing. For example, in the string 'twixt, the apostrophe is ignored because the word starts at the first “t”, which gets titlecased (producing 'Twixt). Other punctuation is also ignored, as in the string «hello», which is titlecased to «Hello».
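To make the idea concrete, here is a minimal std-only sketch of leading adjustment. It is not the ICU4X implementation: the function name `titlecase_segment_sketch` and its `leading_adjustment` flag are hypothetical, `char::is_alphabetic` is only a rough stand-in for whatever notion of “relevant character” the real type derives from its general-category data, and `char::to_uppercase` stands in for a true titlecase mapping.

```rust
/// Simplified model of single-segment titlecasing (NOT the ICU4X algorithm).
///
/// Assumption: a "relevant" character is approximated by `char::is_alphabetic`.
fn titlecase_segment_sketch(src: &str, leading_adjustment: bool) -> String {
    // Byte index of the character at which titlecasing starts.
    let start = if leading_adjustment {
        src.char_indices()
            .find(|(_, c)| c.is_alphabetic())
            .map(|(i, _)| i)
            // No relevant character at all: leave the whole string untouched.
            .unwrap_or(src.len())
    } else {
        // Without adjustment, the very first character is the one titlecased.
        0
    };

    let (prefix, rest) = src.split_at(start);
    let mut out = String::with_capacity(src.len());
    // Everything before the starting character is copied unchanged.
    out.push_str(prefix);

    let mut chars = rest.chars();
    if let Some(first) = chars.next() {
        // Uppercasing is a stand-in for a real titlecase mapping.
        out.extend(first.to_uppercase());
        // The remainder of the segment is lowercased.
        out.push_str(&chars.as_str().to_lowercase());
    }
    out
}
```

With the flag set, `titlecase_segment_sketch("'twixt", true)` yields `'Twixt`. With it cleared, the apostrophe counts as the first character, so `'Twas` becomes `'twas`, mirroring the difference between the default options and LeadingAdjustment::None in the examples on this page.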
This is a separate type from CaseMapper because it loads the additional data required by LeadingAdjustment::Auto to perform the best possible leading adjustment.
If you plan to use only LeadingAdjustment::None or LeadingAdjustment::ToCased, consider using CaseMapper directly; this type adds no behavior in that case.
§Examples
Basic casemapping behavior:
use icu::casemap::TitlecaseMapper;
use icu::locid::langid;
let cm = TitlecaseMapper::new();
let root = langid!("und");
let default_options = Default::default();
// note that the subsequent words are not titlecased, this function assumes
// that the entire string is a single segment and only titlecases at the beginning.
assert_eq!(cm.titlecase_segment_to_string("hEllO WorLd", &root, default_options), "Hello world");
assert_eq!(cm.titlecase_segment_to_string("Γειά σου Κόσμε", &root, default_options), "Γειά σου κόσμε");
assert_eq!(cm.titlecase_segment_to_string("नमस्ते दुनिया", &root, default_options), "नमस्ते दुनिया");
assert_eq!(cm.titlecase_segment_to_string("Привет мир", &root, default_options), "Привет мир");
// Some behavior is language-sensitive
assert_eq!(cm.titlecase_segment_to_string("istanbul", &root, default_options), "Istanbul");
assert_eq!(cm.titlecase_segment_to_string("istanbul", &langid!("tr"), default_options), "İstanbul"); // Turkish dotted i
assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &root, default_options), "Եւ երևանի");
assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &langid!("hy"), default_options), "Եվ երևանի"); // Eastern Armenian ech-yiwn ligature
assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &root, default_options), "Ijkdijk");
assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &langid!("nl"), default_options), "IJkdijk"); // Dutch IJ digraph
Fields§
§cm: CM
§gc: CodePointMapData<GeneralCategory>
Implementations§
impl TitlecaseMapper<CaseMapper>
pub const fn new() -> Self
A constructor which creates a TitlecaseMapper using compiled data.
✨ Enabled with the compiled_data Cargo feature.
pub fn try_new_with_any_provider(
    provider: &(impl AnyProvider + ?Sized),
) -> Result<Self, DataError>
A version of Self::new that uses custom data provided by an AnyProvider.
pub fn try_new_unstable<P>(provider: &P) -> Result<Self, DataError>
A version of Self::new that uses custom data provided by a DataProvider.
impl<CM: AsRef<CaseMapper>> TitlecaseMapper<CM>
pub fn try_new_with_mapper_with_any_provider(
    provider: &(impl AnyProvider + ?Sized),
    casemapper: CM,
) -> Result<Self, DataError>
A version of Self::new_with_mapper that uses custom data provided by an AnyProvider.
pub const fn new_with_mapper(casemapper: CM) -> Self
A constructor which creates a TitlecaseMapper from an existing CaseMapper (either owned or as a reference) and compiled data.
✨ Enabled with the compiled_data Cargo feature.
pub fn try_new_with_mapper_unstable<P>(
    provider: &P,
    casemapper: CM,
) -> Result<Self, DataError>
Construct this object to wrap an existing CaseMapper (or a reference to one), loading additional data as needed.
A version of Self::new_with_mapper that uses custom data provided by a DataProvider.
pub fn titlecase_segment<'a>(
    &'a self,
    src: &'a str,
    langid: &LanguageIdentifier,
    options: TitlecaseOptions,
) -> impl Writeable + 'a
Returns the full titlecase mapping of the given string as a Writeable, treating the string as a single segment (and thus only titlecasing the beginning of it).
This should typically be used as a lower-level helper to construct the titlecasing operation desired by the application; for example, one can titlecase on a per-word basis by combining this with a WordSegmenter.
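The per-word approach mentioned above can be sketched without committing to a particular segmenter API. The helper below is hypothetical and not part of icu_casemap: it takes byte-offset breakpoints and a per-segment titlecasing closure. In real use, the breakpoints would presumably come from a WordSegmenter and the closure would call Self::titlecase_segment_to_string(); here, `whitespace_breakpoints` and `simple_titlecase` are std-only stand-ins so that the sketch runs on its own.

```rust
/// Applies `titlecase` to each slice between consecutive breakpoints and
/// concatenates the results. Breakpoints are byte offsets into `text`; they
/// must lie on `char` boundaries and be at most `text.len()`, or slicing panics.
fn titlecase_by_segments<I, F>(text: &str, breakpoints: I, titlecase: F) -> String
where
    I: IntoIterator<Item = usize>,
    F: Fn(&str) -> String,
{
    let mut out = String::with_capacity(text.len());
    let mut start = 0;
    for end in breakpoints {
        // Skip empty ranges, such as an initial breakpoint at offset 0.
        if end > start {
            out.push_str(&titlecase(&text[start..end]));
            start = end;
        }
    }
    // Treat anything after the last breakpoint as one final segment.
    if start < text.len() {
        out.push_str(&titlecase(&text[start..]));
    }
    out
}

/// Stand-in for a word segmenter (assumption, not ICU4X): reports a break
/// wherever the text switches between whitespace and non-whitespace.
fn whitespace_breakpoints(text: &str) -> Vec<usize> {
    let mut breaks = vec![0];
    let mut prev: Option<bool> = None;
    for (i, c) in text.char_indices() {
        let ws = c.is_whitespace();
        if prev.map_or(false, |p| p != ws) {
            breaks.push(i);
        }
        prev = Some(ws);
    }
    if !text.is_empty() {
        breaks.push(text.len());
    }
    breaks
}

/// Stand-in for single-segment titlecasing (assumption, not ICU4X):
/// uppercases the first character and lowercases the rest.
fn simple_titlecase(segment: &str) -> String {
    let mut chars = segment.chars();
    match chars.next() {
        Some(first) => {
            let mut s: String = first.to_uppercase().collect();
            s.push_str(&chars.as_str().to_lowercase());
            s
        }
        None => String::new(),
    }
}
```

For instance, `titlecase_by_segments(text, whitespace_breakpoints(text), simple_titlecase)` turns `hEllO big WorLd` into `Hello Big World`, because each word is handed to the closure as its own segment.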
This function is context and language sensitive. Callers should pass the text’s language as a LanguageIdentifier (usually the id field of the Locale) if available, or Default::default() for the root locale.
See Self::titlecase_segment_to_string() for the equivalent convenience function that returns a String, as well as for an example.
pub fn titlecase_segment_to_string(
    &self,
    src: &str,
    langid: &LanguageIdentifier,
    options: TitlecaseOptions,
) -> String
Returns the full titlecase mapping of the given string as a String, treating the string as a single segment (and thus only titlecasing the beginning of it).
This should typically be used as a lower-level helper to construct the titlecasing operation desired by the application; for example, one can titlecase on a per-word basis by combining this with a WordSegmenter.
This function is context and language sensitive. Callers should pass the text’s language as a LanguageIdentifier (usually the id field of the Locale) if available, or Default::default() for the root locale.
See Self::titlecase_segment() for the equivalent lower-level function that returns a Writeable.
§Examples
use icu::casemap::TitlecaseMapper;
use icu::locid::langid;
let cm = TitlecaseMapper::new();
let root = langid!("und");
let default_options = Default::default();
// note that the subsequent words are not titlecased, this function assumes
// that the entire string is a single segment and only titlecases at the beginning.
assert_eq!(cm.titlecase_segment_to_string("hEllO WorLd", &root, default_options), "Hello world");
assert_eq!(cm.titlecase_segment_to_string("Γειά σου Κόσμε", &root, default_options), "Γειά σου κόσμε");
assert_eq!(cm.titlecase_segment_to_string("नमस्ते दुनिया", &root, default_options), "नमस्ते दुनिया");
assert_eq!(cm.titlecase_segment_to_string("Привет мир", &root, default_options), "Привет мир");
// Some behavior is language-sensitive
assert_eq!(cm.titlecase_segment_to_string("istanbul", &root, default_options), "Istanbul");
assert_eq!(cm.titlecase_segment_to_string("istanbul", &langid!("tr"), default_options), "İstanbul"); // Turkish dotted i
assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &root, default_options), "Եւ երևանի");
assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &langid!("hy"), default_options), "Եվ երևանի"); // Eastern Armenian ech-yiwn ligature
assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &root, default_options), "Ijkdijk");
assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &langid!("nl"), default_options), "IJkdijk"); // Dutch IJ digraph
Leading adjustment behaviors:
use icu::casemap::titlecase::{LeadingAdjustment, TitlecaseOptions};
use icu::casemap::TitlecaseMapper;
use icu::locid::langid;
let cm = TitlecaseMapper::new();
let root = langid!("und");
let default_options = Default::default();
let mut no_adjust: TitlecaseOptions = Default::default();
no_adjust.leading_adjustment = LeadingAdjustment::None;
// Exhibits leading adjustment when set:
assert_eq!(
cm.titlecase_segment_to_string("«hello»", &root, default_options),
"«Hello»"
);
assert_eq!(
cm.titlecase_segment_to_string("«hello»", &root, no_adjust),
"«hello»"
);
assert_eq!(
cm.titlecase_segment_to_string("'Twas", &root, default_options),
"'Twas"
);
assert_eq!(
cm.titlecase_segment_to_string("'Twas", &root, no_adjust),
"'twas"
);
assert_eq!(
cm.titlecase_segment_to_string("", &root, default_options),
""
);
assert_eq!(cm.titlecase_segment_to_string("", &root, no_adjust), "");
Trailing case behaviors:
use icu::casemap::titlecase::{TitlecaseOptions, TrailingCase};
use icu::casemap::TitlecaseMapper;
use icu::locid::langid;
let cm = TitlecaseMapper::new();
let root = langid!("und");
let default_options = Default::default();
let mut preserve_case: TitlecaseOptions = Default::default();
preserve_case.trailing_case = TrailingCase::Unchanged;
// Exhibits trailing case when set:
assert_eq!(
cm.titlecase_segment_to_string("spOngeBoB", &root, default_options),
"Spongebob"
);
assert_eq!(
cm.titlecase_segment_to_string("spOngeBoB", &root, preserve_case),
"SpOngeBoB"
);
Trait Implementations§
impl<CM: Clone> Clone for TitlecaseMapper<CM>
fn clone(&self) -> TitlecaseMapper<CM>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.