encoding_c_mem

Function encoding_mem_convert_utf16_to_utf8_partial

#[no_mangle]
pub unsafe extern "C" fn encoding_mem_convert_utf16_to_utf8_partial(
    src: *const u16,
    src_len: *mut usize,
    dst: *mut u8,
    dst_len: *mut usize,
)

Converts potentially-invalid UTF-16 to valid UTF-8 when the output space may be insufficient, replacing errors with the REPLACEMENT CHARACTER.

Writes the number of code units read into *src_len and the number of bytes written into *dst_len.

Guarantees that the bytes in the destination beyond the number of bytes reported as written in *dst_len are left unmodified.

Not all code units are read if there isn’t enough output space.
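The in/out contract described above can be illustrated with a small standard-library sketch. It is not the crate's implementation: `sketch_convert_partial` is a hypothetical stand-in that uses slices and `&mut usize` in place of raw pointers, and it assumes that conversion stops at the first character whose UTF-8 form no longer fits in the remaining output space.

```rust
/// Hypothetical stand-in for illustration only, NOT the crate's code.
/// On entry `*src_len` is the number of code units available and
/// `*dst_len` is the output capacity in bytes. On return they hold the
/// number of code units read and the number of bytes written.
fn sketch_convert_partial(
    src: &[u16],
    src_len: &mut usize,
    dst: &mut [u8],
    dst_len: &mut usize,
) {
    let input = &src[..*src_len];
    let capacity = *dst_len;
    let mut read = 0;
    let mut written = 0;
    for decoded in char::decode_utf16(input.iter().copied()) {
        // An unpaired surrogate consumes one code unit and becomes U+FFFD.
        let (c, units) = match decoded {
            Ok(c) => (c, c.len_utf16()),
            Err(_) => (char::REPLACEMENT_CHARACTER, 1),
        };
        let need = c.len_utf8();
        if need > capacity - written {
            // Assumption: stop before a character that does not fit.
            break;
        }
        c.encode_utf8(&mut dst[written..written + need]);
        read += units;
        written += need;
    }
    // Bytes in dst[written..] were never touched.
    *src_len = read;
    *dst_len = written;
}
```

With the input "a€" and a 3-byte destination, this sketch reads one code unit and writes one byte, because the 3-byte encoding of "€" does not fit in the remaining 2 bytes. The other 2 bytes of the destination keep their previous contents.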

Note that this function is not designed for general streaming; its purpose is to avoid allocating memory for the worst case up front. Specifically, if the input starts with or ends with an unpaired surrogate, that surrogate is replaced with the REPLACEMENT CHARACTER.
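The consequence for chunked input can be shown with the standard library alone. The sketch below is an analogy, not a call into this crate: the helper names `lossy_utf16_to_string` and `convert_in_chunks` are hypothetical, and each chunk is converted independently, as separate calls on arbitrarily split input would be.

```rust
/// Hypothetical helper: lossy UTF-16 to UTF-8 conversion of one chunk,
/// with each unpaired surrogate replaced by U+FFFD.
fn lossy_utf16_to_string(units: &[u16]) -> String {
    char::decode_utf16(units.iter().copied())
        .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}

/// Hypothetical helper: converts the input as two independent chunks,
/// with no state carried from the first chunk to the second.
fn convert_in_chunks(units: &[u16], split_at: usize) -> String {
    let (head, tail) = units.split_at(split_at);
    let mut out = lossy_utf16_to_string(head);
    out.push_str(&lossy_utf16_to_string(tail));
    out
}
```

For the surrogate pair that encodes U+1F600, converting both code units together yields the original character, while splitting between the two code units yields two REPLACEMENT CHARACTERs. A caller that feeds input in chunks therefore needs to avoid splitting inside a surrogate pair.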

Matches the semantics of TextEncoder.encodeInto() from the Encoding Standard.

§Safety

If you want to convert into a &mut str, use convert_utf16_to_str_partial() instead of using this function together with the unsafe method as_bytes_mut() on &mut str.

§Undefined behavior

UB ensues if src and src_len don’t designate a valid memory block, if src is NULL, if dst and dst_len don’t designate a valid memory block, if dst is NULL or if the two memory blocks overlap. (If src_len is 0, src may be bogus but still has to be non-NULL and aligned. Likewise for dst and dst_len.)