The C API for encoding_rs.
§Mapping from Rust
§Naming convention
The wrapper function for each method has a name that starts with the name of the struct lower-cased, followed by an underscore and ends with the name of the method.
For example, Encoding::for_label() is wrapped as encoding_for_label().
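The naming rule above can be sketched as a small string helper. This is only an illustration of the convention: wrapper_name() is a hypothetical function invented for this example and is not part of the C API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the library): writes
 * "<struct name lower-cased>_<method name>" into out.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if out_cap is too small to hold the result. */
size_t wrapper_name(const char* struct_name, const char* method,
                    char* out, size_t out_cap)
{
    assert(struct_name != NULL && method != NULL && out != NULL);
    size_t s_len = strlen(struct_name);
    size_t m_len = strlen(method);
    size_t total = s_len + 1 + m_len;
    if (total + 1 > out_cap) {
        return 0;
    }
    /* Lower-case the struct name. */
    for (size_t i = 0; i < s_len; i++) {
        out[i] = (char)tolower((unsigned char)struct_name[i]);
    }
    /* Underscore, then the method name unchanged. */
    out[s_len] = '_';
    memcpy(out + s_len + 1, method, m_len);
    out[total] = '\0';
    return total;
}
```

Called with "Encoding" and "for_label", the helper produces the same name as the example above, encoding_for_label.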
§Arguments
Functions that wrap non-static methods take the self object as their first argument.
Slice argument foo is decomposed into a pointer foo and a length foo_len.
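As a minimal sketch of the pointer-plus-length convention, the function below takes what would be a single slice in Rust as two C parameters, buffer and buffer_len. The function ascii_prefix_len() is a hypothetical stand-in written for this example; it is not one of the library's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example function (not part of the library).
 * The Rust-side slice `buffer: &[u8]` is decomposed into the pointer
 * `buffer` and the length `buffer_len`.
 * Returns the length of the leading run of ASCII bytes. */
size_t ascii_prefix_len(const uint8_t* buffer, size_t buffer_len)
{
    /* A NULL pointer is only tolerated here for an empty slice. */
    assert(buffer != NULL || buffer_len == 0);
    size_t i = 0;
    while (i < buffer_len && buffer[i] < 0x80) {
        i++;
    }
    return i;
}
```

A caller passes an array and its size as two separate arguments, for example ascii_prefix_len(data, sizeof data).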
§Return values
Multiple return values become out-params. When an out-param is length-related, foo_len for a slice becomes a pointer in order to become an in/out-param.
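The in/out-param convention can be illustrated with a self-contained sketch. The function sketch_copy() and the two SKETCH_* status codes are assumptions made up for this example and do not reflect the library's actual functions or constant values. On entry, each length pointer holds the size of its buffer; on return, it holds the number of units read or written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative status codes for this sketch only; these are NOT the
 * library's constants. */
#define SKETCH_INPUT_EMPTY 0u
#define SKETCH_OUTPUT_FULL 1u

/* Hypothetical function (not part of the library). A Rust method
 * returning (status, read, written) becomes a C function that returns
 * the status and reports read/written through in/out length pointers. */
uint32_t sketch_copy(const uint8_t* src, size_t* src_len,
                     uint8_t* dst, size_t* dst_len)
{
    assert(src_len != NULL && dst_len != NULL);
    size_t src_cap = *src_len; /* in: bytes available in src */
    size_t dst_cap = *dst_len; /* in: space available in dst */
    size_t n = 0;
    while (n < src_cap && n < dst_cap) {
        dst[n] = src[n];
        n++;
    }
    *src_len = n; /* out: bytes read */
    *dst_len = n; /* out: bytes written */
    return (n == src_cap) ? SKETCH_INPUT_EMPTY : SKETCH_OUTPUT_FULL;
}
```

After the call, the caller advances its source and destination pointers by the values now stored in the length variables and, if the output was full, calls again with a fresh output buffer.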
DecoderResult, EncoderResult and CoderResult become uint32_t.
InputEmpty becomes INPUT_EMPTY. OutputFull becomes OUTPUT_FULL.
Unmappable becomes the scalar value of the unmappable character.
Malformed becomes a number whose lowest 8 bits, which can have the decimal value 0, 1, 2 or 3, indicate the number of bytes that were consumed after the malformed sequence and whose next-lowest 8 bits, when shifted right by 8, indicate the length of the malformed byte sequence (possible decimal values 1, 2, 3 or 4). The maximum possible sum of the two is 6.
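The bit layout described for Malformed can be unpacked as in the following sketch. The helper unpack_malformed() is hypothetical and written only for this example. It assumes the caller has already ruled out INPUT_EMPTY and OUTPUT_FULL, whose numeric values are not given in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the library). Splits a result that
 * is already known to represent Malformed into its two fields, using
 * the bit layout described above. */
void unpack_malformed(uint32_t result,
                      uint8_t* consumed_after,
                      uint8_t* malformed_len)
{
    assert(consumed_after != NULL && malformed_len != NULL);
    /* Lowest 8 bits: bytes consumed after the malformed sequence (0 to 3). */
    *consumed_after = (uint8_t)(result & 0xFF);
    /* Next-lowest 8 bits: length of the malformed sequence (1 to 4). */
    *malformed_len = (uint8_t)((result >> 8) & 0xFF);
}
```

For a result of 0x0201, this yields a malformed sequence of length 2 followed by 1 consumed byte. The malformed sequence therefore starts malformed_len + consumed_after bytes before the current read position.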
Structs§
- Newtype for *const Encoding in order to be able to implement Sync for it.
Constants§
- The minimum length of buffers that may be passed to encoding_name().
- Return value for *_decode_* and *_encode_* functions that indicates that the input has been exhausted.
- Return value for *_decode_* and *_encode_* functions that indicates that the output space has been exhausted.
Statics§
- The Big5 encoding.
- The EUC-JP encoding.
- The EUC-KR encoding.
- The gb18030 encoding.
- The GBK encoding.
- The IBM866 encoding.
- The ISO-2022-JP encoding.
- The ISO-8859-2 encoding.
- The ISO-8859-3 encoding.
- The ISO-8859-4 encoding.
- The ISO-8859-5 encoding.
- The ISO-8859-6 encoding.
- The ISO-8859-7 encoding.
- The ISO-8859-8 encoding.
- The ISO-8859-8-I encoding.
- The ISO-8859-10 encoding.
- The ISO-8859-13 encoding.
- The ISO-8859-14 encoding.
- The ISO-8859-15 encoding.
- The ISO-8859-16 encoding.
- The KOI8-R encoding.
- The KOI8-U encoding.
- The macintosh encoding.
- The replacement encoding.
- The Shift_JIS encoding.
- The UTF-8 encoding.
- The UTF-16BE encoding.
- The UTF-16LE encoding.
- The windows-874 encoding.
- The windows-1250 encoding.
- The windows-1251 encoding.
- The windows-1252 encoding.
- The windows-1253 encoding.
- The windows-1254 encoding.
- The windows-1255 encoding.
- The windows-1256 encoding.
- The windows-1257 encoding.
- The windows-1258 encoding.
- The x-mac-cyrillic encoding.
- The x-user-defined encoding.
Functions§
- Incrementally decode a byte stream into UTF-8 with malformed sequences replaced with the REPLACEMENT CHARACTER.
- Incrementally decode a byte stream into UTF-8 without replacement.
- Incrementally decode a byte stream into UTF-16 with malformed sequences replaced with the REPLACEMENT CHARACTER.
- Incrementally decode a byte stream into UTF-16 without replacement.
- The Encoding this Decoder is for.
- Deallocates a Decoder previously allocated by encoding_new_decoder().
- Checks for compatibility with storing Unicode scalar values as unsigned bytes taking into account the state of the decoder.
- Query the worst-case UTF-8 output size with replacement.
- Query the worst-case UTF-8 output size without replacement.
- Query the worst-case UTF-16 output size (with or without replacement).
- Incrementally encode into byte stream from UTF-8 with unmappable characters replaced with HTML (decimal) numeric character references.
- Incrementally encode into byte stream from UTF-8 without replacement.
- Incrementally encode into byte stream from UTF-16 with unmappable characters replaced with HTML (decimal) numeric character references.
- Incrementally encode into byte stream from UTF-16 without replacement.
- The Encoding this Encoder is for.
- Deallocates an Encoder previously allocated by encoding_new_encoder().
- Returns true if this is an ISO-2022-JP encoder that’s not in the ASCII state and false otherwise.
- Query the worst-case output size when encoding from UTF-8 with replacement.
- Query the worst-case output size when encoding from UTF-8 without replacement.
- Query the worst-case output size when encoding from UTF-16 with replacement.
- Query the worst-case output size when encoding from UTF-16 without replacement.
- Validates ASCII.
- Checks whether the output encoding of this encoding can encode every Unicode scalar. (Only true if the output encoding is UTF-8.)
- Performs non-incremental BOM sniffing.
- Implements the "get an encoding" algorithm.
- This function behaves the same as encoding_for_label(), except when encoding_for_label() would return REPLACEMENT_ENCODING, this function returns NULL instead.
- Checks whether the bytes 0x00…0x7F map exclusively to the characters U+0000…U+007F and vice versa.
- Checks whether this encoding maps one byte to one Basic Multilingual Plane code point (i.e. byte length equals decoded UTF-16 length) and vice versa (for mappable characters).
- Validates ISO-2022-JP ASCII-state data.
- Writes the name of the given Encoding to a caller-supplied buffer as ASCII and returns the number of bytes / ASCII characters written.
- Allocates a new Decoder for the given Encoding on the heap with BOM sniffing enabled and returns a pointer to the newly-allocated Decoder.
- Allocates a new Decoder for the given Encoding into memory provided by the caller with BOM sniffing enabled. (In practice, the target should likely be a pointer previously returned by encoding_new_decoder().)
- Allocates a new Decoder for the given Encoding on the heap with BOM removal and returns a pointer to the newly-allocated Decoder.
- Allocates a new Decoder for the given Encoding into memory provided by the caller with BOM removal.
- Allocates a new Decoder for the given Encoding on the heap with BOM handling disabled and returns a pointer to the newly-allocated Decoder.
- Allocates a new Decoder for the given Encoding into memory provided by the caller with BOM handling disabled.
- Allocates a new Encoder for the given Encoding on the heap and returns a pointer to the newly-allocated Encoder. (Exception: if the Encoding is replacement, a new Encoder for UTF-8 is instantiated, and that Encoder reports UTF_8 as its Encoding.)
- Allocates a new Encoder for the given Encoding into memory provided by the caller. (In practice, the target should likely be a pointer previously returned by encoding_new_encoder().)
- Returns the output encoding of this encoding. This is UTF-8 for UTF-16BE, UTF-16LE and replacement and the encoding itself otherwise.
- Validates UTF-8.