pub struct Deserializer<'de, R, E: EntityResolver = PredefinedEntityResolver>
where
    R: XmlRead<'de>,
{
    reader: XmlReader<'de, R, E>,
    read: VecDeque<DeEvent<'de>>,
    write: VecDeque<DeEvent<'de>>,
    limit: Option<NonZeroUsize>,
    key_buf: String,
}
A structure that deserializes XML into Rust values.
Fields§
reader: XmlReader<'de, R, E>
An XML reader that streams events into this deserializer.
read: VecDeque<DeEvent<'de>>
When deserializing sequences, we sometimes have to skip unwanted events.
Those events must be stored and then replayed. This is the replay buffer:
it supplies events while it is not empty. Once it is exhausted, events are
requested from Self::reader.
write: VecDeque<DeEvent<'de>>
When deserializing sequences, we sometimes have to skip events, because XML
is tolerant of element order: even if the XSD specifies the order strictly
(using xs:sequence), most XML parsers allow order violations.
That means that the elements forming a sequence can be interleaved with
other elements that are not related to that sequence.
To support this, the deserializer scans events and stores the unwanted
ones here. After a call to Self::start_replay(), the skipped events are
moved from this buffer to Self::read.
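The way the replay buffer takes priority over the underlying reader can be sketched with standard-library types only. This is a simplified model rather than the crate's implementation: the `ReplaySource` type, its `next_event` method, and the use of `&'static str` in place of `DeEvent` are assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the deserializer: events are plain strings.
struct ReplaySource<I: Iterator<Item = &'static str>> {
    /// Underlying event stream (plays the role of `reader`).
    reader: I,
    /// Replay buffer (plays the role of `read`).
    read: VecDeque<&'static str>,
}

impl<I: Iterator<Item = &'static str>> ReplaySource<I> {
    /// Returns buffered events first and falls back to the reader once the
    /// replay buffer is exhausted.
    fn next_event(&mut self) -> Option<&'static str> {
        self.read.pop_front().or_else(|| self.reader.next())
    }
}

/// Drains a source that has two replayed events queued ahead of the reader.
fn replay_demo() -> Vec<&'static str> {
    let mut source = ReplaySource {
        reader: vec!["<c/>", "<d/>"].into_iter(),
        read: VecDeque::from(vec!["<a/>", "<b/>"]),
    };
    let mut seen = Vec::new();
    while let Some(event) = source.next_event() {
        seen.push(event);
    }
    seen
}
```

In this model, `replay_demo()` yields the two buffered events before the two events that come from the reader.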
limit: Option<NonZeroUsize>
Maximum number of events that can be skipped when processing sequences whose elements occur out of order. This field is used to prevent potential denial-of-service (DoS) attacks that could cause unbounded memory consumption when parsing a very large amount of XML into a sequence field.
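A bounded skip buffer of this kind might look like the following sketch. It is not the crate's code: `push_skipped`, `skip_many`, and the local `TooManyEvents` type (standing in for `DeError::TooManyEvents`) are hypothetical names, and the exact comparison against the limit is an assumption.

```rust
use std::collections::VecDeque;
use std::num::NonZeroUsize;

/// Hypothetical error, standing in for `DeError::TooManyEvents`.
#[derive(Debug, PartialEq)]
struct TooManyEvents(NonZeroUsize);

/// Stores a skipped event unless the buffer already holds `limit` events.
/// `None` means that no limit is enforced.
fn push_skipped(
    write: &mut VecDeque<String>,
    limit: Option<NonZeroUsize>,
    event: String,
) -> Result<(), TooManyEvents> {
    if let Some(max) = limit {
        if write.len() >= max.get() {
            return Err(TooManyEvents(max));
        }
    }
    write.push_back(event);
    Ok(())
}

/// Tries to skip `count` events under the given limit and reports how many
/// events ended up in the buffer.
fn skip_many(count: usize, limit: Option<NonZeroUsize>) -> Result<usize, TooManyEvents> {
    let mut write = VecDeque::new();
    for i in 0..count {
        push_skipped(&mut write, limit, format!("<e{}/>", i))?;
    }
    Ok(write.len())
}
```

With a limit of 3, skipping three events succeeds in this sketch and the fourth is rejected; without a limit the buffer simply keeps growing, which is the memory risk the field guards against.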
key_buf: String
Buffer that stores an attribute name as the field name exposed to serde consumers.
Implementations§
impl<'de, R, E> Deserializer<'de, R, E>
where
    R: XmlRead<'de>,
    E: EntityResolver,
fn new(reader: R, entity_resolver: E) -> Self
Create an XML deserializer from one of the possible quick_xml input sources.
Typically, it is more convenient to use one of the dedicated constructors instead, such as Self::from_str or Self::from_reader.
pub const fn get_ref(&self) -> &R
Returns the underlying XML reader.
use serde::Deserialize;
use quick_xml::de::Deserializer;
use quick_xml::Reader;

#[derive(Deserialize)]
struct SomeStruct {
    field1: String,
    field2: String,
}

// Try to deserialize from broken XML
let mut de = Deserializer::from_str(
    "<SomeStruct><field1><field2></SomeStruct>"
//   0                           ^= 28        ^= 41
);

let err = SomeStruct::deserialize(&mut de);
assert!(err.is_err());

// The first get_ref() returns the XML reader wrapper, the second the inner Reader
let reader: &Reader<_> = de.get_ref().get_ref();

assert_eq!(reader.error_position(), 28);
assert_eq!(reader.buffer_position(), 41);
pub fn event_buffer_size(&mut self, limit: Option<NonZeroUsize>) -> &mut Self
Sets the maximum number of events that can be skipped during deserialization of sequences.
If <element> contains more than the specified number of nested elements, $text, or CDATA nodes, then DeError::TooManyEvents will be returned during deserialization of a sequence field (any type that uses deserialize_seq for deserialization, for example, Vec<T>).
This method can be used to prevent a DoS attack and unbounded memory consumption when parsing a very large XML document into a sequence field.
It is strongly recommended to set the limit to some value when you parse data from untrusted sources. Choose a value that covers the number of events your typical XML documents can have between elements that belong to the same sequence.
§Examples
Suppose that we deserialize the following structure:
struct List {
    item: Vec<()>,
}
The XML that we try to parse looks like this:
<any-name>
    <item/>
    <!-- Buffering starts at this point -->
    <another-item>
        <some-element>with text</some-element>
        <yet-another-element/>
    </another-item>
    <!-- The buffer will be emptied at this point; 7 events were buffered -->
    <item/>
    <!-- There is nothing to buffer, because the elements follow each other -->
    <item/>
</any-name>
Here, when we deserialize the item field, we need to buffer 7 events before we can deserialize the second <item/>:
- <another-item>
- <some-element>
- $text(with text)
- </some-element>
- <yet-another-element/> (virtual start event)
- <yet-another-element/> (virtual end event)
- </another-item>
Note that <yet-another-element/> is internally represented as 2 events:
one for the start tag and one for the end tag. In the future this can be
eliminated, but for now we use the auto-expanding feature of the reader,
because this simplifies the deserializer code.
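The count of 7 can be reproduced with a small model of the skipped fragment. The `Token` enum and both functions are hypothetical helpers for illustration; the only rule taken from the text above is that a self-closing tag counts as two events.

```rust
/// Simplified XML tokens; a hypothetical model, not the crate's `DeEvent`.
enum Token {
    Start,
    End,
    Text,
    /// A self-closing tag such as `<yet-another-element/>`.
    Empty,
}

/// Number of buffered events for a run of tokens: a self-closing tag is
/// expanded into a start event plus an end event, everything else is one.
fn event_count(tokens: &[Token]) -> usize {
    tokens
        .iter()
        .map(|token| match token {
            Token::Empty => 2,
            Token::Start | Token::End | Token::Text => 1,
        })
        .sum()
}

/// Tokens between the first and the second `<item/>` in the example.
fn skipped_tokens() -> Vec<Token> {
    vec![
        Token::Start, // <another-item>
        Token::Start, // <some-element>
        Token::Text,  // with text
        Token::End,   // </some-element>
        Token::Empty, // <yet-another-element/>
        Token::End,   // </another-item>
    ]
}
```

Counting the events that sit between two elements of the same sequence in representative documents is one way to pick a value for the limit.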
fn peek(&mut self) -> Result<&DeEvent<'de>, DeError>
fn next(&mut self) -> Result<DeEvent<'de>, DeError>
fn skip_checkpoint(&self) -> usize
Returns the mark after which all events skipped by calls to Self::skip()
should be replayed once Self::start_replay() is called.
fn skip(&mut self) -> Result<(), DeError>
Extracts a tree of XML events from the input and stores it in the skipped
events buffer, from which the events can be retrieved later. You MUST call
Self::start_replay() after calling this method to get access to the skipped
events and to release internal buffers.
fn skip_event(&mut self, event: DeEvent<'de>) -> Result<(), DeError>
fn start_replay(&mut self, checkpoint: usize)
Moves the buffered events that were skipped after the given checkpoint
from the Self::write skip buffer to the Self::read buffer.
After calling this method, Self::peek() and Self::next() start returning
the events that were previously skipped by calls to Self::skip(). Only when
all of those events have been consumed does the deserializer resume
draining events from the underlying reader.
This method MUST be called if Self::skip() was called any number of times
after Self::new() or start_replay(); otherwise you will lose events.
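The checkpoint mechanics can be sketched with two `VecDeque`s. This is an illustrative model under stated assumptions, not the crate's implementation: it assumes that the checkpoint is the length of the skip buffer at the time it was taken, and that replayed events are placed ahead of any events still pending in the read buffer. The free function `start_replay` and `checkpoint_demo` are hypothetical.

```rust
use std::collections::VecDeque;

/// Moves events skipped after `checkpoint` from `write` to the front of
/// `read`, keeping not-yet-consumed replayed events behind them.
fn start_replay<T>(write: &mut VecDeque<T>, read: &mut VecDeque<T>, checkpoint: usize) {
    // Everything at index >= checkpoint was skipped after the mark
    let mut replay = write.split_off(checkpoint);
    replay.append(read);
    *read = replay;
}

/// Takes a checkpoint, skips two events, then replays them.
/// Returns the final contents of the `write` and `read` buffers.
fn checkpoint_demo() -> (Vec<&'static str>, Vec<&'static str>) {
    // One event was skipped earlier by an outer caller and must stay put
    let mut write = VecDeque::from(vec!["outer-skipped"]);
    let mut read = VecDeque::from(vec!["pending"]);

    let checkpoint = write.len(); // assumed meaning of `skip_checkpoint()`
    write.push_back("<a/>");
    write.push_back("<b/>");

    start_replay(&mut write, &mut read, checkpoint);
    (write.into_iter().collect(), read.into_iter().collect())
}
```

In this sketch, events skipped before the checkpoint stay in the skip buffer, which is what would let nested sequences each replay only their own skipped events.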
fn read_string(&mut self) -> Result<Cow<'de, str>, DeError>
fn read_string_impl(
    &mut self,
    allow_start: bool,
) -> Result<Cow<'de, str>, DeError>
Consumes consecutive Text and CData events (both are referred to below as
text) and merges them into one string. If there are no such events, returns
an empty string.
If allow_start is false, then only text events are consumed; for other
events an error is returned (see the table below).
If allow_start is true, then two or three events are expected:
- DeEvent::Start;
- (optional) DeEvent::Text, whose content is returned;
- DeEvent::End. If the text event is missing, an empty string is returned.
The corresponding events are consumed.
§Handling events
The table below shows how events are handled by this method:
Event | XML | Handling |
---|---|---|
DeEvent::Start | <tag>...</tag> | If allow_start == true, the result is determined by the second table; otherwise emits UnexpectedStart("tag") |
DeEvent::End | </any-tag> | This situation is impossible; the method panics if it happens |
DeEvent::Text | text content or <![CDATA[cdata content]]> (possibly mixed) | Returns event content unchanged |
DeEvent::Eof | | Emits UnexpectedEof |
The second event, consumed if DeEvent::Start was received and allow_start == true:
Event | XML | Handling |
---|---|---|
DeEvent::Start | <any-tag>...</any-tag> | Emits UnexpectedStart("any-tag") |
DeEvent::End | </tag> | Returns an empty slice. The reader guarantees that the tag matches the open one |
DeEvent::Text | text content or <![CDATA[cdata content]]> (possibly mixed) | Returns event content unchanged, expects the </tag> after that |
DeEvent::Eof | | Emits InvalidXml(IllFormed(MissingEndTag)) |
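The two tables can be read as a small state machine. The sketch below models them with simplified, hypothetical `Event` and `Error` types and owned strings; it is not the crate's code. The handling of whatever follows the text inside `<tag>` goes beyond what the tables specify and is marked as an assumption in the comments.

```rust
use std::collections::VecDeque;

/// Simplified events; a hypothetical model of `DeEvent`.
#[derive(Debug, PartialEq)]
enum Event {
    Start(String),
    End(String),
    Text(String),
    Eof,
}

/// Simplified errors named after the ones in the tables.
#[derive(Debug, PartialEq)]
enum Error {
    UnexpectedStart(String),
    UnexpectedEof,
    MissingEndTag,
}

/// The first event decides whether text is returned directly; after an
/// allowed `Start`, the second event decides the result.
fn read_string(events: &mut VecDeque<Event>, allow_start: bool) -> Result<String, Error> {
    match events.pop_front().unwrap_or(Event::Eof) {
        Event::Text(text) => Ok(text),
        Event::Start(name) if !allow_start => Err(Error::UnexpectedStart(name)),
        // Second table: `Start` was received and `allow_start == true`
        Event::Start(_) => match events.pop_front().unwrap_or(Event::Eof) {
            Event::Text(text) => match events.pop_front().unwrap_or(Event::Eof) {
                // The closing tag is expected right after the text
                Event::End(_) => Ok(text),
                // Assumption: anything else here is reported as an error
                Event::Start(name) => Err(Error::UnexpectedStart(name)),
                _ => Err(Error::MissingEndTag),
            },
            Event::End(_) => Ok(String::new()),
            Event::Start(name) => Err(Error::UnexpectedStart(name)),
            Event::Eof => Err(Error::MissingEndTag),
        },
        Event::End(_) => unreachable!("callers never pass a dangling End event"),
        Event::Eof => Err(Error::UnexpectedEof),
    }
}
```

For instance, the events for `<tag>hi</tag>` give `Ok("hi")` when `allow_start` is `true`, while the same leading `Start` event gives `UnexpectedStart("tag")` when it is `false`.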
fn read_text(&mut self, name: QName<'_>) -> Result<Cow<'de, str>, DeError>
Consumes one DeEvent::Text event and ensures that it is followed by the
DeEvent::End event.
§Parameters
name: the name of the tag that was opened before reading the text. The corresponding end tag should be present in the input immediately after the text.
fn read_to_end(&mut self, name: QName<'_>) -> Result<(), DeError>
Drops all events up to and including the end event with the name name.
This method should be called after Self::next().
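One way to picture this is a loop that discards events and tracks nesting, so that an inner element with the same name does not end the scan early. The following is a sketch under that assumption, using a hypothetical `Event` enum and an iterator in place of the real reader; it is not the crate's implementation.

```rust
/// Simplified events; a hypothetical model of `DeEvent`.
enum Event<'a> {
    Start(&'a str),
    End(&'a str),
    Text,
}

/// Drops events until the end tag matching an already consumed `<name>`
/// start tag has been dropped. Returns how many events were dropped, or
/// `None` if the input ended first.
fn read_to_end<'a>(events: &mut impl Iterator<Item = Event<'a>>, name: &str) -> Option<usize> {
    let mut depth = 0usize; // nesting level of inner `<name>` elements
    let mut dropped = 0usize;
    for event in events {
        dropped += 1;
        match event {
            Event::Start(n) if n == name => depth += 1,
            Event::End(n) if n == name => {
                if depth == 0 {
                    return Some(dropped);
                }
                depth -= 1;
            }
            // Text and unrelated tags are simply discarded
            Event::Start(_) | Event::End(_) | Event::Text => {}
        }
    }
    None
}

/// Events that follow an already consumed `<a>` start tag.
fn demo_events() -> Vec<Event<'static>> {
    vec![
        Event::Text,       // some text
        Event::Start("a"), // nested <a>
        Event::End("a"),   // closes the nested <a>
        Event::Start("b"),
        Event::End("b"),
        Event::End("a"),   // closes the outer <a>
        Event::Text,       // stays in the stream
    ]
}
```

Calling `read_to_end` on `demo_events()` with the name `"a"` drops six events in this model and leaves the trailing text event for the next read.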
impl<'de> Deserializer<'de, SliceReader<'de>>
impl<'de, E> Deserializer<'de, SliceReader<'de>, E>
where
    E: EntityResolver,
pub fn from_str_with_resolver(source: &'de str, entity_resolver: E) -> Self
Creates a new deserializer that borrows data from the specified string and uses the specified entity resolver.
impl<'de, R> Deserializer<'de, IoReader<R>>
where
    R: BufRead,
pub fn from_reader(reader: R) -> Self
Creates a new deserializer that copies data from the specified reader into an internal buffer.
If you already have a string, use Self::from_str instead, because it
borrows the data instead of copying it. If you have a &[u8] that is known
to contain UTF-8, you can decode it first and then use from_str.
A deserializer created with this method does not resolve custom entities.
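Decoding a byte slice before handing it over can be done with the standard library alone. In the sketch below, `decode_xml` is a hypothetical helper name; it only validates the bytes and borrows them as `&str`, so no copy is made.

```rust
use std::str::Utf8Error;

/// Validates that `bytes` are UTF-8 and returns them as a borrowed `&str`
/// without copying. The result could then be passed to `from_str`.
fn decode_xml(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}
```

If the bytes are not valid UTF-8, the error carries the offset of the first invalid byte (`Utf8Error::valid_up_to`), which can be useful for diagnostics.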
impl<'de, R, E> Deserializer<'de, IoReader<R>, E>
where
    R: BufRead,
    E: EntityResolver,
pub fn with_resolver(reader: R, entity_resolver: E) -> Self
Creates a new deserializer that copies data from the specified reader into an internal buffer and uses the specified entity resolver.
If you already have a string, use Self::from_str instead, because it
borrows the data instead of copying it. If you have a &[u8] that is known
to contain UTF-8, you can decode it first and then use from_str.
Trait Implementations§
impl<'de, 'a, R, E> Deserializer<'de> for &'a mut Deserializer<'de, R, E>
where
    R: XmlRead<'de>,
    E: EntityResolver,
fn deserialize_char<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Characters are represented as strings.
fn deserialize_string<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Owned strings are represented the same way as non-owned strings.
fn deserialize_bytes<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Forwards deserialization to deserialize_any.
fn deserialize_byte_buf<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Forwards deserialization to deserialize_bytes.
fn deserialize_unit_struct<V>(
    self,
    _name: &'static str,
    visitor: V,
) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Named units are represented the same way as unnamed units.
fn deserialize_tuple<V>(
    self,
    _len: usize,
    visitor: V,
) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Tuples are represented the same way as sequences.
fn deserialize_tuple_struct<V>(
    self,
    _name: &'static str,
    len: usize,
    visitor: V,
) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Named tuples are represented the same way as unnamed tuples.
fn deserialize_map<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Forwards deserialization to deserialize_struct with an empty name and empty fields.
fn deserialize_identifier<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Identifiers are represented as strings.
fn deserialize_ignored_any<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Forwards deserialization to deserialize_unit.
fn deserialize_unit<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
A unit is represented in XML as an xs:element or as text/CDATA content.
Any content inside the xs:element is ignored and skipped.
Produces a unit struct from any of the following inputs:
- any <tag ...>...</tag>
- any <tag .../>
- any consecutive text / CDATA content (can consist of several parts delimited by comments and processing instructions)
§Events handling
Event | XML | Handling |
---|---|---|
DeEvent::Start | <tag>...</tag> | Calls visitor.visit_unit(), consumes all events up to and including the corresponding End event |
DeEvent::End | </tag> | This situation is impossible; the method panics if it happens |
DeEvent::Text | text content or <![CDATA[cdata content]]> (possibly mixed) | Calls visitor.visit_unit(). The content is ignored |
DeEvent::Eof | | Emits UnexpectedEof |
fn deserialize_newtype_struct<V>(
    self,
    _name: &'static str,
    visitor: V,
) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Forwards deserialization of the inner type. Always calls Visitor::visit_newtype_struct with the same deserializer.
type Error = DeError
fn deserialize_i8<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting an i8 value.
fn deserialize_i16<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting an i16 value.
fn deserialize_i32<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting an i32 value.
fn deserialize_i64<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting an i64 value.
fn deserialize_u8<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a u8 value.
fn deserialize_u16<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a u16 value.
fn deserialize_u32<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a u32 value.
fn deserialize_u64<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a u64 value.
fn deserialize_f32<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a f32 value.
fn deserialize_f64<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a f64 value.
fn deserialize_bool<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a bool value.
fn deserialize_str<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a string value and does not benefit from taking ownership of buffered data owned by the Deserializer.
fn deserialize_struct<V>(
    self,
    _name: &'static str,
    fields: &'static [&'static str],
    visitor: V,
) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a struct with a particular name and fields.
fn deserialize_enum<V>(
    self,
    _name: &'static str,
    _variants: &'static [&'static str],
    visitor: V,
) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting an enum value with a particular name and possible variants.
fn deserialize_seq<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting a sequence of values.
fn deserialize_option<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Hint that the Deserialize type is expecting an optional value.
fn deserialize_any<V>(self, visitor: V) -> Result<V::Value, DeError>
where
    V: Visitor<'de>,
Require the Deserializer to figure out how to drive the visitor based on what data type is in the input.
fn is_human_readable(&self) -> bool
Determine whether Deserialize implementations should expect to deserialize their human-readable form.
impl<'de, 'a, R, E> SeqAccess<'de> for &'a mut Deserializer<'de, R, E>
where
    R: XmlRead<'de>,
    E: EntityResolver,
An accessor to the sequence elements that form a value for a top-level sequence of XML elements.
Technically, multiple top-level elements violate the XML rule that a document has only one top-level element, but we treat such input as several concatenated XML documents.
type Error = DeError
fn next_element_seed<T>(
    &mut self,
    seed: T,
) -> Result<Option<T::Value>, Self::Error>
where
    T: DeserializeSeed<'de>,
This returns Ok(Some(value)) for the next value in the sequence, or Ok(None) if there are no more remaining items.
fn next_element<T>(&mut self) -> Result<Option<T>, Self::Error>
where
    T: Deserialize<'de>,
This returns Ok(Some(value)) for the next value in the sequence, or Ok(None) if there are no more remaining items.