Struct xml5ever::tokenizer::XmlTokenizer
pub struct XmlTokenizer<Sink> {
opts: XmlTokenizerOpts,
pub sink: Sink,
state: XmlState,
at_eof: bool,
char_ref_tokenizer: Option<Box<CharRefTokenizer>>,
current_char: char,
reconsume: bool,
ignore_lf: bool,
discard_bom: bool,
temp_buf: StrTendril,
current_tag_kind: TagKind,
current_tag_name: StrTendril,
current_tag_attrs: Vec<Attribute>,
current_attr_name: StrTendril,
current_attr_value: StrTendril,
current_doctype: Doctype,
current_comment: StrTendril,
current_pi_target: StrTendril,
current_pi_data: StrTendril,
state_profile: BTreeMap<XmlState, u64>,
time_in_sink: u64,
}
The Xml tokenizer.
Fields
opts: XmlTokenizerOpts
Options controlling the behavior of the tokenizer.
sink: Sink
Destination for tokens we emit.
state: XmlState
The abstract machine state as described in the spec.
at_eof: bool
Are we at the end of the file, once buffers have been processed completely? This affects whether we will wait for lookahead or not.
char_ref_tokenizer: Option<Box<CharRefTokenizer>>
Tokenizer for character references, if we’re tokenizing one at the moment.
current_char: char
Current input character. Just consumed, may reconsume.
reconsume: bool
Should we reconsume the current input character?
ignore_lf: bool
Did we just consume \r, translating it to \n? In that case we need to ignore the next character if it’s \n.
discard_bom: bool
Discard a U+FEFF BYTE ORDER MARK if we see one? Only done at the beginning of the stream.
temp_buf: StrTendril
Temporary buffer.
current_tag_kind: TagKind
Current tag kind.
current_tag_name: StrTendril
Current tag name.
current_tag_attrs: Vec<Attribute>
Current tag attributes.
current_attr_name: StrTendril
Current attribute name.
current_attr_value: StrTendril
Current attribute value.
current_doctype: Doctype
Current doctype.
current_comment: StrTendril
Current comment.
current_pi_target: StrTendril
Current processing instruction target.
current_pi_data: StrTendril
Current processing instruction value.
state_profile: BTreeMap<XmlState, u64>
Record of how many ns we spent in each state, if profiling is enabled.
time_in_sink: u64
Record of how many ns we spent in the token sink.
Implementations§
impl<Sink: TokenSink> XmlTokenizer<Sink>
pub fn new(sink: Sink, opts: XmlTokenizerOpts) -> XmlTokenizer<Sink>
Create a new tokenizer which feeds tokens to a particular TokenSink.
pub fn feed(&mut self, input: &mut BufferQueue)
Feed an input string into the tokenizer.
fn process_token(&mut self, token: Token)
fn get_preprocessed_char(&mut self, c: char, input: &mut BufferQueue) -> Option<char>
fn bad_eof_error(&mut self)
fn pop_except_from(&mut self, input: &mut BufferQueue, set: SmallCharSet) -> Option<SetResult>
fn eat(&mut self, input: &mut BufferQueue, pat: &str) -> Option<bool>
pub fn run(&mut self, input: &mut BufferQueue)
Run the state machine for as long as we can.