Struct xml5ever::tokenizer::XmlTokenizer
pub struct XmlTokenizer<Sink> {
opts: XmlTokenizerOpts,
pub sink: Sink,
state: Cell<XmlState>,
at_eof: Cell<bool>,
char_ref_tokenizer: RefCell<Option<Box<CharRefTokenizer>>>,
current_char: Cell<char>,
reconsume: Cell<bool>,
ignore_lf: Cell<bool>,
discard_bom: Cell<bool>,
temp_buf: RefCell<StrTendril>,
current_tag_kind: Cell<TagKind>,
current_tag_name: RefCell<StrTendril>,
current_tag_attrs: RefCell<Vec<Attribute>>,
current_attr_name: RefCell<StrTendril>,
current_attr_value: RefCell<StrTendril>,
current_doctype: RefCell<Doctype>,
current_comment: RefCell<StrTendril>,
current_pi_target: RefCell<StrTendril>,
current_pi_data: RefCell<StrTendril>,
state_profile: RefCell<BTreeMap<XmlState, u64>>,
time_in_sink: Cell<u64>,
}
The XML tokenizer.
Fields
opts: XmlTokenizerOpts
Options controlling the behavior of the tokenizer.
sink: Sink
Destination for tokens we emit.
state: Cell<XmlState>
The abstract machine state as described in the spec.
at_eof: Cell<bool>
Are we at the end of the file, once buffers have been processed completely? This affects whether we will wait for lookahead or not.
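The effect of the end-of-file flag on lookahead can be sketched with a small stand-alone model. This is an illustrative assumption, not the crate's code: the free function `eat`, its `String` buffer, and the explicit `at_eof` parameter are hypothetical stand-ins for the private `eat` method and the `BufferQueue`.

```rust
/// Hypothetical model of lookahead matching (not xml5ever's implementation).
///
/// Returns:
/// - `Some(true)`  if `pat` was found at the front of `buffer` and consumed,
/// - `Some(false)` if `pat` definitely does not match,
/// - `None`        if the buffer is only a prefix of `pat` and more input
///                 may still arrive, so the caller should wait.
fn eat(buffer: &mut String, pat: &str, at_eof: bool) -> Option<bool> {
    if buffer.starts_with(pat) {
        // Full match: consume the pattern.
        buffer.replace_range(..pat.len(), "");
        return Some(true);
    }
    if !at_eof && pat.starts_with(buffer.as_str()) {
        // Everything seen so far agrees with `pat`, but we have run out of
        // input. Before EOF we cannot decide yet.
        return None;
    }
    // Either a mismatch, or we are at EOF and the pattern can never complete.
    Some(false)
}
```

With the buffer holding `<!-` and the pattern `<!--`, this model returns `None` before EOF and `Some(false)` once `at_eof` is set.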
char_ref_tokenizer: RefCell<Option<Box<CharRefTokenizer>>>
Tokenizer for character references, if we’re tokenizing one at the moment.
current_char: Cell<char>
Current input character. Just consumed, may reconsume.
reconsume: Cell<bool>
Should we reconsume the current input character?
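A minimal sketch of how a `current_char`/`reconsume` pair can work together, assuming a simplified character source. The type `CharSource` and the helpers `get_char`, `unconsume`, and `read_name` are hypothetical names chosen for illustration; only the two field names come from the struct above.

```rust
use std::cell::{Cell, RefCell};
use std::collections::VecDeque;

/// Hypothetical character source with a one-character "push back" slot.
struct CharSource {
    input: RefCell<VecDeque<char>>,
    current_char: Cell<char>,
    reconsume: Cell<bool>,
}

impl CharSource {
    fn new(text: &str) -> Self {
        Self {
            input: RefCell::new(text.chars().collect()),
            current_char: Cell::new('\0'),
            reconsume: Cell::new(false),
        }
    }

    /// Returns the next character, or the previous one again if a
    /// reconsume was requested.
    fn get_char(&self) -> Option<char> {
        if self.reconsume.get() {
            self.reconsume.set(false);
            return Some(self.current_char.get());
        }
        let c = self.input.borrow_mut().pop_front()?;
        self.current_char.set(c);
        Some(c)
    }

    /// Ask for the current character to be delivered again.
    fn unconsume(&self) {
        self.reconsume.set(true);
    }
}

/// Reads a run of ASCII letters and hands the terminating character back,
/// so the next state sees it.
fn read_name(src: &CharSource) -> String {
    let mut name = String::new();
    while let Some(c) = src.get_char() {
        if c.is_ascii_alphabetic() {
            name.push(c);
        } else {
            src.unconsume();
            break;
        }
    }
    name
}
```

The `Cell` and `RefCell` wrappers mirror the struct's use of interior mutability, which lets these methods take `&self` rather than `&mut self`.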
ignore_lf: Cell<bool>
Did we just consume \r, translating it to \n? In that case we need to ignore the next character if it’s \n.
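The rule described here can be sketched as a small character filter. This is a stand-alone illustration under assumed names (`NewlineNormalizer`, `preprocess`, `normalize`), not the crate's `get_preprocessed_char`.

```rust
/// Hypothetical stand-in for the tokenizer's CR/LF handling.
struct NewlineNormalizer {
    /// Set right after a '\r' has been translated to '\n'.
    ignore_lf: bool,
}

impl NewlineNormalizer {
    fn new() -> Self {
        Self { ignore_lf: false }
    }

    /// Returns the character to pass on, or `None` if it must be skipped.
    fn preprocess(&mut self, c: char) -> Option<char> {
        if self.ignore_lf {
            self.ignore_lf = false;
            if c == '\n' {
                // Second half of a "\r\n" pair: already reported as '\n'.
                return None;
            }
        }
        if c == '\r' {
            self.ignore_lf = true;
            return Some('\n');
        }
        Some(c)
    }
}

/// Convenience wrapper that normalizes a whole string.
fn normalize(input: &str) -> String {
    let mut n = NewlineNormalizer::new();
    input.chars().filter_map(|c| n.preprocess(c)).collect()
}
```

Because the flag lives outside the per-character call, a `\r` at the end of one input chunk and a `\n` at the start of the next are still collapsed into a single newline.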
discard_bom: Cell<bool>
Discard a U+FEFF BYTE ORDER MARK if we see one? Only done at the beginning of the stream.
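One way to picture the "only at the beginning of the stream" rule is a filter that checks the first non-empty chunk and then switches itself off. The type `BomFilter` and its `feed` method are hypothetical; the real tokenizer works on a `BufferQueue` rather than string slices.

```rust
/// Hypothetical model of one-time BOM removal.
struct BomFilter {
    /// True until the first character of the stream has been seen.
    discard_bom: bool,
}

impl BomFilter {
    fn new() -> Self {
        Self { discard_bom: true }
    }

    /// Returns `chunk` with a leading U+FEFF removed, but only if this is
    /// the very start of the stream.
    fn feed<'a>(&mut self, chunk: &'a str) -> &'a str {
        if !self.discard_bom || chunk.is_empty() {
            // Either past the start of the stream, or nothing seen yet.
            return chunk;
        }
        self.discard_bom = false;
        chunk.strip_prefix('\u{feff}').unwrap_or(chunk)
    }
}
```

A U+FEFF that appears later in the stream is passed through unchanged, since by then the flag has been cleared.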
temp_buf: RefCell<StrTendril>
Temporary buffer.
current_tag_kind: Cell<TagKind>
Current tag kind.
current_tag_name: RefCell<StrTendril>
Current tag name.
current_tag_attrs: RefCell<Vec<Attribute>>
Current tag attributes.
current_attr_name: RefCell<StrTendril>
Current attribute name.
current_attr_value: RefCell<StrTendril>
Current attribute value.
current_doctype: RefCell<Doctype>
Current doctype.
current_comment: RefCell<StrTendril>
Current comment.
current_pi_target: RefCell<StrTendril>
Current processing instruction target.
current_pi_data: RefCell<StrTendril>
Current processing instruction value.
state_profile: RefCell<BTreeMap<XmlState, u64>>
Record of how many nanoseconds we spent in each state, if profiling is enabled.
time_in_sink: Cell<u64>
Record of how many nanoseconds we spent in the token sink.
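The two profiling fields suggest a per-state accumulator plus a separate counter for time spent in the sink. A minimal sketch of that bookkeeping follows, assuming a two-variant `State` enum and a `Profile` helper type, both hypothetical; how the crate actually measures and reports these values is not shown on this page.

```rust
use std::collections::BTreeMap;
use std::time::Instant;

/// Hypothetical stand-in for `XmlState`; it must be `Ord` to key a BTreeMap.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum State {
    Data,
    TagName,
}

/// Hypothetical profiling record mirroring the two fields above.
#[derive(Default)]
struct Profile {
    state_profile: BTreeMap<State, u64>,
    time_in_sink: u64,
}

impl Profile {
    /// Add `ns` nanoseconds to the running total for `state`.
    fn record_state(&mut self, state: State, ns: u64) {
        *self.state_profile.entry(state).or_insert(0) += ns;
    }

    /// Run one state-machine step and charge its duration to `state`.
    fn time_state<T>(&mut self, state: State, step: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = step();
        self.record_state(state, start.elapsed().as_nanos() as u64);
        out
    }

    /// Run a sink callback and charge its duration to `time_in_sink`.
    fn time_sink<T>(&mut self, emit: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = emit();
        self.time_in_sink += start.elapsed().as_nanos() as u64;
        out
    }
}
```

Keeping sink time in its own counter makes it possible to tell tokenizer cost apart from the cost of whatever consumes the tokens.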
Implementations
impl<Sink: TokenSink> XmlTokenizer<Sink>
pub fn new(sink: Sink, opts: XmlTokenizerOpts) -> XmlTokenizer<Sink>
Create a new tokenizer which feeds tokens to a particular TokenSink.
pub fn feed(&self, input: &BufferQueue)
Feed an input string into the tokenizer.
fn process_token(&self, token: Token)
fn get_preprocessed_char(&self, c: char, input: &BufferQueue) -> Option<char>
fn bad_eof_error(&self)
fn pop_except_from( &self, input: &BufferQueue, set: SmallCharSet, ) -> Option<SetResult>
fn eat(&self, input: &BufferQueue, pat: &str) -> Option<bool>
pub fn run(&self, input: &BufferQueue)
Run the state machine for as long as we can.
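To illustrate what "as long as we can" might mean in practice, here is a toy model of the feed/run/sink flow. It is deliberately not the xml5ever API: `ToyToken`, `ToySink`, `Collect`, `ToyTokenizer`, and `step` are hypothetical names, and the grammar is reduced to `<...>` tags and plain text. The point it shows is that `run` keeps stepping until a step reports that it is blocked on missing input.

```rust
use std::cell::RefCell;

#[derive(Debug, PartialEq)]
enum ToyToken {
    Tag(String),
    Text(String),
}

/// Hypothetical sink trait: receives tokens through a shared reference.
trait ToySink {
    fn process_token(&self, token: ToyToken);
}

/// A sink that simply collects every token it is given.
#[derive(Default)]
struct Collect {
    tokens: RefCell<Vec<ToyToken>>,
}

impl ToySink for Collect {
    fn process_token(&self, token: ToyToken) {
        self.tokens.borrow_mut().push(token);
    }
}

struct ToyTokenizer<S> {
    sink: S,
    pending: RefCell<String>,
}

impl<S: ToySink> ToyTokenizer<S> {
    fn new(sink: S) -> Self {
        Self { sink, pending: RefCell::new(String::new()) }
    }

    /// Queue more input, then make as much progress as possible.
    fn feed(&self, input: &str) {
        self.pending.borrow_mut().push_str(input);
        self.run();
    }

    /// Keep stepping until a step reports it cannot continue.
    fn run(&self) {
        while self.step() {}
    }

    /// Performs one step. Returns false when blocked on missing input.
    fn step(&self) -> bool {
        let mut buf = self.pending.borrow_mut();
        if buf.is_empty() {
            return false;
        }
        let token = if buf.starts_with('<') {
            match buf.find('>') {
                Some(end) => {
                    let name = buf[1..end].to_string();
                    buf.replace_range(..=end, "");
                    ToyToken::Tag(name)
                }
                // Unterminated tag: wait for the next `feed`.
                None => return false,
            }
        } else {
            let end = buf.find('<').unwrap_or(buf.len());
            let text = buf[..end].to_string();
            buf.replace_range(..end, "");
            ToyToken::Text(text)
        };
        // Release the buffer borrow before calling into the sink.
        drop(buf);
        self.sink.process_token(token);
        true
    }
}
```

Feeding `"hi<a"` to this toy emits only the text token and leaves `<a` pending; a later `feed(">x")` completes the tag and emits the rest.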