Macro match_token

match_token!() { /* proc-macro */ }

Implements the match_token!() macro for use by the HTML tree builder in src/tree_builder/rules.rs.

§Example

match_token!(token {
    CommentToken(text) => 1,
    tag @ <base> <link> <meta> => 2,
    </head> => 3,
    </body> </html> </br> => else,
    tag @ </_> => 4,
    token => 5,
})

§Syntax

Because of the simplistic parser, the macro invocation must start with exactly match_token!(token { (with whitespace as specified) and end with exactly }). The left-hand side of each match arm is an optional name @ binding, followed by

  • an ordinary Rust pattern that starts with an identifier or an underscore, or
  • a sequence of HTML tag names as identifiers, each inside “<…>” or “</…>” to match an open or close tag respectively, or
  • a “wildcard tag” “<_>” or “</_>” to match all open tags or all close tags respectively.

The right-hand side is either an expression or the keyword else. Note that this syntax does not support guards or pattern alternation like Foo | Bar. This is not a fundamental limitation; it’s done for implementation simplicity.

§Semantics

Ordinary Rust patterns match as usual. If present, the name @ binding has the usual meaning. A sequence of named tags matches any of those tags. A single sequence can contain both open and close tags. If present, the name @ binding binds (by move) the Tag struct, not the outer Token. That is, a match arm like

tag @ <html> <head> => ...

expands to something like

TagToken(tag @ Tag { name: local_name!("html"), kind: StartTag })
| TagToken(tag @ Tag { name: local_name!("head"), kind: StartTag }) => ...
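
To see how such an expanded arm behaves, here is a minimal, self-contained sketch. The `Token`, `Tag`, and `TagKind` types below are simplified stand-ins invented for illustration (in particular, `name` is a plain `&'static str` here rather than an interned name matched via `local_name!`), and `classify` is a hypothetical function, not part of the tree builder.

```rust
// Hypothetical stand-ins for the tree builder's types (assumptions, not the real definitions).
#[derive(Debug, Clone, PartialEq)]
enum TagKind {
    StartTag,
    EndTag,
}

#[derive(Debug, Clone, PartialEq)]
struct Tag {
    kind: TagKind,
    name: &'static str, // assumption: a plain string instead of an interned name
}

#[derive(Debug, Clone, PartialEq)]
enum Token {
    TagToken(Tag),
    CommentToken(String),
}

use TagKind::StartTag;
use Token::TagToken;

// Roughly the shape that `tag @ <html> <head> => ...` plus a catch-all arm could take.
fn classify(token: Token) -> String {
    match token {
        // `tag` binds the inner Tag struct (by move), not the outer Token.
        TagToken(tag @ Tag { kind: StartTag, name: "html" })
        | TagToken(tag @ Tag { kind: StartTag, name: "head" }) => {
            format!("start tag <{}>", tag.name)
        }
        // The mandatory catch-all arm binds the whole Token.
        token => format!("other: {:?}", token),
    }
}
```

With these stand-ins, a start tag named `html` or `head` takes the first arm, while an end tag of the same name, or any non-tag token, falls through to the catch-all.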

A wildcard tag matches any tag of the appropriate kind, unless it was previously matched with an else right-hand side (more on this below). The expansion of this macro reorders code somewhat, to satisfy various restrictions arising from moves. However, it provides the semantics of in-order matching by enforcing the following restrictions on its input:

  • The last pattern must be a variable or the wildcard “_”. In other words it must match everything.
  • Otherwise, ordinary Rust patterns and specific-tag patterns cannot appear after wildcard tag patterns.
  • No tag name may appear more than once.
  • A wildcard tag pattern may not occur in the same arm as any other tag. “<_> <html> => …” and “<_> </_> => …” are both forbidden.
  • The right-hand side “else” may only appear with specific-tag patterns. It means that these specific tags should be handled by the last, catch-all case arm, rather than by any wildcard tag arm. This situation is common in the HTML5 syntax.
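
The effect of “else” can be illustrated with a hand-written match that has the same observable behavior as the invocation in the Example section. This is only a sketch under assumptions: the types are simplified stand-ins (with `name` as a plain `&'static str`), `dispatch` is a hypothetical name, and the guard used here is one way to express the exclusion, not necessarily what the macro actually emits.

```rust
// Hypothetical stand-ins for the tree builder's types (assumptions, not the real definitions).
#[derive(Debug, Clone, PartialEq)]
enum TagKind {
    StartTag,
    EndTag,
}

#[derive(Debug, Clone, PartialEq)]
struct Tag {
    kind: TagKind,
    name: &'static str, // assumption: a plain string instead of an interned name
}

#[derive(Debug, Clone, PartialEq)]
enum Token {
    TagToken(Tag),
    CommentToken(String),
}

use TagKind::{EndTag, StartTag};
use Token::{CommentToken, TagToken};

// Hand-written equivalent of the Example section's arms, returning the same numbers.
fn dispatch(token: Token) -> u32 {
    match token {
        // CommentToken(text) => 1
        CommentToken(_) => 1,
        // tag @ <base> <link> <meta> => 2
        TagToken(Tag { kind: StartTag, name: "base" | "link" | "meta" }) => 2,
        // </head> => 3
        TagToken(Tag { kind: EndTag, name: "head" }) => 3,
        // tag @ </_> => 4, except for the tags listed with `=> else`
        // (</body> </html> </br>), which must skip the wildcard arm.
        TagToken(Tag { kind: EndTag, name }) if !matches!(name, "body" | "html" | "br") => 4,
        // token => 5: the catch-all also receives the `else` tags.
        _ => 5,
    }
}
```

Under these assumptions, `</p>` is handled by the wildcard arm and yields 4, whereas `</body>` bypasses it and yields 5, the same result as a start tag such as `<div>` that no earlier arm mentions.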