Case-insensitive XML declaration detection per XML spec section 2.6 #927

@SuchAFuriousDeath

According to https://www.w3.org/TR/xml/#sec-pi:

The target names "XML", "xml", and so on are reserved for standardization in this or future versions of this specification.

This means PITarget names matching [Xx][Mm][Ll] (any case variant) are reserved and shouldn't be considered valid processing instructions.
Currently, quick-xml treats only lowercase <?xml as an XML declaration, while other case variants are treated as regular processing instructions.
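The reserved-name check itself is small. Below is a minimal sketch, assuming the PI content (the bytes between `<?` and `?>`) is already available as a byte slice. Both helper names are hypothetical and not part of quick-xml.

```rust
/// Returns the PI target: everything before the first XML whitespace byte.
/// (Hypothetical helper, not a quick-xml API.)
fn pi_target(content: &[u8]) -> &[u8] {
    let end = content
        .iter()
        .position(|&b| matches!(b, b' ' | b'\t' | b'\r' | b'\n'))
        .unwrap_or(content.len());
    &content[..end]
}

/// Returns true when the target is "xml" in any ASCII case variant,
/// i.e. it matches [Xx][Mm][Ll] exactly.
fn is_reserved_xml_target(target: &[u8]) -> bool {
    target.eq_ignore_ascii_case(b"xml")
}
```

Note that the comparison is on the whole target, so a target such as `xml-stylesheet` is not flagged by this sketch.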

The above is a standalone issue and should probably be addressed in its own right. It also affects Ruffle, so ideally it would be solved in a way that lets Ruffle properly replicate Flash behaviour.

Here is a table laying out what is currently ok and what isn't:

| Input | quick-xml Event/Error | Flash Player behavior | Note |
|---|---|---|---|
| `<?xml version='1.0'?><r/>` | Event::Decl | XML declaration | ✓ Ok |
| `<?XML version='1.0'?><r/>` | Event::PI | XML declaration | ✓ Workaround possible: we can check Event::PI and treat it as an XML declaration |
| `<?XmL version='1.0'?><r/>` | Event::PI | XML declaration | ✓ Workaround possible: we can check Event::PI and treat it as an XML declaration |
| `<?xml version='1.0'?` (unterminated) | UnclosedXmlDecl | Error #1092 (unterminated XML declaration) | ✓ Ok |
| `<?XML version='1.0'?` (unterminated) | UnclosedPI | Error #1092 (unterminated XML declaration) | ✗ PI target not exposed in error, cannot distinguish from regular PI |
| `<?XmL version='1.0'?` (unterminated) | UnclosedPI | Error #1092 (unterminated XML declaration) | ✗ PI target not exposed in error, cannot distinguish from regular PI |
| `<?test foo?>` | Event::PI | Processing instruction | ✓ Ok |
| `<?test foo?` (unterminated) | UnclosedPI | Error #1097 (unterminated processing instruction) | ✓ Ok |
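
The "Flash Player behavior" column can be modelled as a small classifier, which may be useful as a reference when testing a fix. This is a standalone sketch that works on raw markup and does not call quick-xml. The enum and function names are hypothetical, and the error numbers in the comments are taken from the table above.

```rust
/// Outcomes from the "Flash Player behavior" column (hypothetical type).
#[derive(Debug, PartialEq)]
enum FlashOutcome {
    XmlDeclaration,
    UnterminatedXmlDeclaration,        // Error #1092
    ProcessingInstruction,
    UnterminatedProcessingInstruction, // Error #1097
}

/// Classifies markup that starts with "<?". Returns None for anything else.
fn expected_flash_outcome(markup: &str) -> Option<FlashOutcome> {
    let rest = markup.strip_prefix("<?")?;
    // The PI body runs up to the first "?>"; without one it is unterminated.
    let (body, terminated) = match rest.find("?>") {
        Some(end) => (&rest[..end], true),
        None => (rest, false),
    };
    // The target ends at the first XML whitespace character (or a stray '?').
    let target = body
        .split(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n' | '?'))
        .next()
        .unwrap_or("");
    let reserved = target.eq_ignore_ascii_case("xml");
    Some(match (reserved, terminated) {
        (true, true) => FlashOutcome::XmlDeclaration,
        (true, false) => FlashOutcome::UnterminatedXmlDeclaration,
        (false, true) => FlashOutcome::ProcessingInstruction,
        (false, false) => FlashOutcome::UnterminatedProcessingInstruction,
    })
}
```

The two ✗ rows are exactly the `(true, false)` case for a target that is not all lowercase: the information needed to pick that branch is the PI target, which the UnclosedPI error does not carry.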

If an error were raised as soon as the parser encountered something like <?XmL, parsing would stop, and Ruffle would have no chance to process the rest of the XML and, for example, raise an error about an unterminated XML declaration or any other error that would be encountered later.

One possible solution would be to treat all [Xx][Mm][Ll] case variants as XML declarations, since the spec reserves these names anyway.

Alternatively, new event/error types could be added (e.g., Event::ReservedPI / SyntaxError::UnclosedReservedPI) to let users decide how to handle them. However, this might be more complexity than needed.
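
To make the two options concrete, here is a rough sketch of how each would classify the same PI content. Everything in it is hypothetical: `Policy`, `SketchEvent`, and `classify_pi` are illustration-only names, not existing or proposed quick-xml APIs, and the real event types carry more than a string slice.

```rust
/// Which of the two proposals to apply (hypothetical).
#[derive(Debug)]
enum Policy {
    /// Option 1: every [Xx][Mm][Ll] target is reported as a declaration.
    ReservedAsDecl,
    /// Option 2: non-lowercase variants get their own event.
    ReservedAsSeparateEvent,
}

/// Simplified stand-in for the parser's event type (hypothetical).
#[derive(Debug, PartialEq)]
enum SketchEvent<'a> {
    Decl(&'a str),
    ReservedPi(&'a str),
    Pi(&'a str),
}

/// Classifies the text between "<?" and "?>" under the given policy.
fn classify_pi(content: &str, policy: Policy) -> SketchEvent<'_> {
    let target = content
        .split(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n'))
        .next()
        .unwrap_or("");
    if target == "xml" {
        // Lowercase "xml" is already reported as a declaration today.
        SketchEvent::Decl(content)
    } else if target.eq_ignore_ascii_case("xml") {
        match policy {
            Policy::ReservedAsDecl => SketchEvent::Decl(content),
            Policy::ReservedAsSeparateEvent => SketchEvent::ReservedPi(content),
        }
    } else {
        SketchEvent::Pi(content)
    }
}
```

Under the first policy a consumer such as Ruffle needs no extra handling. Under the second it would match on the extra variant, and an unterminated counterpart (the SyntaxError::UnclosedReservedPI idea above) would be needed to cover the ✗ rows.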
