A streaming XML parser for MoonBit, inspired by quick-xml.
- Pull-parser model - Read XML events one at a time (like StAX in Java)
- Streaming - Memory-efficient processing of large documents
- Multi-backend - Works on wasm, wasm-gc, js, and native
- XML 1.0 + Namespaces 1.0 - Full Unicode name character support
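
As a rough illustration of the pull model, the sketch below counts elements and text events without building a tree. It uses only `Reader::from_string`, `read_event`, and the event variants documented in this README. The `@xml` import alias and the sample document are assumptions, and error handling is omitted.

```moonbit
// Sketch only: assumes this package is imported under the alias @xml.
fn main {
  // Hypothetical sample input, chosen for illustration.
  let reader = @xml.Reader::from_string(
    "<list><item>a</item><item>b</item><sep/></list>",
  )
  let mut elements = 0
  let mut text_events = 0
  while true {
    // Each call pulls exactly one event; nothing is buffered into a tree.
    match reader.read_event() {
      // Opening tags and self-closing tags both introduce an element.
      Start(_) | Empty(_) => elements += 1
      Text(_) => text_events += 1
      Eof => break
      _ => continue
    }
  }
  println("elements: \{elements}, text events: \{text_events}")
}
```

Because only two counters are kept between events, memory use should stay roughly constant regardless of document size.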
// From string
let xml = "<root><item id=\"1\">Hello</item></root>"
let reader = @xml.Reader::from_string(xml)
// From file (resolves external entities)
let reader = @xml.Reader::from_file("document.xml")
while true {
  let event = reader.read_event()
  match event {
    Start(elem) => println("Start: \{elem.name}")
    End(name) => println("End: \{name}")
    Text(content) => println("Text: \{content}")
    Eof => break
    _ => continue
  }
}

| Event | Description |
|---|---|
| `Start(XmlElement)` | Opening tag `<name>` |
| `End(String)` | Closing tag `</name>` |
| `Empty(XmlElement)` | Self-closing tag `<name/>` |
| `Text(String)` | Text content (entities decoded) |
| `CData(String)` | CDATA section `<![CDATA[...]]>` |
| `Comment(String)` | Comment `<!-- ... -->` |
| `PI(target, data)` | Processing instruction `<?target data?>` |
| `Decl(version, encoding, standalone)` | XML declaration |
| `DocType(String)` | DOCTYPE declaration |
| `Eof` | End of document |
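
The event kinds in the table above can be handled individually. The following is a minimal sketch, not a definitive usage pattern: it tracks nesting depth and treats `Empty`, `CData`, and `Comment` separately. It assumes the `@xml` import alias and that the `String` payloads are as listed in the table. The sample document is made up for illustration, and error handling is omitted.

```moonbit
// Sketch only: assumes this package is imported under the alias @xml.
fn main {
  // Hypothetical sample input covering several event kinds.
  let xml = "<!-- note --><root><br/><![CDATA[a < b]]></root>"
  let reader = @xml.Reader::from_string(xml)
  let mut depth = 0
  while true {
    match reader.read_event() {
      Start(elem) => {
        println("depth \{depth}: start \{elem.name}")
        depth += 1
      }
      End(name) => {
        depth -= 1
        println("depth \{depth}: end \{name}")
      }
      // A self-closing tag opens and closes at once, so depth is unchanged.
      Empty(elem) => println("depth \{depth}: empty \{elem.name}")
      Text(content) => println("depth \{depth}: text \{content}")
      CData(content) => println("depth \{depth}: cdata \{content}")
      Comment(body) => println("comment: \{body}")
      Eof => break
      // PI, Decl, and DocType are ignored in this sketch.
      _ => continue
    }
  }
}
```

Matching `Empty` separately matters for depth tracking: no `End` event is listed for self-closing tags, so incrementing depth on `Empty` would leave the counter unbalanced.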
This library is tested against the W3C XML Conformance Test Suite, using libxml2 (lxml) as the reference parser.
Current status: 800/800 tests passing
| Category | Tests | Description |
|---|---|---|
| Valid (with events) | 448 | Parser produces correct event sequence |
| Valid (error-only) | 6 | Parser does not error on valid XML |
| Not-well-formed | 281 | Parser correctly rejects malformed XML |
| Unit tests | 65 | Reader, writer, escape, conformance tests |
Coverage:
- XML 1.0 (James Clark xmltest)
- XML 1.0 Errata 2nd/3rd/4th edition
- Namespaces 1.0
- Sun Microsystems tests
- IBM XML 1.0 tests
# Download the W3C test suite
curl -L -o xmlts.tar.gz "https://www.w3.org/XML/Test/xmlts20130923.tar.gz"
tar -xzf xmlts.tar.gz && mv xmlconf . && rm xmlts.tar.gz
# Run tests
moon test

# Requires: libxml2 (xmllint), lxml (pip install lxml)
python3 scripts/generate_conformance_tests.py

The following test categories are skipped:
- External entity references (require file I/O)
- XML 1.1 documents (we only support XML 1.0)
- DTD validation tests (`invalid` type)
- Non-validating - Does not validate against DTD
- UTF-8 only - Other encodings not supported
- XML 1.0 only - XML 1.1 not supported
Note: External entities (SYSTEM) are resolved when using Reader::from_file().
Apache-2.0