xml.dom.pulldom --- Support for building partial DOM trees
The xml.dom.pulldom module provides a "pull parser" which can also be
asked to produce DOM-accessible fragments of the document where necessary. The
basic concept involves pulling "events" from a stream of incoming XML and
processing them. In contrast to SAX which also employs an event-driven
processing model together with callbacks, the user of a pull parser is
responsible for explicitly pulling events from the stream, looping over those
events until either processing is finished or an error condition occurs.
Note
If you need to parse untrusted or unauthenticated data, see XML security.
Changed in version 3.7.1: The SAX parser no longer processes general external entities by default, to increase security. To enable processing of external entities, pass a custom parser instance in:
from xml.dom.pulldom import parse
from xml.sax import make_parser
from xml.sax.handler import feature_external_ges
parser = make_parser()
parser.setFeature(feature_external_ges, True)
parse(filename, parser=parser)
Example:
from xml.dom import pulldom
doc = pulldom.parse('sales_items.xml')
for event, node in doc:
    if event == pulldom.START_ELEMENT and node.tagName == 'item':
        if int(node.getAttribute('price')) > 50:
            doc.expandNode(node)
            print(node.toxml())
event is a constant and can be one of:
- START_ELEMENT
- END_ELEMENT
- COMMENT
- START_DOCUMENT
- END_DOCUMENT
- CHARACTERS
- PROCESSING_INSTRUCTION
- IGNORABLE_WHITESPACE
node is an object of type xml.dom.minidom.Document,
xml.dom.minidom.Element or xml.dom.minidom.Text.
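The pairing of events and node types described above can be checked with a short sketch. The `EXPECTED_TYPES` mapping and the `check_node_types` helper are illustrative names, not part of the module, and only the node types named above are covered:

```python
from xml.dom import minidom, pulldom

# Illustrative mapping (not part of pulldom): the node class this page
# names for each of these events.
EXPECTED_TYPES = {
    pulldom.START_DOCUMENT: minidom.Document,
    pulldom.START_ELEMENT: minidom.Element,
    pulldom.END_ELEMENT: minidom.Element,
    pulldom.CHARACTERS: minidom.Text,
}

def check_node_types(xml_text):
    """Return (event, matches_expected_type) for each covered event."""
    seen = []
    for event, node in pulldom.parseString(xml_text):
        expected = EXPECTED_TYPES.get(event)
        if expected is not None:
            seen.append((event, isinstance(node, expected)))
    return seen

seen = check_node_types("<greeting>hello</greeting>")
for event, ok in seen:
    print(event, ok)
```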
Since the document is treated as a "flat" stream of events, the document "tree"
is implicitly traversed and the desired elements are found regardless of their
depth in the tree. In other words, one does not need to consider hierarchical
issues such as recursive searching of the document nodes, although if the
context of elements were important, one would either need to maintain some
context-related state (i.e. remembering where one is in the document at any
given point) or to make use of the DOMEventStream.expandNode() method
and switch to DOM-related processing.
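One way to maintain the context-related state mentioned above is to keep a stack of open element names while consuming the flat event stream. The following is a minimal sketch under that assumption; the sample document and the `items_with_paths` helper are made up for illustration:

```python
from xml.dom import pulldom

# Hypothetical sample document: the same tag appears at two depths.
XML = "<shop><featured><item name='a'/></featured><item name='b'/></shop>"

def items_with_paths(xml_text):
    """Return (path, name) for each <item>, tracking ancestors by hand."""
    path = []    # names of the currently open elements, outermost first
    found = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            path.append(node.tagName)
            if node.tagName == "item":
                found.append(("/".join(path), node.getAttribute("name")))
        elif event == pulldom.END_ELEMENT:
            path.pop()
    return found

results = items_with_paths(XML)
print(results)
```

The alternative described above, calling expandNode() on the element of interest, trades this bookkeeping for building a DOM subtree in memory.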
- class xml.dom.pulldom.PullDOM(documentFactory=None)
- class xml.dom.pulldom.SAX2DOM(documentFactory=None)
- xml.dom.pulldom.parse(stream_or_string, parser=None, bufsize=None)
  Return a DOMEventStream from the given input. stream_or_string may be either a file name or a file-like object. parser, if given, must be an XMLReader object. This function will change the document handler of the parser and activate namespace support; other parser configuration (like setting an entity resolver) must have been done in advance.
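As a rough illustration of parse(), the sketch below reads from a file-like object, once with the default parser and once with a parser that was configured before being passed in. The sample bytes and the `start_tags` helper are assumptions made for the example:

```python
import io
from xml.dom import pulldom
from xml.sax import make_parser
from xml.sax.handler import feature_external_ges

# Hypothetical in-memory document standing in for a real file.
XML_BYTES = b"<root><a>hi</a><b/></root>"

def start_tags(stream):
    """Collect the tag name of every START_ELEMENT event in the stream."""
    return [node.tagName for event, node in stream
            if event == pulldom.START_ELEMENT]

# Default parser: parse() creates one itself.
default_parser_tags = start_tags(pulldom.parse(io.BytesIO(XML_BYTES)))

# Custom parser: any configuration must happen before calling parse().
parser = make_parser()
parser.setFeature(feature_external_ges, False)
custom_parser_tags = start_tags(
    pulldom.parse(io.BytesIO(XML_BYTES), parser=parser))

print(default_parser_tags, custom_parser_tags)
```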
If you have XML in a string, you can use the parseString() function instead:
- xml.dom.pulldom.parseString(string, parser=None)
  Return a DOMEventStream that represents the (Unicode) string.
- xml.dom.pulldom.default_bufsize
  Default value for the bufsize parameter to parse().
  The value of this variable can be changed before calling parse() and the new value will take effect.
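The two ways of controlling the read size can be sketched as follows: passing bufsize explicitly, or changing default_bufsize before calling parse(). The sample document and the `count_entries` helper are hypothetical; the sketch only shows that the parsed result does not depend on the chunk size:

```python
import io
from xml.dom import pulldom

# Hypothetical document with 100 empty <entry/> elements.
XML_BYTES = b"<log>" + b"<entry/>" * 100 + b"</log>"

def count_entries(bufsize=None):
    """Count <entry> start events, reading the input in bufsize chunks."""
    stream = pulldom.parse(io.BytesIO(XML_BYTES), bufsize=bufsize)
    return sum(1 for event, node in stream
               if event == pulldom.START_ELEMENT and node.tagName == "entry")

small = count_entries(bufsize=16)   # explicit, deliberately tiny buffer
default = count_entries()           # falls back to pulldom.default_bufsize

# Change the module-level default, then restore it afterwards.
original = pulldom.default_bufsize
pulldom.default_bufsize = 32
try:
    with_new_default = count_entries()
finally:
    pulldom.default_bufsize = original

print(small, default, with_new_default)
```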
DOMEventStream Objects
- class xml.dom.pulldom.DOMEventStream(stream, parser, bufsize)
  Changed in version 3.11: Support for the __getitem__() method has been removed.
 - getEvent()
   Return a tuple containing event and the current node as xml.dom.minidom.Document if event equals START_DOCUMENT, xml.dom.minidom.Element if event equals START_ELEMENT or END_ELEMENT, or xml.dom.minidom.Text if event equals CHARACTERS. The current node does not contain information about its children, unless expandNode() is called.
 - expandNode(node)
   Expands all children of node into node. Example:
from xml.dom import pulldom
xml = '<html><title>Foo</title> <p>Some text <div>and more</div></p> </html>'
doc = pulldom.parseString(xml)
for event, node in doc:
    if event == pulldom.START_ELEMENT and node.tagName == 'p':
        # The following statement only prints '<p/>'
        print(node.toxml())
        doc.expandNode(node)
        # The following statement prints the node with all its children:
        # '<p>Some text <div>and more</div></p>'
        print(node.toxml())
 - reset()
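Besides iterating over the stream, events can be pulled one at a time with getEvent(). The sketch below assumes that getEvent() returns None once the input is exhausted, which is how the CPython implementation behaves but is not stated in the description above. The sample document and the `prices` helper are made up for illustration:

```python
from xml.dom import pulldom

# Hypothetical sample document.
XML = "<order><item price='70'/><item price='20'/></order>"

def prices(xml_text):
    """Collect item prices by pulling events explicitly with getEvent()."""
    stream = pulldom.parseString(xml_text)
    found = []
    while True:
        pair = stream.getEvent()
        if pair is None:
            # Assumption: None signals the end of the stream (CPython behavior).
            break
        event, node = pair
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            found.append(int(node.getAttribute("price")))
    return found

result = prices(XML)
print(result)
```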