xml.dom.pulldom DOMEventStream drops SAX events produced by parser.close() #145259

@zetzschest

Bug report

Bug description:

Issue

DOMEventStream.getEvent() silently drops SAX events that are produced when parser.close() is called at end-of-stream. When the stream is exhausted, getEvent() calls self.parser.close() and immediately returns None without checking whether close() queued new events. As a result, END_ELEMENT and other trailing events are lost when a small bufsize splits the final tag across reads.

Reproducer

import io
from xml.dom.pulldom import parse

events = list(parse(io.BytesIO(b'<a></a>'), bufsize=2))
print([e for e, _ in events])
# ['START_DOCUMENT', 'START_ELEMENT'] — END_ELEMENT is missing

Iterating the DOMEventStream (via list()) internally calls getEvent(), which is where the bug lives.
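
Possible workaround

A minimal sketch of how the dropped events could be recovered from user code, assuming the handler's pending events are still reachable through the private pulldom.firstEvent linked list after iteration stops. The helper name iter_events_including_close is hypothetical and not part of xml.dom.pulldom; it relies on implementation details that may change, and it is not the proposed fix.

```python
import io
from xml.dom.pulldom import parse


def iter_events_including_close(event_stream):
    """Yield all events from a DOMEventStream, plus any queued by close().

    Hypothetical helper, not part of xml.dom.pulldom. It reads the private
    linked list pulldom.firstEvent, whose nodes have the form
    [event, next_node].
    """
    # Normal iteration; it ends once getEvent() has called parser.close()
    # and returned None.
    yield from event_stream
    # Events queued during parser.close() are assumed to still be linked
    # from firstEvent[1]; walk that list and yield whatever is left.
    node = event_stream.pulldom.firstEvent[1]
    while node:
        yield node[0]
        node = node[1]


events = list(iter_events_including_close(
    parse(io.BytesIO(b'<a></a>'), bufsize=2)))
names = [name for name, _ in events]
print(names)
```

On an interpreter where getEvent() already returns the trailing events itself, the pending list should simply be empty and the helper adds nothing. Because close() also signals the end of the document to the handler, the recovered events may include END_DOCUMENT in addition to END_ELEMENT.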

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Labels

stdlib (Standard Library Python modules in the Lib/ directory), topic-XML, type-bug (An unexpected behavior, bug, or error)