gh-145259: Fix xml.dom.pulldom dropping events produced by parser.close()#145262

Open
zetzschest wants to merge 3 commits into python:main from zetzschest:fix/xml_dom_pulldom_dropped_events

@zetzschest zetzschest commented Feb 26, 2026

Issue

When the input stream is exhausted, getEvent() calls self.parser.close() and immediately returns None without checking whether close() generated new events. SAX events emitted during close() (e.g. END_ELEMENT for the final tag) are silently lost.

Reproducer

import io
from xml.dom.pulldom import parse
events = list(parse(io.BytesIO(b'<a></a>'), bufsize=2))
print([e for e, _ in events])  # ['START_DOCUMENT', 'START_ELEMENT'] — END_ELEMENT missing

Fix

In Lib/xml/dom/pulldom.py, after calling self.parser.close(), check if new events were produced before returning None:

self.parser.close()
if self.pulldom.firstEvent[1]:
    break
return None
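
The control flow can be illustrated with a toy model that does not touch pulldom itself. This is only a sketch under stated assumptions: `LazyParser` and `pull_events` are hypothetical names, and the parser is assumed to hold back its last event until `close()`, which stands in for the behavior described above.

```python
from collections import deque


class LazyParser:
    """Hypothetical parser that holds back its latest event until close()."""

    def __init__(self, queue):
        self.queue = queue
        self.pending = None

    def feed(self, chunk):
        # Emit the previously held event, then hold the new one back.
        if self.pending is not None:
            self.queue.append(self.pending)
        self.pending = chunk

    def close(self):
        # Flush whatever is still held back.
        if self.pending is not None:
            self.queue.append(self.pending)
            self.pending = None


def pull_events(chunks, check_after_close):
    """Drain events the way a getEvent()-style loop would (simplified)."""
    queue = deque()
    parser = LazyParser(queue)
    source = iter(chunks)
    delivered = []
    while True:
        while not queue:
            chunk = next(source, None)
            if chunk is None:
                parser.close()
                # The proposed fix: look at the queue before giving up.
                if check_after_close and queue:
                    break
                return delivered
            parser.feed(chunk)
        delivered.append(queue.popleft())


print(pull_events(["a", "b", "c"], check_after_close=False))  # ['a', 'b']
print(pull_events(["a", "b", "c"], check_after_close=True))   # ['a', 'b', 'c']
```

Without the check, the event flushed by `close()` never reaches the caller. With it, the loop delivers that event first and only reports exhaustion on the next pass.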

Test updates

Two existing tests in test_pulldom needed adjustments since they were based on the previous behavior:

  • test_end_document was marked @expectedFailure; it now passes because the END_DOCUMENT event is correctly delivered.
  • test_expandItem expected StopIteration immediately after </html>; it now consumes the additional events (such as END_DOCUMENT) that are properly emitted.
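
The expectation behind these test changes can be expressed as a small check that every start event has a matching end event. The helpers below are a hypothetical sketch, not the code in test_pulldom; `event_names` and `is_complete` are names introduced here for illustration only.

```python
import io
from xml.dom import pulldom


def event_names(xml_bytes, bufsize=None):
    """Return the pulldom event names for a document, in delivery order."""
    stream = pulldom.parse(io.BytesIO(xml_bytes), bufsize=bufsize)
    return [event for event, _node in stream]


def is_complete(names):
    """True if START_* and END_* events are balanced for elements and document."""
    return (
        names.count(pulldom.START_ELEMENT) == names.count(pulldom.END_ELEMENT)
        and names.count(pulldom.START_DOCUMENT) == names.count(pulldom.END_DOCUMENT)
    )


# What is printed depends on whether the interpreter includes the fix.
print(is_complete(event_names(b"<a></a>", bufsize=2)))
```

Going by the reproducer above, an interpreter without the fix would be expected to report an incomplete stream here, since the trailing events are dropped.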

python-cla-bot bot commented Feb 26, 2026

All commit authors signed the Contributor License Agreement.

bedevere-app bot commented Feb 26, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

getEvent() returned None immediately after parser.close() without
checking if close() generated new SAX events, silently dropping
trailing events like END_ELEMENT.
@zetzschest zetzschest force-pushed the fix/xml_dom_pulldom_dropped_events branch from 3b0cc16 to b0cb14a February 26, 2026 15:34
@zetzschest zetzschest marked this pull request as ready for review February 26, 2026 15:37
- Remove @expectedFailure from test_end_document since END_DOCUMENT
  events are now correctly delivered
- Update test_expandItem to consume remaining events (END_ELEMENT,
  END_DOCUMENT) that are now properly emitted