[mypyc] Do not preemptively diff & union op level dataflow sets#20897

Open
VaggelisD wants to merge 8 commits into python:master from VaggelisD:dataflow_cur

Conversation

@VaggelisD
Contributor

Mypyc is currently unable to compile SQLGlot's AST (~950 classes in a single file): it runs out of memory (OOM) even on a 64 GB machine.

Upon investigating, one bottleneck appears to be the following line, which computes the dataflow sets for each op in a basic block (BB). Since most instructions produce empty kill and gen sets, eagerly applying the union and difference creates a new but identical copy of cur for each of them.

I have verified locally that this fix unblocks compilation. Note that mypyc still consumes 6-7 GB of RAM, so I'm still looking for further improvements.
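To illustrate the idea in isolation, here is a minimal sketch, not the actual diff. The function names `step_eager` and `step_lazy` are hypothetical, and the sets hold plain strings rather than mypyc values. It assumes callers never mutate the returned set in place, since the lazy variant shares one set object across ops.

```python
from __future__ import annotations


def step_eager(cur: set[str], gen: set[str], kill: set[str]) -> set[str]:
    # Always allocates a new set, even when gen and kill are both empty.
    return (cur - kill) | gen


def step_lazy(cur: set[str], gen: set[str], kill: set[str]) -> set[str]:
    # Reuse the existing set object when the op changes nothing.
    # Assumption: the result is treated as read-only by all consumers.
    if not gen and not kill:
        return cur
    return (cur - kill) | gen


if __name__ == "__main__":
    cur = {"a", "b"}
    # No-op instruction: the lazy step hands back the same object.
    print(step_lazy(cur, set(), set()) is cur)
    # The eager step returns an equal but distinct copy.
    print(step_eager(cur, set(), set()) is cur)
```

With thousands of ops whose gen and kill sets are empty, the lazy variant keeps a single shared set alive instead of one copy per op.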

@@ -559,8 +559,14 @@ def run_analysis(
ops = list(reversed(ops))
for op in ops:
opgen, opkill = op.accept(gen_and_kill)
Contributor Author

@VaggelisD VaggelisD Feb 25, 2026

I think we could even cache the visitor result per op so that we don't have to call accept() twice, here and in L632, right? I'm not sure it's worth it, though, since the visitor work is not that expensive.
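A rough sketch of what such a cache could look like, under stated assumptions: `make_cached` and `compute` are hypothetical names, ops are assumed to be hashable, and the gen/kill results are assumed to be safe to share between the two passes. This is not how mypyc does it today.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Tuple

GenKill = Tuple[FrozenSet[str], FrozenSet[str]]


def make_cached(compute: Callable[[Hashable], GenKill]) -> Callable[[Hashable], GenKill]:
    # Wrap a gen/kill computation so each op is visited at most once.
    cache: Dict[Hashable, GenKill] = {}

    def lookup(op: Hashable) -> GenKill:
        if op not in cache:
            cache[op] = compute(op)
        return cache[op]

    return lookup


if __name__ == "__main__":
    calls = []

    def compute(op: Hashable) -> GenKill:
        # Stand-in for op.accept(gen_and_kill); records each real visit.
        calls.append(op)
        return frozenset(), frozenset()

    cached = make_cached(compute)
    for op in ["op1", "op2", "op1"]:  # second pass revisits op1
        cached(op)
    print(len(calls))
```

The trade-off is that the cache itself holds one entry per op, so it only pays off if the visitor is more expensive than a dictionary lookup.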

@VaggelisD VaggelisD force-pushed the dataflow_cur branch 2 times, most recently from edd4a71 to feced01 Compare February 25, 2026 15:50
@JukkaL
Collaborator

JukkaL commented Feb 25, 2026

Test failures are unrelated. I hope that I can fix the tests failing on master soon.
