Fix error on show() with an explain plan #1492
Pull request overview
Fixes a .show() failure when the underlying DataFrame is an EXPLAIN / EXPLAIN ANALYZE plan by avoiding adding a LIMIT node that violates DataFusion’s requirement that Explain/Analyze be the root plan (closes #1490).
Changes:
- Update PyDataFrame.show() to skip adding LIMIT for LogicalPlan::Explain and LogicalPlan::Analyze.
- Add a Python regression test ensuring .show() works on EXPLAIN and EXPLAIN ANALYZE SQL results.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/tests/test_dataframe.py | Adds regression coverage for calling .show() on EXPLAIN / EXPLAIN ANALYZE query results. |
| crates/core/src/dataframe.rs | Adjusts show() logic to avoid wrapping explain/analyze logical plans with a LIMIT. |

    let mut df = self.df.as_ref().clone();
    df = match self.df.logical_plan() {
        LogicalPlan::Explain(_) | LogicalPlan::Analyze(_) => {
            // Explain and Analyzer require they are at the top
            // of the plan, so do not add a limit.
            df
        }
        _ => df.limit(0, Some(num))?,

DataFrame::logical_plan() in this codebase appears to return a reference (e.g., self.df.logical_plan().clone() in crates/core/src/table.rs), so match self.df.logical_plan() here is matching on &LogicalPlan. As written, the patterns LogicalPlan::Explain(_) | LogicalPlan::Analyze(_) will not match a reference and are likely to fail to compile. Adjust the match to destructure the reference (or explicitly dereference/clone the plan before matching).
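For reference on the concern above, here is a minimal sketch of matching on a borrowed enum. It uses a stand-in enum (`DemoPlan` and both function names are hypothetical, not DataFusion's `LogicalPlan` or any code from this PR). Since default binding modes were stabilized in Rust 1.26, patterns written without `&` should apply to a value of reference type, so the implicit form and the explicit-dereference form below are expected to compile and behave the same way.

```rust
// Stand-in enum for illustration only; NOT DataFusion's `LogicalPlan`.
pub enum DemoPlan {
    Explain(String),
    Analyze(String),
    Projection(String),
}

/// Matches directly on `&DemoPlan`. Default binding modes let the
/// non-reference patterns apply through the reference.
pub fn skips_limit(plan: &DemoPlan) -> bool {
    match plan {
        DemoPlan::Explain(_) | DemoPlan::Analyze(_) => true,
        _ => false,
    }
}

/// Same check with an explicit dereference. The `_` patterns bind
/// nothing, so no value is moved out of the borrow.
pub fn skips_limit_explicit(plan: &DemoPlan) -> bool {
    match *plan {
        DemoPlan::Explain(_) | DemoPlan::Analyze(_) => true,
        _ => false,
    }
}
```

Under that assumption, the match in the excerpt would not need changes to compile, though an explicit dereference remains an option if it is judged clearer.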

     /// Print the result, 20 lines by default
     #[pyo3(signature = (num=20))]
     fn show(&self, py: Python, num: usize) -> PyDataFusionResult<()> {
    -    let df = self.df.as_ref().clone().limit(0, Some(num))?;
    +    let mut df = self.df.as_ref().clone();
    +    df = match self.df.logical_plan() {
    +        LogicalPlan::Explain(_) | LogicalPlan::Analyze(_) => {
    +            // Explain and Analyzer require they are at the top
    +            // of the plan, so do not add a limit.
    +            df
    +        }
    +        _ => df.limit(0, Some(num))?,
    +    };
         print_dataframe(py, df)
     }

For EXPLAIN / ANALYZE plans, show(num=...) no longer applies any row limiting, so the num parameter is effectively ignored for these cases. If you still want to respect num without changing the logical plan root, consider limiting at the printing/collection layer (e.g., truncating collected batches / taking first N rows from the stream) rather than adding a LIMIT node.
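One way to picture limiting at the collection layer is a helper that keeps only the first N rows of already-collected batches. The sketch below is an assumption-laden illustration: `take_first_rows` is a hypothetical name, and a plain `Vec<T>` stands in for a record batch, so this is not the project's code or Arrow's API.

```rust
/// Keeps at most `num` rows in total across a sequence of batches.
/// `Vec<T>` is a stand-in for a record batch (hypothetical, for illustration).
pub fn take_first_rows<T>(batches: Vec<Vec<T>>, num: usize) -> Vec<Vec<T>> {
    let mut remaining = num;
    let mut out = Vec::new();
    for mut batch in batches {
        if remaining == 0 {
            // Row budget is used up; ignore any later batches.
            break;
        }
        // Cut the batch down to the rows that still fit in the budget.
        batch.truncate(remaining);
        remaining -= batch.len();
        if !batch.is_empty() {
            out.push(batch);
        }
    }
    out
}
```

With real Arrow data, `RecordBatch::slice` could presumably play the role of `truncate` here. Whether the extra step is worthwhile for explain output, which is typically short, is a judgment call for the PR author.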
Which issue does this PR close?
Closes #1490
Rationale for this change
.show() is adding a limit on the dataframe, but this is not allowed for explain and analyze plans.
What changes are included in this PR?
If we have an explain or analyze plan, do not add limit during show.
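The effect of this change on the plan tree can be sketched with a toy model. Everything below is a stand-in: `ToyPlan`, `is_explain_root`, and `prepare_for_show` are hypothetical names used for illustration and are not DataFusion types or functions. The point is only that wrapping a plan in a limit node makes the limit the new root, which is why explain-style plans are passed through untouched.

```rust
// Toy plan tree for illustration only; NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyPlan {
    Scan(String),
    Explain(Box<ToyPlan>),
    Analyze(Box<ToyPlan>),
    Limit { skip: usize, fetch: Option<usize>, input: Box<ToyPlan> },
}

/// Returns true when the root node is an explain-style node.
pub fn is_explain_root(plan: &ToyPlan) -> bool {
    matches!(plan, ToyPlan::Explain(_) | ToyPlan::Analyze(_))
}

/// Hypothetical helper mirroring the decision described above: ordinary
/// plans are wrapped in a limit, explain-style plans are returned as-is
/// so that they remain the root of the tree.
pub fn prepare_for_show(plan: ToyPlan, num: usize) -> ToyPlan {
    if is_explain_root(&plan) {
        plan
    } else {
        ToyPlan::Limit { skip: 0, fetch: Some(num), input: Box::new(plan) }
    }
}
```

In this model, calling `prepare_for_show` on an explain plan leaves `is_explain_root` true for the result, whereas unconditional wrapping would put the limit node at the root instead.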
Are there any user-facing changes?
None