[SPARK-53454][SQL] Handle AlwaysTrue/AlwaysFalse in V2ExpressionSQLBuilder#56085
[SPARK-53454][SQL] Handle AlwaysTrue/AlwaysFalse in V2ExpressionSQLBuilder#56085shrirangmhalgi wants to merge 3 commits into
Conversation
…ilder AlwaysTrue and AlwaysFalse implement Literal<Boolean>, so they matched the Literal branch in V2ExpressionSQLBuilder.build() and produced bare "TRUE"/"FALSE" strings. Some databases (Oracle, DB2) do not support these as boolean expressions in WHERE clauses. Add explicit handling at the top of build() to produce "1 = 1" and "1 = 0" respectively, which is valid ANSI SQL across all dialects. This is a fresh, simplified approach per cloud-fan review feedback on the original PR apache#52198 (auto-closed due to inactivity). Closes apache#52198
|
@cloud-fan / @pan3793 Could you please review the following PR which is a simplified followup to #52198 - uses |
pan3793
left a comment
There was a problem hiding this comment.
The approach LGTM - WHERE 1=1 / WHERE 1=0 is universally supported across all mainstream RDBMS and cloud OLAP engines.
cloud-fan
left a comment
There was a problem hiding this comment.
LGTM. The fix is small, focused, and correct — 1 = 1/1 = 0 is universally portable and complements the existing inputToSQLNoBool/predicateToIntSQL machinery in JDBCSQLBuilder (which handles scalar-value substitution for boolean-unsupported dialects in non-predicate positions).
One follow-up worth a separate JIRA: visitIn's empty-list case still emits bare FALSE:
return "CASE WHEN " + v + " IS NULL THEN NULL ELSE FALSE END";This is the same class of bug for Oracle/DB2, but it can't be fixed the same way — 1 = 0 is a condition, not a value, so it can't be dropped into a CASE THEN/ELSE on dialects without boolean support. Worth tracking separately so the broader issue isn't lost.
Re-reviewing — see follow-up comment about where the fix should live.
cloud-fan
left a comment
There was a problem hiding this comment.
Sorry for the churn — I approved too quickly and want to revisit one point.
V2ExpressionSQLBuilder has two consumers:
JDBCSQLBuilder— where the Oracle/DB2 bare-TRUE/FALSEproblem actually lives.ToStringSQLBuilder— used byExpressionWithToString.toString()for human-readable display (EXPLAIN, thePushedFiltersinfo string, logs, error messages).
Only #1 needs 1 = 1/1 = 0. The current placement in the base class collaterally degrades #2 — the JDBCV2Suite.df9 golden-string update is exactly that path (PushedFilters info, not raw JDBC SQL): the displayed CASE WHEN ... THEN TRUE WHEN ... THEN FALSE ELSE ... END becomes CASE WHEN ... THEN 1 = 1 WHEN ... THEN 1 = 0 ELSE ... END, which is less readable for users inspecting plans. Generally I think we should leave the human-readable rendering alone, and only emit 1 = 1/1 = 0 when we're generating SQL to send to a JDBC endpoint.
Could you move the handling into JDBCSQLBuilder instead? E.g. override build there to short-circuit on AlwaysTrue/AlwaysFalse and super.build for everything else — mirroring the pattern MsSqlServerSQLBuilder already uses. That way the JDBC SQL gets the portable form and ToStringSQLBuilder keeps emitting TRUE/FALSE, so the df9 golden-string assertion goes back to the original and you can drop the JDBCV2Suite change entirely.
Unrelated follow-up worth a separate JIRA: V2ExpressionSQLBuilder.visitIn's empty-list case still emits bare FALSE:
return "CASE WHEN " + v + " IS NULL THEN NULL ELSE FALSE END";Same class of bug, but it can't be fixed the same way — 1 = 0 is a condition, not a value, so it can't be dropped into a CASE THEN/ELSE on dialects without boolean support. The whole CASE structure would need restructuring.
Move AlwaysTrue/AlwaysFalse handling from V2ExpressionSQLBuilder (base) to JDBCSQLBuilder so only the JDBC wire path gets 1=1/1=0. The display path (ToStringSQLBuilder) keeps emitting TRUE/FALSE for readability.
|
Thanks for the review @cloud-fan. Moved the fix to |
What changes were proposed in this pull request?
Handle
AlwaysTrueandAlwaysFalsepredicates inV2ExpressionSQLBuilder.build()by producing"1 = 1"and"1 = 0"respectively.AlwaysTrue/AlwaysFalseimplementLiteral<Boolean>, so they previously matched theLiteralbranch and produced bare"TRUE"/"FALSE"strings viavisitLiteral(). Some databases (Oracle, DB2) do not support these as boolean expressions in WHERE clauses.Why are the changes needed?
When AQE simplifies predicates to
AlwaysTrue/AlwaysFalseand they are pushed down to JDBC, the generated SQL contains bareTRUE/FALSEwhich fails on databases that don't support boolean literals.This is a fresh, simplified approach per reviewer feedback on the original PR #52198 (auto-closed due to inactivity) - using
1=1/1=0universally without a dialect-specific API.Does this PR introduce any user-facing change?
Yes. JDBC pushed filter SQL now uses
1 = 1/1 = 0instead ofTRUE/FALSEforAlwaysTrue/AlwaysFalsepredicates. This fixes queries that previously failed on Oracle/DB2.How was this patch tested?
V2PredicateSuiteJDBCV2Suite(data assertions unchanged)Was this patch authored or co-authored using generative AI tooling?
Yes.