fix: Support concat_ws with literal NULL separator #3542

Merged
andygrove merged 4 commits into apache:main from 0lai0:native_engine_crashes_on_concat_ws on Feb 18, 2026

Conversation

@0lai0
Contributor

0lai0 commented Feb 17, 2026

Which issue does this PR close?

Closes #3339

Rationale for this change

When Spark's ConstantFolding optimizer rule is disabled, concat_ws with a literal NULL separator is pushed down to the native engine and currently causes a crash with:

Expected string literal, got None.

Spark's semantics for concat_ws are that a NULL separator should yield a NULL result. We should either match Spark's behavior or fall back gracefully instead of crashing the native engine.
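
For context, a minimal illustration of the expected behavior (illustrative only; it assumes nothing beyond a local SparkSession and is not part of this PR's code):

  import org.apache.spark.sql.SparkSession

  // Illustrative only: per Spark's concat_ws semantics, a literal NULL
  // separator yields a NULL result rather than crashing the native engine.
  val spark = SparkSession.builder().appName("concat-ws-demo").master("local[1]").getOrCreate()
  spark.sql("SELECT concat_ws(NULL, 'a', 'b')").show()
  // The single result row is NULL, per the semantics described above.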

What changes are included in this PR?

  • Add a dedicated CometConcatWs serde handler that rewrites concat_ws with a literal NULL separator to a NULL literal, matching Spark semantics (a minimal sketch of this rewrite follows the list below).
  • Update QueryPlanSerde to map ConcatWs to CometConcatWs instead of the generic CometScalarFunction("concat_ws").
  • Re-enable the all-literal concat_ws query in spark/src/test/resources/sql-tests/expressions/string/concat_ws.sql that was previously ignored due to this bug.
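
A minimal sketch of the separator rewrite on Spark's ConcatWs expression tree; the method name and the Option return convention are assumptions for illustration, not the actual CometConcatWs serde API:

  import org.apache.spark.sql.catalyst.expressions.{ConcatWs, Expression, Literal}
  import org.apache.spark.sql.types.StringType

  // Sketch only: rewrite concat_ws with a literal NULL separator to a
  // NULL string literal, matching Spark semantics. The real handler in
  // this PR operates on the serde protobuf types instead.
  def rewriteNullSeparator(expr: ConcatWs): Option[Expression] =
    expr.children.headOption match {
      // The first child of ConcatWs is the separator.
      case Some(Literal(null, _)) =>
        Some(Literal(null, StringType)) // NULL separator => NULL result
      case _ =>
        None // not a literal NULL separator; use the normal concat_ws path
    }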

How are these changes tested?

./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite concat_ws" -Dtest=none

@andygrove
Member

andygrove left a comment

LGTM. Thanks @0lai0!

@coderfender
Contributor

Test failures:

- sql-file: expressions/string/concat_ws.sql [parquet.enable.dictionary=false] *** FAILED *** (719 milliseconds)
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 768.0 failed 1 times, most recent failure: Lost task 0.0 in stage 768.0 (TID 3188) (localhost executor driver): org.apache.comet.CometNativeException: Expected string literal, got None.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: 

@coderfender
Contributor

@0lai0, @andygrove: we might want to hold off on merging this PR. There is a test failure, and I am not sure we have covered all possible Literal conditions in our case statement. Steps to reproduce the SQL failure:

  test("concat_ws test - no constant folding") {
    withSQLConf(
      "spark.sql.optimizer.excludedRules" ->
        "org.apache.spark.sql.catalyst.optimizer.ConstantFolding") {
      withParquetTable(Seq(1, 2).map(Tuple1(_)), "t") {
        val df = sql("SELECT concat_ws(',', NULL, 'b', 'c'), concat_ws(NULL, 'a', 'b') FROM t")
        df.explain(true)
        checkSparkAnswerAndOperator(df)
      }
    }
  }

Error (with plan)

== Parsed Logical Plan ==
'Project [unresolvedalias('concat_ws(,, null, b, c), None), unresolvedalias('concat_ws(null, a, b), None)]
+- 'UnresolvedRelation [t], [], false

== Analyzed Logical Plan ==
concat_ws(,, NULL, b, c): string, concat_ws(NULL, a, b): string
Project [concat_ws(,, cast(null as array<string>), b, c) AS concat_ws(,, NULL, b, c)#5, concat_ws(cast(null as string), a, b) AS concat_ws(NULL, a, b)#6]
+- SubqueryAlias t
 +- View (`t`, [_1#3])
    +- Relation [_1#3] parquet

== Optimized Logical Plan ==
Project [concat_ws(,, null, b, c) AS concat_ws(,, NULL, b, c)#5, concat_ws(null, a, b) AS concat_ws(NULL, a, b)#6]
+- Relation [_1#3] parquet

== Physical Plan ==
*(1) CometColumnarToRow
+- CometProject [concat_ws(,, NULL, b, c)#5, concat_ws(NULL, a, b)#6], [concat_ws(,, null, b, c) AS concat_ws(,, NULL, b, c)#5, concat_ws(null, a, b) AS concat_ws(NULL, a, b)#6]
 +- CometScan [native_iceberg_compat] parquet [] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/private/var/folders/k0/t16s7rgj6gl2x008c266k4vm0000gn/T/spark-53..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>



Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 5) (172.16.2.87 executor driver): org.apache.comet.CometNativeException: Expected string literal, got None.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues
  at org.apache.comet.Native.executePlan(Native Method)
  at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:150)
  at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:149)
  at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
  at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:149)
  at org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
  at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:147)
  at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:203)
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.cometcolumnartorow_nextBatch_0$(Unknown Source)
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
  at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
  at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
  at org.apache.spark.util.Iterators$.size(Iterators.scala:29)
  at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1953)
  at org.apache.spark.rdd.RDD.$anonfun$count$1(RDD.scala:1269)
  at org.apache.spark.rdd.RDD.$anonfun$count$1$adapted(RDD.scala:1269)
  at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2303)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
  at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
  at org.apache.spark.scheduler.Task.run(Task.scala:139)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  at java.base/java.lang.Thread.run(Thread.java:840)

@coderfender
Contributor

We have two options here:

  1. Merge this PR as-is to handle the NULL separator (but continue to ignore the failed test), or
  2. Also handle the case where all inputs are literals.

0lai0 and others added 2 commits February 18, 2026 17:26
@0lai0
Contributor Author

0lai0 commented Feb 18, 2026

Thanks @andygrove and @coderfender for the review. I think we need to open a follow-up issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andygrove
Member

Thanks for the fix, @0lai0! The null separator handling is clean.

I pushed a small addition to handle the case where all concat_ws arguments are foldable (literals). Without this, queries like SELECT concat_ws(',', 'hello', 'world') would still crash the native engine when ConstantFolding is disabled, since the native concat_ws doesn't support all-scalar inputs. This was the query that was ignored in the test file referencing #3339.

The new case in CometConcatWs.convert checks expr.children.forall(_.foldable) and returns None to fall back to Spark. I also updated the test from query ignore(...) to query spark_answer_only so it actually runs and verifies correctness.

Note that the null separator check still comes first, so concat_ws(NULL, 'a', 'b') is handled by the existing null-literal short-circuit before the foldable check is reached.
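
A rough sketch of how the two checks compose, in the order just described; the decision type and names below are illustrative stand-ins rather than the actual CometConcatWs.convert signature:

  import org.apache.spark.sql.catalyst.expressions.{ConcatWs, Expression, Literal}
  import org.apache.spark.sql.types.StringType

  // Illustrative outcomes, standing in for the real protobuf conversion result.
  sealed trait ConcatWsDecision
  case class RewriteTo(expr: Expression) extends ConcatWsDecision // serialize the rewritten expression
  case object FallBackToSpark extends ConcatWsDecision            // return None so Spark evaluates it
  case object ConvertNatively extends ConcatWsDecision            // usual native concat_ws conversion

  def decide(expr: ConcatWs): ConcatWsDecision = expr.children.headOption match {
    // 1. The literal NULL separator short-circuit runs first:
    //    concat_ws(NULL, 'a', 'b') becomes a NULL string literal.
    case Some(Literal(null, _)) =>
      RewriteTo(Literal(null, StringType))
    // 2. All arguments foldable (pure literals): native concat_ws does not
    //    support all-scalar inputs, so fall back to Spark.
    case _ if expr.children.forall(_.foldable) =>
      FallBackToSpark
    // 3. Otherwise, the normal native concat_ws conversion applies.
    case _ =>
      ConvertNatively
  }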

@andygrove
Member

@coderfender ok to merge this one now?

@coderfender
Contributor

Yup! Thank you for fixing the tests @0lai0 @andygrove

andygrove merged commit 732b9fe into apache:main on Feb 18, 2026
203 of 206 checks passed
@0lai0
Contributor Author

0lai0 commented Feb 19, 2026

Thanks @andygrove and @coderfender !


Development

Successfully merging this pull request may close these issues.

Native engine crashes on concat_ws with literal NULL separator

3 participants