[SPARK-56969][SQL] Enhance the SQL config spark.sql.timestampNanosTypes.enabled#56094
MaxGekk wants to merge 5 commits
…cument preview conf Extend TypeUtils.failUnsupportedDataType to reject TimestampNTZNanosType and TimestampLTZNanosType when spark.sql.timestampNanosTypes.enabled is off, add UNSUPPORTED_TIMESTAMP_NANOS_TYPE, align the conf default with Utils.isTesting, and add a short enablement note in sql-ref-datatypes.md.
dongjoon-hyun
left a comment
Do we still need [WIP] in the PR title, @MaxGekk ?
@dongjoon-hyun The PR is ready for review. I removed the |
    "types. Enabling this flag does not guarantee full SQL support: casts, Parquet read, " +
    "typed literals, and other operations may still fail until their respective features " +
    "are implemented.")
    .version("4.2.0")
@HyukjinKwon @cloud-fan @dongjoon-hyun I merged this config to master, will it be released in 4.2.0? Should I merge new features to branch-4.x?
Mirror the SQLConf default (Utils.isTesting) in DefaultSqlApiConf so no-session-bound paths (Spark Connect parsing without a session) match session-bound behavior. Drop the link to "Runtime SQL Configuration" in sql-ref-datatypes.md since the conf is .internal() and is not listed there. Co-authored-by: Isaac
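The reason for mirroring the default can be sketched as follows. This is a minimal standalone model, not Spark code: `Env`, `SessionConf`, and `DefaultConf` are hypothetical stand-ins for `Utils.isTesting`, `SQLConf`, and `DefaultSqlApiConf`. It only illustrates that both lookups agree when they fall back to the same default.

```scala
object ConfDefaultSketch {
  val Key = "spark.sql.timestampNanosTypes.enabled"

  // Hypothetical single source for the default; in the PR this role is played by Utils.isTesting.
  final case class Env(isTesting: Boolean)

  // Session-bound view: an explicit setting wins, otherwise fall back to the shared default.
  final class SessionConf(env: Env, settings: Map[String, String]) {
    def nanosEnabled: Boolean =
      settings.get(Key).map(_.toBoolean).getOrElse(env.isTesting)
  }

  // Session-less view: there are no settings to consult, so it must use the same default.
  final class DefaultConf(env: Env) {
    def nanosEnabled: Boolean = env.isTesting
  }
}
```

If the session-less view hard-coded `false` instead, a type string parsed without a session would be rejected in tests while the same string parsed with a session would be accepted.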
@stevomitric Please have a look at the PR. It is related to your changes in #56041.
stevomitric
left a comment
Latest commit says "Co-authored-by: Isaac" but the description states "Generated-by: Cursor Auto".
Maybe squash the latest commits?
    }

    def unsupportedTimestampNanosTypeError(): Throwable = {
      new AnalysisException(
Why not reuse FEATURE_NOT_ENABLED?
Drop the dedicated UNSUPPORTED_TIMESTAMP_NANOS_TYPE error condition and route the analysis-time gate through the same FEATURE_NOT_ENABLED throw used by the parser path. Extract the throw into DataTypeErrors.timestampNanosTypesNotEnabledError so both gates share one source of truth. Short-circuit on the conf flag before traversing the data type recursively. Co-authored-by: Isaac
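The shape of that change can be sketched in isolation. The snippet below is a simplified model under stated assumptions, not the Spark implementation: the `DataType` hierarchy, `FeatureNotEnabled`, `notEnabledError`, `failIfNanosDisabled`, and `parseGate` are hypothetical names chosen for illustration. It shows one shared error factory used by both gates, and a flag check that runs before any recursive traversal.

```scala
object NanosGateSketch {
  // Hypothetical, simplified type model (not Spark's DataType classes).
  sealed trait DataType
  case object TimestampMicros extends DataType
  final case class TimestampNanos(precision: Int) extends DataType
  final case class ArrayOf(element: DataType) extends DataType
  final case class MapOf(key: DataType, value: DataType) extends DataType
  final case class StructOf(fields: Seq[(String, DataType)]) extends DataType

  val ConfKey = "spark.sql.timestampNanosTypes.enabled"

  final class FeatureNotEnabled(confKey: String)
    extends Exception(s"FEATURE_NOT_ENABLED: set $confKey to true")

  // Single source of truth for the error, shared by the parser gate and the analysis gate.
  def notEnabledError(): Throwable = new FeatureNotEnabled(ConfKey)

  // Recursively look for a nanosecond timestamp anywhere inside a nested type.
  def containsNanos(dt: DataType): Boolean = dt match {
    case _: TimestampNanos => true
    case ArrayOf(e)        => containsNanos(e)
    case MapOf(k, v)       => containsNanos(k) || containsNanos(v)
    case StructOf(fs)      => fs.exists { case (_, t) => containsNanos(t) }
    case _                 => false
  }

  // Analysis-time gate: && short-circuits, so nothing is traversed when the flag is on.
  def failIfNanosDisabled(dt: DataType, enabled: Boolean): Unit =
    if (!enabled && containsNanos(dt)) throw notEnabledError()

  // Parser-time gate: throws the same error through the same factory.
  def parseGate(precision: Int, enabled: Boolean): DataType = {
    if (!enabled) throw notEnabledError()
    TimestampNanos(precision)
  }
}
```

Checking the flag first keeps the common enabled path cheap for deeply nested schemas, and routing both gates through one factory keeps the user-visible message identical regardless of which path rejects the type.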
Waiting for @stevomitric's PR #56112 to be merged as soon as CI passes.
Merging to master/4.x. Thank you, @stevomitric and @dongjoon-hyun, for the review.
…pes.enabled`

### What changes were proposed in this pull request?

This PR completes [SPARK-56969](https://issues.apache.org/jira/browse/SPARK-56969) on top of the parser gating added in [SPARK-56965](https://issues.apache.org/jira/browse/SPARK-56965) / [#56041](#56041).

- Extend `TypeUtils.failUnsupportedDataType` to recursively reject `TimestampNTZNanosType` and `TimestampLTZNanosType` when `spark.sql.timestampNanosTypes.enabled` is off.
- Add `UNSUPPORTED_TIMESTAMP_NANOS_TYPE` with a message naming the conf key.
- Expand the SQLConf doc for `spark.sql.timestampNanosTypes.enabled` and align its default with `Utils.isTesting` (mirroring `spark.sql.timeType.enabled`).
- Add a short enablement note for preview nanos timestamp types in `docs/sql-ref-datatypes.md`.

Part of SPIP [SPARK-56822](https://issues.apache.org/jira/browse/SPARK-56822).

### Why are the changes needed?

Parser and JSON entry points already gate parameterized nanos timestamp types behind `spark.sql.timestampNanosTypes.enabled`, but analyzed schemas/plans could still surface these types through other paths (for example `CREATE TABLE`, connectors, or programmatic schemas) before downstream execution support is ready.

Analysis-time gating closes that gap and keeps behavior consistent with the existing preview flag.

### Does this PR introduce _any_ user-facing change?

Yes.

- When `spark.sql.timestampNanosTypes.enabled` is off, analyzed schemas/plans containing `TIMESTAMP_NTZ(p)` / `TIMESTAMP_LTZ(p)` with `p` in `[7, 9]` now fail with `UNSUPPORTED_TIMESTAMP_NANOS_TYPE`.
- The conf default is now `Utils.isTesting` instead of always `false`, so tests enable the preview by default while production remains off.
- `docs/sql-ref-datatypes.md` documents how to enable the preview feature.

Unparameterized `TIMESTAMP`, `TIMESTAMP_NTZ`, and `TIMESTAMP_LTZ` behavior is unchanged.

Example:

```sql
SET spark.sql.timestampNanosTypes.enabled=false;
CREATE TABLE t (c TIMESTAMP_NTZ(9));
-- UNSUPPORTED_TIMESTAMP_NANOS_TYPE: ... Set "spark.sql.timestampNanosTypes.enabled" to "true" ...
```

### How was this patch tested?

Added/updated unit tests:

- TypeUtilsSuite: default conf behavior, analysis gating on/off, microsecond types unaffected
- DataTypeParserSuite: explicitly disable conf when testing parser rejection
- DataTypeSuite: explicitly disable conf when testing JSON rejection

Existing nanos parser/JSON tests continue to pass with the conf enabled.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor Auto

Closes #56094 from MaxGekk/nanos-conf.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 5477eea)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
What changes were proposed in this pull request?
This PR completes SPARK-56969 on top of the parser gating added in SPARK-56965 / #56041.
- Extend TypeUtils.failUnsupportedDataType to recursively reject TimestampNTZNanosType and TimestampLTZNanosType when spark.sql.timestampNanosTypes.enabled is off.
- Add UNSUPPORTED_TIMESTAMP_NANOS_TYPE with a message naming the conf key.
- Expand the SQLConf doc for spark.sql.timestampNanosTypes.enabled and align its default with Utils.isTesting (mirroring spark.sql.timeType.enabled).
- Add a short enablement note for preview nanos timestamp types in docs/sql-ref-datatypes.md.

Part of SPIP SPARK-56822.
Why are the changes needed?
Parser and JSON entry points already gate parameterized nanos timestamp types behind spark.sql.timestampNanosTypes.enabled, but analyzed schemas/plans could still surface these types through other paths (for example CREATE TABLE, connectors, or programmatic schemas) before downstream execution support is ready.

Analysis-time gating closes that gap and keeps behavior consistent with the existing preview flag.
Does this PR introduce any user-facing change?
Yes.
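- When spark.sql.timestampNanosTypes.enabled is off, analyzed schemas/plans containing TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) with p in [7, 9] now fail with UNSUPPORTED_TIMESTAMP_NANOS_TYPE.
- The conf default is now Utils.isTesting instead of always false, so tests enable the preview by default while production remains off.
- docs/sql-ref-datatypes.md documents how to enable the preview feature.

Unparameterized TIMESTAMP, TIMESTAMP_NTZ, and TIMESTAMP_LTZ behavior is unchanged.

Example: with spark.sql.timestampNanosTypes.enabled=false, the statement CREATE TABLE t (c TIMESTAMP_NTZ(9)); fails with UNSUPPORTED_TIMESTAMP_NANOS_TYPE, and the message asks the user to set "spark.sql.timestampNanosTypes.enabled" to "true".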
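Which type names the flag affects can be illustrated with a small standalone sketch. It is a hypothetical helper, not Spark's parser: `isNanosType` and `rejected` are invented names, and the regex only models the rule stated above, namely that parameterized TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) with p in [7, 9] are gated while everything else is left alone.

```scala
object NanosPrecisionSketch {
  // Matches TIMESTAMP_NTZ(p) or TIMESTAMP_LTZ(p), case-insensitively, capturing p.
  private val Parameterized = """(?i)TIMESTAMP_(?:NTZ|LTZ)\((\d{1,2})\)""".r

  // True only for the parameterized forms whose precision falls in [7, 9].
  def isNanosType(sqlType: String): Boolean = sqlType.trim match {
    case Parameterized(p) =>
      val precision = p.toInt
      precision >= 7 && precision <= 9
    case _ => false
  }

  // A type is rejected only when it is a nanos type and the preview flag is off.
  def rejected(sqlType: String, nanosEnabled: Boolean): Boolean =
    !nanosEnabled && isNanosType(sqlType)
}
```

Under this model, `rejected("TIMESTAMP_NTZ(9)", nanosEnabled = false)` is true, while the unparameterized `TIMESTAMP_NTZ` is never rejected regardless of the flag.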
How was this patch tested?
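Added/updated unit tests:
- TypeUtilsSuite: default conf behavior, analysis gating on/off, microsecond types unaffected
- DataTypeParserSuite: explicitly disable conf when testing parser rejection
- DataTypeSuite: explicitly disable conf when testing JSON rejection
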
Existing nanos parser/JSON tests continue to pass with the conf enabled.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor Auto