SQL: Auto union Iceberg and Redpanda topic #575
warehouse = 's3://lakehouse-data/',
auth_type = 'oauth2',
oauth2_client_id = '<client-id>',
oauth2_client_secret = '<client-secret>',
Should we tell users to create and reference secrets here, the same way we do for RP Cloud secrets (for example, catalog credentials in the cluster config)? See: https://deploy-preview-575--rp-cloud.netlify.app/redpanda-cloud/manage/iceberg/use-iceberg-catalogs/#use-a-secret-in-cluster-configuration
[source,sql]
----
DROP ICEBERG CATALOG IF EXISTS lakehouse_catalog;
I would add here that users cannot drop an Iceberg catalog while another Kafka catalog is still linked to it.
|Base URL of the Redpanda HTTP Proxy REST API. Required when the catalog includes a `USING CATALOG` clause; Redpanda SQL uses this endpoint to fetch Iceberg translation state for queries that span the topic and its Iceberg table.
|===
// TODO: SME - `connection_timeout` (Int64, ms; maps to rdkafka `timeout_ms`) and `rd_kafka_debug` (STRING; librdkafka `debug` config) are also accepted by the parser at oxla/src/catalog/kafka/conversions.cpp:62-63. Confirm whether either should be documented as user-facing for v1 GA, or kept undocumented as internal/troubleshooting knobs.
// TODO: SME - confirm when REFRESH must be run on the linked Iceberg table. Source shows that if the Iceberg table's schema isn't refreshed, the query fails at planning time with: `Schema not found for Iceberg table '<table>'. Run: REFRESH <catalog>=><table>`. Confirm:
// - Is REFRESH required only the first time, or every time the Iceberg schema changes?
// - Is REFRESH required when new records are added to the Iceberg table (no schema change), or only on schema change?
REFRESH pertains to the schema/shape of the table only. The whole point of the bridge queries is to fetch all data without refreshing anything.
== Query live and historical records together
// TODO: SME - confirm when REFRESH must be run on the linked Iceberg table. Source shows that if the Iceberg table's schema isn't refreshed, the query fails at planning time with: `Schema not found for Iceberg table '<table>'. Run: REFRESH <catalog>=><table>`. Confirm:
// - Is REFRESH required only the first time, or every time the Iceberg schema changes?
Currently, the advice is to run REFRESH after any shape change on either side, because the refreshed schema is the view Oxla holds and uses when preparing the query. However, I believe @Bixkog is working on running REFRESH automatically on all tables when a catalog is created, so the tables will be visible immediately after catalog creation. REFRESH is still needed after any subsequent shape change.
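For the docs page, the guidance in this thread could be illustrated with something like the snippet below. This is only a sketch: the `REFRESH <catalog>=><table>` form is taken from the planner error message quoted in the TODO, `lakehouse_catalog` is the catalog name used in the DROP example in this PR, and the table name `orders` is hypothetical.

```sql
-- Assumption: `orders` is a hypothetical Iceberg table in lakehouse_catalog.
-- Run this after any change to the table's shape (schema).
-- Per the review comments, it is not needed when only new records arrive.
REFRESH lakehouse_catalog=>orders;
```

If the automatic refresh on catalog creation lands, the first manual REFRESH would no longer be needed, but the statement would still apply after later shape changes.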
Description
This pull request adds comprehensive documentation for Redpanda SQL's new support for Iceberg catalogs and bridge queries, enabling users to query both live Redpanda topics and their Iceberg-committed history. It introduces reference and how-to content for creating, altering, and dropping Iceberg catalogs, details the new `USING CATALOG` clause for Redpanda catalogs, and provides a step-by-step guide for querying Iceberg-enabled topics.

New SQL statement documentation:
- `CREATE ICEBERG CATALOG`, `ALTER ICEBERG CATALOG`, and `DROP ICEBERG CATALOG`, including syntax, options (covering authentication and TLS), and usage examples.

Enhancements to Redpanda catalog documentation:
- `USING CATALOG` clause for `CREATE REDPANDA CATALOG`, which links a Redpanda catalog to an Iceberg catalog for bridge queries.
- `pandaproxy_url` option, required when using `USING CATALOG`, with an example of creating a linked catalog.

How-to guide for querying Iceberg-enabled topics:
Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 18 May
Page previews
Redpanda SQL > Query Data > Query Iceberg topics
Reference > Redpanda SQL Reference > Statements
CREATE ICEBERG CATALOG
ALTER ICEBERG CATALOG
DROP ICEBERG CATALOG
CREATE REDPANDA CATALOG > Create catalog linked to Iceberg catalog
Checks