SQL: Auto union Iceberg and Redpanda topic #575

Open
kbatuigas wants to merge 10 commits into rp-sql from DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic

Conversation

@kbatuigas
Contributor

@kbatuigas kbatuigas commented May 4, 2026

Description

This pull request adds comprehensive documentation for Redpanda SQL's new support for Iceberg catalogs and bridge queries, enabling users to query both live Redpanda topics and their Iceberg-committed history. It introduces reference and how-to content for creating, altering, and dropping Iceberg catalogs, details the new USING CATALOG clause for Redpanda catalogs, and provides a step-by-step guide for querying Iceberg-enabled topics.

New SQL statement documentation:

  • Added reference pages for CREATE ICEBERG CATALOG, ALTER ICEBERG CATALOG, and DROP ICEBERG CATALOG, including syntax, options (covering authentication and TLS), and usage examples. [1] [2] [3]
  • Updated navigation to include the new Iceberg catalog statement references.

Enhancements to Redpanda catalog documentation:

  • Documented the new USING CATALOG clause for CREATE REDPANDA CATALOG, which links a Redpanda catalog to an Iceberg catalog for bridge queries.
  • Added the pandaproxy_url option, required when using USING CATALOG, and provided an example of creating a linked catalog. [1] [2]

How-to guide for querying Iceberg-enabled topics:

  • Added a step-by-step guide describing how to set up storage, Iceberg, and Redpanda catalogs, map topics as SQL tables, and run bridge queries that span live and historical data. The guide also explains prerequisites and links to related reference content.

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 18 May

Page previews

Redpanda SQL > Query Data > Query Iceberg topics
Reference > Redpanda SQL Reference > Statements
CREATE ICEBERG CATALOG
ALTER ICEBERG CATALOG
DROP ICEBERG CATALOG
CREATE REDPANDA CATALOG > Create catalog linked to Iceberg catalog

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)

@netlify

netlify Bot commented May 4, 2026

Deploy Preview for rp-cloud ready!

Name Link
🔨 Latest commit 2149a46
🔍 Latest deploy log https://app.netlify.com/projects/rp-cloud/deploys/6a0aa4e149d2a800081084a7
😎 Deploy Preview https://deploy-preview-575--rp-cloud.netlify.app

@coderabbitai
Contributor

coderabbitai Bot commented May 4, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@kbatuigas kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch 2 times, most recently from 03d3421 to 3fd1c5a Compare May 11, 2026 23:06
@kbatuigas kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch from 44f89d0 to 72c11f6 Compare May 13, 2026 18:49
warehouse = 's3://lakehouse-data/',
auth_type = 'oauth2',
oauth2_client_id = '<client-id>',
oauth2_client_secret = '<client-secret>',
Contributor Author

Should we tell users to create and reference secrets here the same way we do for RP Cloud secrets, such as the catalog credentials in the cluster config? For example: https://deploy-preview-575--rp-cloud.netlify.app/redpanda-cloud/manage/iceberg/use-iceberg-catalogs/#use-a-secret-in-cluster-configuration

@kbatuigas kbatuigas requested a review from Greketrotny May 13, 2026 18:53
@kbatuigas kbatuigas marked this pull request as ready for review May 13, 2026 18:53
@kbatuigas kbatuigas requested a review from a team as a code owner May 13, 2026 18:53
@kbatuigas kbatuigas requested a review from mattschumpert May 13, 2026 18:54

[source,sql]
----
DROP ICEBERG CATALOG IF EXISTS lakehouse_catalog;

I would add here that users cannot drop an Iceberg catalog while another Kafka catalog is "linking" to it.
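
As a rough illustration of the restriction described in this comment, the drop order might look like the sketch below. The name `redpanda_catalog` is hypothetical, and `DROP REDPANDA CATALOG` is an assumed statement written by analogy with `DROP ICEBERG CATALOG`; neither the statement nor the failure behavior is confirmed in this thread.

```sql
-- Assumption: redpanda_catalog (hypothetical name) was created with
-- USING CATALOG lakehouse_catalog, so it links to the Iceberg catalog.

-- Per the review comment, this is expected to be rejected while the link exists:
DROP ICEBERG CATALOG lakehouse_catalog;

-- Assumed statement (not confirmed here): remove the linking catalog first.
DROP REDPANDA CATALOG redpanda_catalog;

-- With no linking catalog left, the Iceberg catalog can be dropped.
DROP ICEBERG CATALOG IF EXISTS lakehouse_catalog;
```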

|Base URL of the Redpanda HTTP Proxy REST API. Required when the catalog includes a `USING CATALOG` clause; Redpanda SQL uses this endpoint to fetch Iceberg translation state for queries that span the topic and its Iceberg table.
|===

// TODO: SME — `connection_timeout` (Int64, ms; maps to rdkafka `timeout_ms`) and `rd_kafka_debug` (STRING; librdkafka `debug` config) are also accepted by the parser at oxla/src/catalog/kafka/conversions.cpp:62-63. Confirm whether either should be documented as user-facing for v1 GA, or kept undocumented as internal/troubleshooting knobs.

I wouldn't document that.


// TODO: SME — confirm when REFRESH must be run on the linked Iceberg table. Source shows that if the Iceberg table's schema isn't refreshed, the query fails at planning time with: `Schema not found for Iceberg table '<table>'. Run: REFRESH <catalog>=><table>`. Confirm:
// - Is REFRESH required only the first time, or every time the Iceberg schema changes?
// - Is REFRESH required when new records are added to the Iceberg table (no schema change), or only on schema change?

REFRESH pertains to the schema/shape of the table only. The whole point of the bridge queries is to fetch all data without refreshing anything.

== Query live and historical records together

// TODO: SME — confirm when REFRESH must be run on the linked Iceberg table. Source shows that if the Iceberg table's schema isn't refreshed, the query fails at planning time with: `Schema not found for Iceberg table '<table>'. Run: REFRESH <catalog>=><table>`. Confirm:
// - Is REFRESH required only the first time, or every time the Iceberg schema changes?

Currently, it's advised to run REFRESH after any shape change on either side, because that is the view Oxla holds and uses when preparing the query. However, I believe @Bixkog is working on running REFRESH on all tables automatically when the catalogs are created, so the tables will be visible immediately after creation. REFRESH is still needed after any subsequent shape change, though.
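
A minimal sketch of the guidance in this comment, using the `REFRESH <catalog>=><table>` form quoted in the planner error message. The table name `orders` is hypothetical, and `lakehouse_catalog` is borrowed from the `DROP ICEBERG CATALOG` example in this PR.

```sql
-- Hypothetical table name: orders.
-- Run after the table's shape (schema) changes on either side,
-- so that query planning sees the current schema.
REFRESH lakehouse_catalog=>orders;

-- Per the earlier comment, newly added records alone do not require REFRESH.
```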

2 participants