[lake/hudi][docs] Add Hudi tiering service documentation #3549

Open
fhan688 wants to merge 1 commit into apache:main from fhan688:Introduce-document-for-tiering-service-to-hudi

fhan688 commented Jun 30, 2026

Contributor

Purpose

Linked issue: #3514

This PR adds user-facing documentation for using Fluss Tiering Service with Apache Hudi.

The Hudi lake integration has already been implemented in the fluss-lake-hudi module, but the website did
not yet provide a dedicated guide that explains how to configure Hudi as lakehouse storage, start the tiering
service, create Hudi-backed Fluss tables, and read tiered data.

Brief change log

  • Add a new Hudi integration guide under streaming-lakehouse/integrate-data-lakes/formats.
  • Document Hudi dependencies, version compatibility, DFS/HMS catalog configuration, server-side plugin JARs,
    and Flink tiering service startup.
  • Describe the Fluss-to-Hudi table mapping based on the implementation:
    • primary-key Fluss tables are mapped to Hudi Merge-On-Read tables;
    • log tables are mapped to Hudi Copy-On-Write tables;
    • Hudi record key, bucket index, partition path fields, and Fluss system columns are described as the
      implementation handles them.
  • Document Hudi table properties, union read behavior, direct Hudi reads through the native Hudi catalog, type
    mapping, auto compaction, commit metadata, and current limitations.
  • Update existing documentation/config descriptions to list Hudi as a supported lakehouse format.
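
For reviewers who want the table mapping from the change log at a glance, here is a minimal sketch of the rule as stated above. It is illustrative only and is not code from the fluss-lake-hudi module: the function name `hudi_table_type` and the `primary_keys` parameter are hypothetical, and the only values taken from Hudi itself are its two table type names, `MERGE_ON_READ` and `COPY_ON_WRITE`.

```python
from enum import Enum


class FlussTableKind(Enum):
    """Hypothetical label for the two Fluss table kinds named in the change log."""
    PRIMARY_KEY = "primary_key"
    LOG = "log"


# Mapping as described in the change log:
# primary-key tables -> Merge-On-Read, log tables -> Copy-On-Write.
HUDI_TABLE_TYPE = {
    FlussTableKind.PRIMARY_KEY: "MERGE_ON_READ",
    FlussTableKind.LOG: "COPY_ON_WRITE",
}


def hudi_table_type(primary_keys):
    """Return the Hudi table type for a Fluss table (illustrative helper).

    Assumption: a table with at least one primary-key column is a
    primary-key table; a table with none is a log table.
    """
    kind = FlussTableKind.PRIMARY_KEY if primary_keys else FlussTableKind.LOG
    return HUDI_TABLE_TYPE[kind]
```

Calling `hudi_table_type(["order_id"])` yields the Merge-On-Read type, while `hudi_table_type([])` yields Copy-On-Write. The new guide in hudi.md remains the reference for how record keys, bucket index, and partition path fields are derived.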

Tests

  • git diff --check
  • mvn -q -pl fluss-common -am -DskipTests -DskipITs -Dcheckstyle.skip=true compile

Website build was not run locally because website/node_modules is not installed in the current workspace.

API and Format

No API or storage format changes.

This PR only adds documentation and updates option descriptions to reflect that Hudi is now a supported
lakehouse format.

Documentation

This PR introduces the Hudi Tiering Service integration guide and updates existing lakehouse storage
documentation to include Hudi.

Copilot AI left a comment

Contributor

Pull request overview

Adds a dedicated, user-facing Apache Hudi integration guide to the Fluss website and updates existing configuration/option descriptions to include Hudi as a supported lakehouse (datalake) format.

Changes:

  • Added a comprehensive Hudi Tiering Service guide (dependencies, configuration, tiering job startup, table mapping, reads, and limitations).
  • Updated tiered storage and lakehouse docs to list Hudi among supported formats.
  • Updated configuration/option descriptions (website + ConfigOptions) to reflect Hudi support.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| website/docs/streaming-lakehouse/integrate-data-lakes/formats/hudi.md | New end-to-end documentation for configuring and using Hudi as Fluss lakehouse storage. |
| website/docs/maintenance/tiered-storage/overview.md | Updates supported lakehouse formats list to include Hudi. |
| website/docs/maintenance/tiered-storage/lakehouse-storage.md | Updates lakehouse storage description to include Hudi as supported. |
| website/docs/maintenance/configuration.md | Updates datalake.format documentation to include Hudi. |
| website/docs/engine-flink/options.md | Updates table.datalake.format documentation to include hudi. |
| fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java | Updates option descriptions to include hudi in supported datalake formats. |
