Fluss strips the `datalake.paimon.` prefix from each configuration key and uses the remaining keys to create the Paimon catalog. Check out the [Paimon documentation](https://paimon.apache.org/docs/$PAIMON_VERSION_SHORT$/maintenance/configurations/) for more details on the available configurations.
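The prefix convention can be illustrated with a small sketch (a hypothetical helper, not Fluss's actual code):

```python
# Hypothetical illustration of the prefix convention described above:
# keys starting with "datalake.paimon." are stripped of that prefix and
# passed through to the Paimon catalog; other keys are ignored here.
PREFIX = "datalake.paimon."

def extract_paimon_options(config: dict) -> dict:
    return {
        key[len(PREFIX):]: value
        for key, value in config.items()
        if key.startswith(PREFIX)
    }

fluss_config = {
    "datalake.format": "paimon",
    "datalake.paimon.metastore": "hive",
    "datalake.paimon.uri": "thrift://metastore:9083",
}
print(extract_paimon_options(fluss_config))
# → {'metastore': 'hive', 'uri': 'thrift://metastore:9083'}
```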
For example, to use the Hive catalog, you can configure it as follows:
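A minimal sketch, assuming Paimon's standard `metastore` and `uri` catalog options (the thrift address is a placeholder):

```yaml
datalake.format: paimon
# Hive catalog options, passed to Paimon with the prefix stripped
datalake.paimon.metastore: hive
datalake.paimon.uri: thrift://<hive-metastore-host>:<port>
```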
Then, you must start the datalake tiering service to tier Fluss's data to the lakehouse:
- Put [fluss-lake-paimon jar](https://repo1.maven.org/maven2/org/apache/fluss/fluss-lake-paimon/$FLUSS_VERSION$/fluss-lake-paimon-$FLUSS_VERSION$.jar) into `${FLINK_HOME}/lib`
- Put [paimon-bundle jar](https://repo.maven.apache.org/maven2/org/apache/paimon/paimon-bundle/$PAIMON_VERSION$/paimon-bundle-$PAIMON_VERSION$.jar) into `${FLINK_HOME}/lib`
- [Download](https://flink.apache.org/downloads/) pre-bundled Hadoop jar `flink-shaded-hadoop-2-uber-*.jar` and put into `${FLINK_HOME}/lib`
- Put Paimon's [filesystem jar](https://paimon.apache.org/docs/$PAIMON_VERSION_SHORT$/project/download/) into `${FLINK_HOME}/lib`. For example, if you use S3 to store Paimon data, put the `paimon-s3` jar into `${FLINK_HOME}/lib`
- Any other jars that Paimon may require; for example, if you use `HiveCatalog`, you will need to add the Hive-related jars
You can add more jars to this `lib` directory based on your requirements:
- **Cloud storage support**: For AWS S3 integration with Paimon, add the corresponding [paimon-s3](https://repo.maven.apache.org/maven2/org/apache/paimon/paimon-s3/$PAIMON_VERSION$/paimon-s3-$PAIMON_VERSION$.jar) jar
- **Other catalog backends**: Add jars needed for alternative Paimon catalog implementations (e.g., Hive, JDBC)
:::
3. Create a `docker-compose.yml` file with the following content:
The Docker Compose environment consists of the following containers:
- **Fluss Cluster:** a Fluss `CoordinatorServer`, a Fluss `TabletServer` and a `ZooKeeper` server.
- **Flink Cluster**: a Flink `JobManager` and a Flink `TaskManager` container to execute queries.
**Note:** The `apache/fluss-quickstart-flink` image is based on [flink:1.20.3-java17](https://hub.docker.com/layers/library/flink/1.20-java17/images/sha256:296c7c23fa40a9a3547771b08fc65e25f06bc4cfd3549eee243c99890778cafc) and includes the [fluss-flink](engine-flink/getting-started.md), [paimon-flink](https://paimon.apache.org/docs/$PAIMON_VERSION_SHORT$/flink/quick-start/) and [flink-connector-faker](https://flink-packages.org/packages/flink-faker) to simplify this guide.
4. To start all containers, run:
```shell
docker compose up -d
```
Congratulations, you are all set!
First, use the following command to enter the Flink SQL CLI Container:
Next, write streaming data into the **datalake-enabled** table, `datalake_enriched_orders`:
```sql title="Flink SQL"
-- switch to streaming mode
SET 'execution.runtime-mode' = 'streaming';
```
```sql title="Flink SQL"
-- insert tuples into datalake_enriched_orders
```

The data for the `datalake_enriched_orders` table is stored in Fluss (for real-time data) and in Paimon (for historical data).
When querying the `datalake_enriched_orders` table, Fluss uses a union operation that combines data from both Fluss and Paimon to provide a complete result set, merging **real-time** and **historical** data.
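Conceptually, the union read can be sketched as overlaying real-time rows on top of the historical snapshot (a simplified model, not Fluss's actual implementation; the table contents are made up):

```python
# Simplified model of a union read for a primary-key table: start from the
# historical rows tiered to the lake, then let real-time rows from Fluss
# override on key collisions, so the newest version of each row wins.
def union_read(lake_rows: dict, realtime_rows: dict) -> dict:
    merged = dict(lake_rows)       # historical data from the lake snapshot
    merged.update(realtime_rows)   # real-time data wins on the same key
    return merged

lake = {1001: "pending", 1002: "pending"}       # tiered to Paimon earlier
realtime = {1002: "shipped", 1003: "pending"}   # still only in Fluss
print(union_read(lake, realtime))
# → {1001: 'pending', 1002: 'shipped', 1003: 'pending'}
```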
If you wish to query only the data stored in Paimon, which offers high-performance access without the overhead of unioning data, you can query the `datalake_enriched_orders$lake` table by appending the `$lake` suffix.
This approach also enables all the optimizations and features of a Flink Paimon table source, including [system tables](https://paimon.apache.org/docs/$PAIMON_VERSION_SHORT$/concepts/system-tables/) such as `datalake_enriched_orders$lake$snapshots`.
To query the snapshots directly from Paimon, use the following SQL:
```sql title="Flink SQL"
-- use tableau result mode
SET 'sql-client.execution.result-mode' = 'tableau';
```
```sql title="Flink SQL"
-- switch to batch mode
SET 'execution.runtime-mode' = 'batch';
```
You can execute the real-time analytics query multiple times, and the results will vary with each run as new data is continuously written to Fluss in real-time.
Finally, you can use the following command to view the files stored in Paimon:
```shell
docker compose exec taskmanager tree /tmp/paimon/fluss.db
```
The files adhere to Paimon's standard format, enabling seamless querying with other engines such as [Spark](https://paimon.apache.org/docs/$PAIMON_VERSION_SHORT$/spark/quick-start/) and [Trino](https://paimon.apache.org/docs/$PAIMON_VERSION_SHORT$/ecosystem/trino/).
</TabItem>
```sql title="Flink SQL"
SET 'sql-client.execution.result-mode' = 'tableau';
SET 'execution.runtime-mode' = 'batch';
```
```sql title="Flink SQL"
-- query snapshots in iceberg
SELECT snapshot_id, operation FROM datalake_enriched_orders$lake$snapshots;
```