@letian-jiang

DRILL-8542: Support Paimon format plugin

Description

Introduce a Paimon format plugin, enabling Drill users to query Paimon tables directly via filesystem paths.

  • support reading data from Paimon tables in Parquet and ORC formats
  • support projection/filter/limit pushdown
  • support snapshot read via table() with snapshotId or snapshotAsOfTime
  • support metadata tables: #snapshots, #schemas, #files, #manifests

Usage examples:

  • SELECT * FROM dfs.`/path/to/paimon_table`
  • SELECT * FROM table(dfs.`/path/to/paimon_table`, snapshotId => 123)
  • SELECT * FROM table(dfs.`/path/to/paimon_table`, snapshotAsOfTime => 1700000000000)
  • SELECT * FROM dfs.`/path/to/paimon_table#snapshots`

Reference:

Signed-off-by: Letian Jiang <[email protected]>
@cgivre self-requested a review January 27, 2026 15:32
@cgivre added the labels enhancement (PRs that add a new functionality to Drill), new-format (New Format Plugin), and doc-impacting (PRs that affect the documentation) Jan 27, 2026
Signed-off-by: Letian Jiang <[email protected]>
Contributor

@cgivre left a comment

@letian-jiang
Thank you very much for this submission. I did a quick first pass and overall it looks good.

General comments:

  1. You're overriding a lot of methods that I don't think actually need to be overridden. If the methods are not doing anything, I would suggest removing them.
  2. Again, thank you for providing robust unit tests. However, I made some refactoring suggestions for this class.
  3. Please avoid the x ? y : z notation for complex logic.
  4. For non-obvious functions, please add some documentation. We don't need docstrings for getters/setters and other obvious code, but most people working on Drill are likely unfamiliar with Paimon, so any help you could give in the code is greatly appreciated.
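To illustrate point 3, here is a minimal sketch of the kind of rewrite being asked for. The class and method names (`TernaryRefactor`, `describeNested`, `describe`) are hypothetical and do not come from the PR; only the parameter names `snapshotId` and `snapshotAsOfTime` echo the options in the description. Both methods return the same result, but the second spells out each branch.

```java
public class TernaryRefactor {

  // Hypothetical example, not code from the PR.
  // Nested ternary: compact, but the precedence of the branches is easy to misread.
  public static String describeNested(Long snapshotId, Long snapshotAsOfTime) {
    return snapshotId != null ? "snapshotId=" + snapshotId
        : snapshotAsOfTime != null ? "snapshotAsOfTime=" + snapshotAsOfTime : "latest";
  }

  // Same logic with explicit branches and early returns.
  public static String describe(Long snapshotId, Long snapshotAsOfTime) {
    if (snapshotId != null) {
      return "snapshotId=" + snapshotId;
    }
    if (snapshotAsOfTime != null) {
      return "snapshotAsOfTime=" + snapshotAsOfTime;
    }
    return "latest";
  }
}
```

A simple one-level ternary such as `x != null ? x : defaultValue` is usually still fine; the concern is nesting and side effects.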

}
return null;
}
}
Contributor

Nit: Please add a blank line at the end of classes, here and elsewhere.

@cgivre changed the title from "DRILL-8542: Support Paimon format plugin" to "DRILL-8542: Support Paimon Format Plugin" Jan 27, 2026
Signed-off-by: Letian Jiang <[email protected]>
@letian-jiang
Author

I believe I have addressed all the review comments. Could you please take another look? @cgivre

}
}

}
Contributor

Please add a new line at the end of all classes, here and elsewhere.

import org.apache.drill.common.exceptions.UserException;
import org.apache.drill.common.expression.LogicalExpression;
import org.apache.drill.common.expression.SchemaPath;
import org.apache.drill.exec.physical.impl.scan.framework.ManagedReader;
Contributor

Note: This is using the older version of the ManagedReader class. Please use org.apache.drill.exec.physical.impl.scan.v3.ManagedReader. This will require refactoring this class a bit.
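As a rough illustration of the refactoring this implies, the sketch below contrasts the two reader lifecycles using local stand-in interfaces. It assumes, from memory of the Drill scan framework, that the older reader has a separate `open(...)` step while the v3 reader exposes only `next()` and `close()` and does its setup in the constructor. `LegacyStyleReader`, `V3StyleReader`, and `RowCountingReader` are hypothetical names, not Drill types, and a plain `List<String>` stands in for the real schema negotiator and row batches.

```java
import java.util.Iterator;
import java.util.List;

public class ReaderLifecycleSketch {

  // Stand-in for the older style (assumption): setup happens in a separate open() call.
  public interface LegacyStyleReader {
    boolean open(List<String> source);
    boolean next();
    void close();
  }

  // Stand-in for the v3 style (assumption): no open(); setup moves into the constructor.
  public interface V3StyleReader {
    boolean next();
    void close();
  }

  // Hypothetical reader written against the v3-style stand-in.
  public static final class RowCountingReader implements V3StyleReader {
    private final Iterator<String> rows;
    private int rowsRead;
    private boolean closed;

    // What used to live in open() is done here instead.
    public RowCountingReader(List<String> source) {
      this.rows = source.iterator();
    }

    // Simplification: consumes one row per call and returns false once the source is exhausted.
    @Override
    public boolean next() {
      if (!rows.hasNext()) {
        return false;
      }
      rows.next();
      rowsRead++;
      return true;
    }

    @Override
    public void close() {
      closed = true;
    }

    public int rowsRead() {
      return rowsRead;
    }

    public boolean isClosed() {
      return closed;
    }
  }
}
```

The practical consequence is that any state initialised in the old `open` method has to be set up when the reader is constructed, which is why the class needs some restructuring rather than a changed import alone.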

}

@Test
public void testSelectManifestsMetadata() throws Exception {
Contributor

Looking good. Can you please add a SerDe test? If this test fails, you won't be able to update the config. Take a look here as an example:

public void testSerDe() throws Exception {
  String sql = "SELECT COUNT(*) as cnt FROM table(cp.`excel/test_data.xlsx` (type=> 'excel', sheetName => 'inconsistentData', allTextMode => true))";
  // Serialize the query to a JSON physical plan, then run that plan,
  // so the format config has to survive a serialize/deserialize round trip.
  String plan = queryBuilder().sql(sql).explainJson();
  long cnt = queryBuilder().physical(plan).singletonLong();
  assertEquals("Counts should match", 4L, cnt);
}
