Description
Search before asking
- I searched the issues and found no similar issues.
Linkis Component
linkis-engineconn-plugins
What happened
English:
Linkis currently has no native engine plugin for Apache Kylin, an OLAP engine designed for sub-second aggregate queries over very large datasets (up to trillions of rows). Kylin is widely adopted in the financial and telecom industries (ICBC, CCB, CMB, China Mobile, China Telecom) for regulatory reports, customer profiling, and risk analysis on PB-scale data.
Market Demand:
- Massive-Scale Support: Designed for trillion-row data analysis, with sub-second query response through precomputed Cubes
- Financial Industry Standard: Major banks (ICBC, CCB, CMB) rely on Kylin for regulatory reports (EAST, T+1 reports) over tens of TB of data
- Telecom Industry Adoption: China Mobile and China Telecom use Kylin for user behavior analysis with 10TB-100TB+ data volumes
- Apache Top-Level Project: Mature ecosystem with commercial support from Kyligence
- Pre-computation Advantage: Converts complex aggregate queries into simple lookups against precomputed results, which keeps query latency predictable
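To make the pre-computation point concrete, here is a minimal sketch of the idea in plain Python. It is only an illustration under stated assumptions: the rows, the `build_cuboid` and `query` helpers, and the in-memory dictionary are all hypothetical, and Kylin's actual storage and query layers are far more involved.

```python
from collections import defaultdict

# Hypothetical fact rows: (report_date, region_code, transaction_amount).
ROWS = [
    ("2024-12-01", "BJ", 100.0),
    ("2024-12-01", "BJ", 50.0),
    ("2024-12-01", "SH", 70.0),
    ("2024-12-02", "BJ", 30.0),
]


def build_cuboid(rows):
    """Aggregate once, at build time, over the fixed dimensions (date, region)."""
    cuboid = defaultdict(lambda: {"total_amount": 0.0, "transaction_count": 0})
    for report_date, region_code, amount in rows:
        cell = cuboid[(report_date, region_code)]
        cell["total_amount"] += amount
        cell["transaction_count"] += 1
    return dict(cuboid)


def query(cuboid, report_date, region_code):
    """Answer SUM/COUNT for one (date, region) pair with a single lookup."""
    return cuboid.get((report_date, region_code))


CUBOID = build_cuboid(ROWS)
print(query(CUBOID, "2024-12-01", "BJ"))
# {'total_amount': 150.0, 'transaction_count': 2}
```

The cost of scanning the raw rows is paid once during the build; each later query on those dimensions is a key lookup. The trade-off is that only the dimensions chosen at build time can be answered this way.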
Strategic Value:
Kylin complements existing engines by addressing fixed-dimension massive-scale analysis:
- Kylin: PB-scale, sub-second, fixed dimensions → regulatory reports, massive-scale dashboards
- Doris: TB-PB scale, second-level, flexible → real-time multi-dimensional analysis
- ClickHouse: TB-scale, sub-second, flexible → real-time wide-table queries
- Presto: PB-scale, minute-level, extremely flexible → ad-hoc queries
What you expected to happen
English:
Linkis should provide an Apache Kylin engine plugin with the following capabilities:
- Query Support:
  - SQL query interface for Kylin Cubes
  - Support for Kylin 4.x and Kylin 5.x versions
  - MDX query support (optional)
  - Query routing to pre-built Cubes
- Cube Management:
  - Trigger Cube build jobs from Linkis
  - Monitor Cube build progress
  - Cube refresh and incremental build support
  - Cube metadata query and management
- Data Operations:
  - Query pre-built Cubes with standard SQL
  - Support for complex aggregations
  - Drill-down and roll-up operations
  - Time-based filtering and analysis
- Integration with Linkis:
  - Unified task submission interface
  - Resource management for Cube builds
  - Permission control integration
  - Metadata catalog integration
- Performance Features:
  - Query result caching
  - Connection pooling
  - Cube query optimization
  - Pushdown computation to Kylin
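As one illustration of the "query result caching" item, the sketch below shows a small time-to-live cache keyed by project and SQL text. Everything here is hypothetical: `QueryResultCache`, `get_or_run`, and the `run_query` callback are names made up for this example and are not part of Linkis or Kylin. A real plugin would also need to invalidate entries when a Cube is rebuilt.

```python
import time


class QueryResultCache:
    """Hypothetical TTL cache keyed by (project, SQL text)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._entries = {}

    @staticmethod
    def _key(project, sql):
        # Trim surrounding whitespace and one trailing semicolon so that
        # "SELECT 1;" and " SELECT 1 " share an entry. The statement body
        # itself is left untouched.
        return project, sql.strip().rstrip(";").rstrip()

    def get_or_run(self, project, sql, run_query):
        """Return a fresh cached result, or call run_query and cache it."""
        key = self._key(project, sql)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        result = run_query(project, sql)
        self._entries[key] = (now, result)
        return result
```

Keying on the project as well as the SQL text avoids returning one project's result for an identical statement submitted against another project.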
How to reproduce
English:
Current situation:
- Users must access Kylin manually through its REST API or web UI
- There is no unified interface for Cube management within Linkis
- Cube builds cannot use Linkis's resource management
- Kylin stays isolated from Linkis's permission and metadata systems
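For context, the manual REST path looks roughly like the sketch below, which only assembles the request and sends nothing. It assumes the query endpoint documented for Kylin 3.x/4.x (`POST /kylin/api/query` with HTTP Basic authentication); the host, project name, and the `build_query_request` helper are placeholders, and Kylin 5.x may differ, so the path and fields should be checked against the target version.

```python
import base64
import json


def build_query_request(base_url, user, password, project, sql, limit=50000):
    """Return (url, headers, body) for a Kylin SQL query; nothing is sent."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    # Field names follow the Kylin 3.x/4.x query API (assumption for 5.x).
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    url = f"{base_url.rstrip('/')}/kylin/api/query"
    return url, headers, json.dumps(payload)


# Placeholder host and project, for illustration only.
url, headers, body = build_query_request(
    "http://kylin.example.com:7070", "ADMIN", "KYLIN", "finance",
    "SELECT COUNT(*) FROM transaction_cube",
)
print(url)
```

Every user or script has to repeat this kind of setup today, including credential handling, which is what a Linkis engine plugin would centralize.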
Use case example:
    -- Financial regulatory report: Daily transaction summary by region
    -- Data volume: 50TB+, Cube pre-built for fixed dimensions
    SELECT
        report_date,
        region_code,
        branch_code,
        SUM(transaction_amount) AS total_amount,
        COUNT(transaction_id) AS transaction_count,
        COUNT(DISTINCT customer_id) AS customer_count
    FROM transaction_cube
    WHERE report_date >= '2024-01-01'
      AND report_date <= '2024-12-31'
      AND region_code IN ('BJ', 'SH', 'GZ', 'SZ')
    GROUP BY report_date, region_code, branch_code
    ORDER BY report_date DESC, total_amount DESC;

    -- Cube build job (currently cannot be submitted through Linkis)
    -- BUILD CUBE transaction_cube START('2024-12-01') END('2024-12-31')
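The `BUILD CUBE ... START(...) END(...)` line reads as shorthand for a build job rather than a statement Kylin accepts. As a rough sketch of what a plugin might translate it into, the code below maps the date range onto the build request documented for Kylin 3.x/4.x (`PUT /kylin/api/cubes/{cubeName}/build` with `startTime`, `endTime`, and `buildType`). The helper names and host are hypothetical. It also assumes the END date is meant to be inclusive, that segment ranges are half-open, and that dates are interpreted in UTC; Kylin 5.x uses a different API.

```python
import json
from datetime import datetime, timedelta, timezone


def to_epoch_millis(day):
    """Convert 'YYYY-MM-DD' to milliseconds since the epoch at 00:00 UTC."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 1000


def build_cube_request(base_url, cube, start_day, end_day_inclusive):
    """Return (method, url, body) for a Cube build; nothing is sent."""
    # Assumption: the segment range is [start, end), so add one day to
    # cover the whole of the inclusive end date.
    end = datetime.strptime(end_day_inclusive, "%Y-%m-%d") + timedelta(days=1)
    body = {
        "startTime": to_epoch_millis(start_day),
        "endTime": to_epoch_millis(end.strftime("%Y-%m-%d")),
        "buildType": "BUILD",
    }
    url = f"{base_url.rstrip('/')}/kylin/api/cubes/{cube}/build"
    return "PUT", url, json.dumps(body)


method, url, body = build_cube_request(
    "http://kylin.example.com:7070", "transaction_cube",
    "2024-12-01", "2024-12-31",
)
print(method, url)
print(body)
```

Whether the plugin exposes this as a SQL-like statement, as in the example above, or as a separate job type is a design decision for the implementation.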