Commit 9f78801

Update benchmarks README with K8s CI documentation
1 parent 02acbc2 commit 9f78801

1 file changed

Lines changed: 95 additions & 3 deletions

benchmarks/README.md

@@ -17,10 +17,102 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Running Comet Benchmarks in Microk8s
+# Comet Benchmarks
 
-This guide explains how to run benchmarks derived from TPC-H and TPC-DS in Apache DataFusion Comet deployed in a
-local Microk8s cluster.
+This guide explains how to run benchmarks derived from TPC-H and TPC-DS in Apache DataFusion Comet.
+
+## Table of Contents
+
+- [GitHub CI Benchmarks (Kind)](#github-ci-benchmarks-kind)
+- [Local Development (Kind)](#local-development-kind)
+- [Microk8s Deployment](#running-comet-benchmarks-in-microk8s)
+
+---
+
+## GitHub CI Benchmarks (Kind)
+
+The project includes automated benchmark CI that runs on every PR affecting Rust (`native/**/*.rs`) or Scala/Java (`spark/**/*.scala`, `spark/**/*.java`) code.
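
To make the trigger rule concrete, here is a minimal Python sketch of the path filters quoted above. It is an approximation, not the workflow's actual configuration: it assumes `**` matches any number of directories (including none), so each filter reduces to a prefix check plus an extension check. The names `TRIGGER_RULES`, `triggers_benchmark`, and `pr_triggers_benchmark` are hypothetical.

```python
# Hypothetical helper: approximates the CI path filters named in the README.
# Assumption: "**" matches zero or more directories, so "native/**/*.rs"
# is treated as "any path under native/ that ends in .rs".
TRIGGER_RULES = (
    ("native/", (".rs",)),            # native/**/*.rs
    ("spark/", (".scala", ".java")),  # spark/**/*.scala, spark/**/*.java
)


def triggers_benchmark(changed_path: str) -> bool:
    """Return True if a single changed file falls under one of the filters."""
    return any(
        changed_path.startswith(prefix) and changed_path.endswith(suffixes)
        for prefix, suffixes in TRIGGER_RULES
    )


def pr_triggers_benchmark(changed_paths: list[str]) -> bool:
    """A PR qualifies as soon as any one of its changed files matches."""
    return any(triggers_benchmark(path) for path in changed_paths)
```

Under these assumptions, a PR that only touches `docs/` or `native/Cargo.toml` would not start the benchmark job, while one that edits any `.rs` file under `native/` would.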
+
+### What the CI Does
+
+1. Creates a Kind Kubernetes cluster (1 control-plane + 2 workers)
+2. Installs Spark Operator via Helm
+3. Builds Comet from source
+4. Generates TPC-H SF=1 data (~1GB)
+5. Runs TPC-H Q1 with Spark baseline
+6. Runs TPC-H Q1 with Comet enabled
+7. **Validates that Comet achieves ≥1.1x speedup (10% improvement)**
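
The validation in step 7 can be sketched as simple arithmetic. This assumes speedup is defined as Spark wall-clock time divided by Comet wall-clock time; the README does not spell out the formula, and the function names below are hypothetical.

```python
def speedup(spark_seconds: float, comet_seconds: float) -> float:
    """Speedup of Comet over the Spark baseline (assumed: baseline / Comet)."""
    if spark_seconds <= 0 or comet_seconds <= 0:
        raise ValueError("timings must be positive")
    return spark_seconds / comet_seconds


def meets_threshold(
    spark_seconds: float, comet_seconds: float, min_speedup: float = 1.1
) -> bool:
    """True when the measured speedup reaches the required minimum."""
    return speedup(spark_seconds, comet_seconds) >= min_speedup
```

Under this definition, a Spark run of 24 s against a Comet run of 20 s gives a 1.2x speedup and passes, while 21 s against 20 s gives 1.05x and fails. A 1.1x threshold means Comet's runtime must be no more than about 91% of the baseline (1 / 1.1 ≈ 0.909).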
+
+### Manual Trigger
+
+You can manually trigger the benchmark CI from GitHub Actions with custom parameters:
+
+- **scale_factor**: TPC-H scale factor (default: 1)
+- **query**: TPC-H query (q1, q6, q14, simple)
+- **min_speedup**: Minimum required speedup (default: 1.1)
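
As an illustration of how these three inputs constrain a run, the sketch below models them as a validated Python dataclass. The class name `BenchmarkInputs` is hypothetical, and the default of `q1` for `query` is an assumption (the README lists the allowed values but no default; `q1` is chosen because the automated run uses Q1).

```python
from dataclasses import dataclass

# Query names listed in the README for the manual trigger.
ALLOWED_QUERIES = ("q1", "q6", "q14", "simple")


@dataclass(frozen=True)
class BenchmarkInputs:
    """Hypothetical model of the manual-trigger parameters."""

    scale_factor: int = 1     # documented default
    query: str = "q1"         # assumed default, not stated in the README
    min_speedup: float = 1.1  # documented default

    def __post_init__(self) -> None:
        if self.scale_factor <= 0:
            raise ValueError("scale_factor must be positive")
        if self.query not in ALLOWED_QUERIES:
            raise ValueError(f"query must be one of {ALLOWED_QUERIES}")
        if self.min_speedup <= 0:
            raise ValueError("min_speedup must be positive")
```

Constructing `BenchmarkInputs(query="q6", min_speedup=1.25)` would describe a stricter run on Q6, while an unlisted query such as `q2` is rejected up front.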
+
+---
+
+## Local Development (Kind)
+
+Run benchmarks locally using Kind (Kubernetes in Docker).
+
+### Prerequisites
+
+```bash
+# Install Kind, kubectl, and Helm
+brew install kind kubectl helm # macOS
+# Or see: https://kind.sigs.k8s.io/docs/user/quick-start/
+```
+
+### Quick Start
+
+```bash
+# 1. Setup Kind cluster with Spark Operator
+./hack/k8s-benchmark-setup.sh
+
+# 2. Build Comet
+make release PROFILES="-Pspark-3.5 -Pscala-2.12"
+
+# 3. Build benchmark Docker image
+docker build -t comet-bench:local -f benchmarks/Dockerfile.k8s .
+kind load docker-image comet-bench:local --name comet-bench
+
+# 4. Generate TPC-H data
+./benchmarks/scripts/generate-tpch-data.sh 1 /tmp/comet-bench-data/tpch
+
+# 5. Run Spark baseline
+./benchmarks/scripts/run-k8s-benchmark.sh spark q1
+
+# 6. Run Comet benchmark
+./benchmarks/scripts/run-k8s-benchmark.sh comet q1
+
+# 7. Compare results
+python3 benchmarks/scripts/compare-results.py \
+  --spark /tmp/comet-bench-results/spark_q1_result.json \
+  --comet /tmp/comet-bench-results/comet_q1_result.json \
+  --min-speedup 1.1
+
+# 8. Cleanup
+./hack/k8s-benchmark-setup.sh --delete
+```
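
Step 7 hard-codes the Q1 result files. The Python sketch below shows one way to build the same comparison command for another query. It assumes result files follow the `<engine>_<query>_result.json` pattern seen in the two paths above; that pattern is inferred from the Q1 example and is not confirmed for other queries. The helper names are hypothetical; the script path and flags are the ones shown in step 7.

```python
from pathlib import PurePosixPath

# Results directory as it appears in the Quick Start commands.
RESULTS_DIR = PurePosixPath("/tmp/comet-bench-results")


def result_path(engine: str, query: str) -> PurePosixPath:
    """Assumed naming pattern: <engine>_<query>_result.json."""
    return RESULTS_DIR / f"{engine}_{query}_result.json"


def compare_command(query: str, min_speedup: float = 1.1) -> list[str]:
    """Build the argument list for the comparison script from step 7."""
    return [
        "python3",
        "benchmarks/scripts/compare-results.py",
        "--spark", str(result_path("spark", query)),
        "--comet", str(result_path("comet", query)),
        "--min-speedup", str(min_speedup),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root once both benchmark runs have produced their result files.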
+
+### Environment Variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `COMET_BENCH_CLUSTER` | `comet-bench` | Kind cluster name |
+| `COMET_BENCH_NAMESPACE` | `comet-bench` | Kubernetes namespace |
+| `COMET_DOCKER_IMAGE` | `comet-bench:local` | Docker image for benchmarks |
+| `DRIVER_MEMORY` | `2g` | Spark driver memory |
+| `EXECUTOR_MEMORY` | `2g` | Spark executor memory |
+| `EXECUTOR_INSTANCES` | `2` | Number of Spark executors |
+
+---
+
+## Running Comet Benchmarks in Microk8s
+
+This section explains how to run benchmarks in a local Microk8s cluster.
 
 ## Use Microk8s locally
 