feat(kernel): copy-into-cage Arrow result handoff (KERNEL_FETCH_MODE=copycage) by msrathore-db · Pull Request #444 · databricks/databricks-sql-nodejs

msrathore-db · 2026-06-22T22:11:35Z

What

Adds an opt-in result-handoff mode for the kernel path that avoids the Arrow IPC re-encode/decode: KERNEL_FETCH_MODE=copycage.

Instead of decoding the kernel's per-batch Arrow IPC bytes via RecordBatchReader, the kernel hands over each Arrow buffer as a V8-owned (in-cage) ArrayBuffer with the bytes copied in, plus a descriptor. The driver rebuilds the RecordBatch via apache-arrow makeData (lib/kernel/KernelArrowImport.ts) and feeds the existing ArrowResultConverter unchanged.
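For readers unfamiliar with the buffer-plus-descriptor idea, the following is a minimal, dependency-free sketch of reading one Int32 column from copied-in ArrayBuffers. It is not the actual importer: the `Int32ColumnDescriptor` shape and the `readInt32Column` helper are hypothetical, and it uses plain typed-array views where the driver uses apache-arrow `makeData`. It also assumes a little-endian host, matching Arrow's buffer layout.

```typescript
// Hypothetical descriptor shape, for illustration only. The real kernel
// descriptor is not shown in this PR description.
interface Int32ColumnDescriptor {
  length: number; // number of rows in the column
  nullCount: number; // number of null rows
  validity: ArrayBuffer | null; // Arrow validity bitmap (LSB-first), null when there are no nulls
  values: ArrayBuffer; // int32 values, one 4-byte slot per row
}

// Reads the column through typed-array views over the in-cage buffers.
// The views reference the existing bytes; no further copy is made here.
function readInt32Column(desc: Int32ColumnDescriptor): Array<number | null> {
  const values = new Int32Array(desc.values, 0, desc.length);
  const validity = desc.validity ? new Uint8Array(desc.validity) : null;
  const out: Array<number | null> = [];
  for (let i = 0; i < desc.length; i++) {
    // Bit i of the bitmap is 1 when row i is valid (non-null).
    const isValid = validity === null || ((validity[i >> 3] >> (i & 7)) & 1) === 1;
    out.push(isValid ? values[i] : null);
  }
  return out;
}

// Example: three rows, the middle one null (bitmap 0b101).
const valuesBuf = new ArrayBuffer(12);
new Int32Array(valuesBuf).set([10, 0, 30]);
const validityBuf = new ArrayBuffer(1);
new Uint8Array(validityBuf)[0] = 0b00000101;

const decoded = readInt32Column({
  length: 3,
  nullCount: 1,
  validity: validityBuf,
  values: valuesBuf,
});
```

The point of the sketch is that once the bytes live in a V8-owned ArrayBuffer, the column can be interpreted in place from the descriptor alone, with no IPC framing to parse.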

Changes

- KERNEL_FETCH_MODE ∈ { ipc (default), copycage }, resolved in KernelOperationBackend. Double-gated: only engages when the binding exposes fetchNextBatchCopycage and the schema is supported (dictionary / union / Large* types fall back to IPC for the whole result).
- Wired on both the sync metadata fetch handle and the async AsyncResultHandle (the main executeStatement path). Verified via native call-counters that the driver actually invokes fetchNextBatchCopycage and does not silently fall back to IPC.
- New importZeroCopyBatch importer + tests/unit/kernel/KernelArrowImport.test.ts (pinned layout-compat test so an arrow-rs / apache-arrow layout drift fails loudly).
- ArrowBatch gains an optional pre-decoded recordBatches field that the converter consumes when present; the IPC path is unchanged.
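The double gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in KernelOperationBackend: `resolveFetchMode`, its parameters, and the type-name strings are hypothetical, and treating any unrecognized KERNEL_FETCH_MODE value as `ipc` is an assumption.

```typescript
type FetchMode = "ipc" | "copycage";

// Hypothetical helper: decides the fetch mode for one whole result.
//   envValue           - the raw KERNEL_FETCH_MODE value, if set
//   bindingHasCopycage - whether the native binding exposes fetchNextBatchCopycage
//   schemaTypeNames    - type names of every field in the result schema (illustrative strings)
function resolveFetchMode(
  envValue: string | undefined,
  bindingHasCopycage: boolean,
  schemaTypeNames: string[],
): FetchMode {
  // Opt-in only: anything other than "copycage" keeps the default IPC path (assumption).
  if (envValue !== "copycage") return "ipc";

  // Gate 1: the binding must actually expose the copycage entry point.
  if (!bindingHasCopycage) return "ipc";

  // Gate 2: dictionary / union / Large* types fall back to IPC for the whole result.
  const hasUnsupported = schemaTypeNames.some(
    (name) => name === "Dictionary" || name === "Union" || name.startsWith("Large"),
  );
  return hasUnsupported ? "ipc" : "copycage";
}
```

Under these assumptions, `resolveFetchMode("copycage", true, ["Int32", "Utf8"])` selects `copycage`, while a single `LargeUtf8` column or an older binding sends the entire result through IPC. Deciding once per result rather than per batch keeps a result from mixing the two paths.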

Verification

- Byte-identical to IPC end-to-end through the driver (useKernel) across all 21 type families + edge cases, including the empty-array / all-null / empty-string / map cases.
- Multi-batch 5,000,000-row integrity; concurrency; tsc + result/kernel unit suites green.

⚠️ Dependency — merge ordering

Depends on the kernel napi change adding fetchNextBatchCopycage (databricks-sql-kernel PR). The driver consumes the kernel via the published @databricks/databricks-sql-kernel-* native package, so this must merge after that kernel change is released and the native dependency / KERNEL_REV is bumped here. Opened as a draft until then.

This pull request and its description were written by Isaac.

…copycage)

Adds an opt-in result-handoff mode for the kernel path that avoids the Arrow IPC re-encode. Instead of decoding the kernel's per-batch Arrow IPC bytes via RecordBatchReader, the kernel hands over each Arrow buffer as a V8-owned (in-cage) ArrayBuffer with the bytes copied in, plus a descriptor; the driver rebuilds the RecordBatch via apache-arrow makeData (KernelArrowImport.ts) and feeds the existing ArrowResultConverter unchanged.

- KERNEL_FETCH_MODE in {ipc (default), copycage}; double-gated on the binding exposing fetchNextBatchCopycage AND the schema being supported (dictionary/union/Large* fall back to IPC).
- Wired on both the sync metadata fetch handle and the async AsyncResultHandle (the main executeStatement path).
- New importZeroCopyBatch importer + KernelArrowImport unit test.
- ArrowBatch gains an optional pre-decoded `recordBatches` the converter consumes when present (IPC path unchanged).

Verified byte-identical to the IPC path across all 21 Databricks type families + edge cases (empty/all-null/empty-array/map/empty-string, NaN/+-Inf, decimal/interval/variant/deeply-nested), multi-batch 5M integrity, and concurrency.

NOTE: depends on the kernel napi change adding fetchNextBatchCopycage (databricks-sql-kernel). The driver consumes it via the published @databricks/databricks-sql-kernel-* native package, so this must land after that kernel change is released and the native dependency bumped.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>

msrathore-db had a problem deploying to azure-prod June 22, 2026 22:11 — with GitHub Actions Failure

msrathore-db wants to merge 1 commit into main from msrathore/copycage-final
