
feat(export): implement chunked R2 database export with DO alarm continuation #102

Open
scrollingreel wants to merge 1 commit into outerbase:main from scrollingreel:feat/chunked-r2-export

Conversation

@scrollingreel commented Mar 2, 2026

Purpose

Implement chunked R2-based database export to support databases of any size, up to 10GB.
Uses DO alarms for exports exceeding the 30-second Worker timeout, with breathing
intervals to prevent database lock contention. Includes callback URL notifications
and new status/download endpoints.

/claim #59

Tasks

  • Replace in-memory dump with batched R2 chunked export (1000 rows/batch)
  • Implement DO alarm continuation for long-running exports
  • Add breathing intervals (2s) between alarm cycles
  • Add POST /export/dump with optional callbackUrl param
  • Add GET /export/status/:exportId endpoint
  • Add GET /export/download/:exportId endpoint
  • Add R2 bucket binding to wrangler.toml
  • Backwards compatible fallback when R2 is not configured
  • Comprehensive test suite (15 tests)

Verify

  1. Run pnpm vitest run src/export/chunked-dump.test.ts — 15 tests should pass
  2. Run pnpm vitest run — all 166 existing tests still pass
  3. Configure R2 bucket, deploy, and hit GET /export/dump on a small DB — should return .sql file directly
  4. Hit POST /export/dump?callbackUrl=https://example.com on a large DB — should return 202 with exportId
  5. Poll GET /export/status/:exportId — should show progress
  6. After completion, GET /export/download/:exportId — should return the .sql file

Before

[screenshot: 2026-03-02 200224]

After

[screenshot]

Short demo video of changes

Recording.2026-03-02.201444.1.mp4

feat(export): implement chunked R2 database export with DO alarm continuation

- Replace in-memory dump with batched chunked export using R2 storage
- Process rows in configurable batches (1000 rows) with 20s time limit
- Use DO alarms to resume long-running exports that exceed 30s timeout
- Store export state in DO storage for pause/resume across alarm cycles
- Add breathing intervals between alarm cycles (2s gap) to prevent DB lock
- Support optional callbackUrl for async completion notifications (POST)
- Fast path: small exports (<20s) return file directly in response
- Async path: large exports return 202 with exportId for polling
- Add GET /export/status/:exportId endpoint to check progress
- Add GET /export/download/:exportId endpoint to download completed dumps
- Add POST /export/dump endpoint with callbackUrl query param support
- Exclude internal tmp_ tables from exports
- Handle NULL values correctly in SQL INSERT statements
- Graceful error handling with callback notifications on failure
- R2 bucket binding optional - falls back to legacy in-memory dump
- Add R2 bucket configuration to wrangler.toml and Env interface
- Add comprehensive test suite (15 tests) for all new functionality
- Backwards compatible: existing GET /export/dump still works

Closes outerbase#59
@scrollingreel (Author)

Implementation Summary

Problem

The current GET /export/dump endpoint loads the entire database into memory before returning it as a response. This fails for:

  • Databases approaching the 1GB Durable Object limit (10GB in future)
  • Any export that takes longer than 30 seconds (Worker timeout)

Solution

Chunked R2-based export with DO alarm continuation:

  • Batched processing: Reads rows in batches of 1000 with a 20s time limit per cycle
  • R2 storage: Streams SQL chunks to R2 instead of holding everything in memory
  • DO alarms: When approaching the 30s timeout, saves progress to DO storage and schedules an alarm to resume after a 2s breathing interval
  • Dual delivery: Small exports (<20s) return directly; large exports return 202 with an exportId for polling via /export/status/:exportId and downloading via /export/download/:exportId
  • Callback URL: Optional ?callbackUrl= parameter sends a POST notification when export completes or fails
  • Backwards compatible: Falls back to the original in-memory dump when R2 is not configured
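The checkpoint-and-resume cycle described above can be illustrated with a minimal, self-contained sketch. All names here are illustrative, not the actual implementation: real code would persist `ExportState` in DO storage, append chunks to R2, and schedule `storage.setAlarm()` instead of returning to a caller, and the time budget would be wall-clock based rather than a batch count.

```typescript
// Minimal sketch of batched export with pause/resume state, mimicking
// how one DO alarm cycle might checkpoint progress. Names are illustrative.
interface ExportState {
  offset: number; // next row offset to read
  done: boolean;
}

const BATCH_SIZE = 1000;

// One "alarm cycle": process batches until the per-cycle budget is spent,
// then return state so the next cycle can resume where this one stopped.
function runCycle(
  totalRows: number,
  state: ExportState,
  batchesPerCycle: number // stands in for the 20s per-cycle time limit
): ExportState {
  let batches = 0;
  while (state.offset < totalRows && batches < batchesPerCycle) {
    const end = Math.min(state.offset + BATCH_SIZE, totalRows);
    // ...read rows [state.offset, end) and append a SQL chunk to R2 here...
    state.offset = end;
    batches++;
  }
  state.done = state.offset >= totalRows;
  return state; // caller would persist this and set an alarm (+2s) if !done
}
```

For example, a 3500-row export with a budget of 2 batches per cycle checkpoints at offset 2000 after the first cycle and completes at 3500 after the second.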

New Endpoints

Method  Path                            Description
GET     /export/dump                    Enhanced dump (R2 chunked if available, else legacy)
POST    /export/dump?callbackUrl=...    Start async export with callback
GET     /export/status/:exportId        Check export progress
GET     /export/download/:exportId      Download completed export from R2
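A client of the async path might poll the status endpoint until the export finishes. The sketch below takes a fetch-like function as a parameter so the polling logic can be shown without a live server; the `{ status }` response shape and the status strings are assumptions, not the endpoint's documented contract.

```typescript
// Sketch of polling GET /export/status/:exportId until the export
// completes or fails. The response shape ({ status }) is assumed.
type StatusFetcher = (exportId: string) => Promise<{ status: string }>;

async function waitForExport(
  exportId: string,
  getStatus: StatusFetcher,
  maxAttempts = 10
): Promise<string> {
  for (let i = 0; i < maxAttempts; i++) {
    const { status } = await getStatus(exportId);
    if (status === "complete" || status === "failed") return status;
    // a real client would sleep between polls, e.g.
    // await new Promise((r) => setTimeout(r, 2000));
  }
  return "timeout";
}
```

Injecting the fetcher also makes the loop easy to unit-test with a stub that returns a scripted sequence of statuses.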

Tests

15 new tests covering initialization, chunk processing, NULL handling, error handling, callbacks, status endpoint, and download endpoint. All existing tests unaffected.
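The NULL handling called out above can be sketched as a small serializer; the function and type names are hypothetical stand-ins for whatever the PR actually uses, but they show the two rules involved: `NULL` must be emitted unquoted, and single quotes inside strings must be doubled.

```typescript
// Sketch of serializing one row into a SQL INSERT statement, handling
// NULL values and escaping single quotes. Names are illustrative.
type SqlValue = string | number | null;

function escapeValue(v: SqlValue): string {
  if (v === null) return "NULL"; // NULL must be unquoted, never 'NULL'
  if (typeof v === "number") return String(v);
  return `'${v.replace(/'/g, "''")}'`; // SQL escapes ' by doubling it
}

function rowToInsert(table: string, row: Record<string, SqlValue>): string {
  const cols = Object.keys(row);
  const vals = cols.map((c) => escapeValue(row[c]));
  return `INSERT INTO ${table} (${cols.join(", ")}) VALUES (${vals.join(", ")});`;
}
```

For example, `rowToInsert("users", { id: 1, name: "O'Brien", email: null })` yields `INSERT INTO users (id, name, email) VALUES (1, 'O''Brien', NULL);`.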

/claim #59
