Skip to content

Conversation

@touch-of-grey
Copy link
Contributor

@touch-of-grey touch-of-grey commented Jan 14, 2026

Based on draft shared from @jackye1995 , cleanup and publish for review.

Currently regional writer can reach 300MB/s performance on Amazon S3, around 50MB/s under high backpressure based on the benchmark.

Memtable read performance is inline with Lance table with a single fragment in memory, and much faster than using Lance table and append 1 fragment per batch.

@github-actions github-actions bot added the enhancement New feature or request label Jan 14, 2026
@jackye1995
Copy link
Contributor

Thanks for the fast turnaround! I will take a look tonight. Meanwhile, I think the code path deserves some benchmark, can you add that?

@jackye1995 jackye1995 self-requested a review January 14, 2026 00:30
@touch-of-grey
Copy link
Contributor Author

All review comments have been addressed in commit cda38c0:

  1. ✅ Removed rename_if_not_exists/copy_if_not_exists wrappers - using existing object_store APIs directly
  2. ✅ Fixed "Arrow field metadata" typo
  3. ✅ Changed region_specs: Vec<RegionSpec> to region_spec: Option<RegionSpec>, added TODO for add_region_spec() API
  4. ✅ Moved load_vector_index_config and load_ivf_pq_components to standalone functions
  5. ✅ Renamed indexed_writes to sync_indexed_write for clarity
  6. ✅ Added comment clarifying that version hint failures are logged as warnings
  7. ✅ Differentiated local vs cloud manifest writes (local uses temp file + atomic rename, cloud uses PUT-IF-NOT-EXISTS)
  8. ✅ Made manifest_scan_batch_size configurable in RegionWriterConfig with default of 2

@touch-of-grey touch-of-grey force-pushed the lsm-writer branch 6 times, most recently from 45ed939 to 1277042 Compare January 22, 2026 08:57
@jackye1995 jackye1995 changed the title feat: introduce MemWAL writer feat: introduce MemWAL regional writer and MemTable reader Jan 23, 2026
@github-actions
Copy link
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@touch-of-grey touch-of-grey force-pushed the lsm-writer branch 2 times, most recently from cf9c002 to 6cf05f0 Compare January 26, 2026 08:33
Copy link
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for bearing with all my comments and direct edits! I think this looks good now! We still need replay feature during initialization and also better alignment with the Lance scanner, but this is a great foundation. Pending CI to merge

@touch-of-grey touch-of-grey changed the title feat: introduce MemWAL regional writer and MemTable reader feat: introduce MemWAL regional writer and MemTable reader Jan 26, 2026
…form compatibility

Store only the generation folder name (e.g., "77ea7152_gen_1") in the manifest
instead of the full object store path. This fixes Windows test failures where
the full path was being incorrectly manipulated.

Tests now construct the full path as: base_path/_mem_wal/region_id/folder_name

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@jackye1995
Copy link
Contributor

Known CI failure, merging

@jackye1995 jackye1995 merged commit a410894 into lance-format:main Jan 26, 2026
35 of 37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants