fix(webapp): bound logs search memory and fix pagination at scale (#4012)
## Summary
The logs search page (behind a feature flag) ran ClickHouse out of
memory when browsing back over long time ranges. This keeps it within
bounded memory and fixes a pagination bug that could skip or duplicate
rows at a page boundary.
## Fix
Memory: the list query reads in sort-key order, which opens one read
stream per part in the window. On object storage, those per-part read
buffers dominate peak memory, so peak memory scaled with the number of
parts scanned. Two changes bound it:
- The logs ClickHouse client caps the per-part read buffers via new
env-tunable settings. The object-storage-only setting is opt-in, so it
is never sent to a ClickHouse version that lacks it.
- Recent-first window narrowing: rows come back newest first, so the
presenter probes the most recent window and only widens toward the full
requested range when a page is short. A busy environment fills a page
from a few recent parts instead of scanning the whole range; a quiet one
still returns every row in a couple of cheap reads.
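The recent-first narrowing can be sketched roughly as below. This is an illustration, not the presenter's actual code: `fetchWindow`, `LogRow`, `initialWindowMs`, and `growthFactor` are hypothetical names, the callback is synchronous only to keep the sketch short, and re-querying a wider window (rather than appending an older slice) is an assumption.

```typescript
// Hypothetical row shape: only the field this sketch needs.
interface LogRow {
  timestampMs: number;
}

// Hypothetical query callback: returns up to `limit` rows with
// fromMs <= timestampMs <= toMs, newest first.
type FetchWindow = (fromMs: number, toMs: number, limit: number) => LogRow[];

// Probe the most recent window first and widen toward the full requested
// range only while the page is short.
function fetchPageRecentFirst(
  fetchWindow: FetchWindow,
  fromMs: number,
  toMs: number,
  pageSize: number,
  initialWindowMs: number,
  growthFactor: number = 4
): { rows: LogRow[]; probes: number } {
  const growth = Math.max(2, growthFactor); // guarantees the loop terminates
  let windowMs = Math.max(1, initialWindowMs);
  let probes = 0;
  for (;;) {
    // Never reach past the start of the requested range.
    const windowFrom = Math.max(fromMs, toMs - windowMs);
    const rows = fetchWindow(windowFrom, toMs, pageSize);
    probes++;
    // Stop when the page is full or the whole range has been covered.
    if (rows.length >= pageSize || windowFrom === fromMs) {
      return { rows, probes };
    }
    windowMs *= growth;
  }
}
```

With geometric growth, a quiet environment reaches the full range in a logarithmic number of probes, while a busy one usually stops after the first.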
Correctness: the keyset cursor ordered on (triggered_timestamp,
trace_id), which is not unique because the spans of a trace share both,
so rows at a tie could be skipped or duplicated across pages. The cursor
and ORDER BY now include span_id, and the cursor is versioned so stale
cursors reset to the first page.
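A minimal in-memory sketch of the cursor change, assuming a JSON-encoded cursor. `CURSOR_VERSION`, `encodeCursor`, `decodeCursor`, and `pageAfterCursor` are hypothetical names, and the descending direction of the tie-breakers is an assumption; the real query applies the same ordering in ClickHouse rather than in application code.

```typescript
// Hypothetical version tag: bumping it invalidates older cursors.
const CURSOR_VERSION = 2;

interface SpanKey {
  triggeredTimestamp: number;
  traceId: string;
  spanId: string;
}

interface Cursor extends SpanKey {
  v: number;
}

function encodeCursor(key: SpanKey): string {
  const cursor: Cursor = {
    v: CURSOR_VERSION,
    triggeredTimestamp: key.triggeredTimestamp,
    traceId: key.traceId,
    spanId: key.spanId,
  };
  return JSON.stringify(cursor);
}

// Returns null for missing, malformed, or stale cursors so that the
// caller falls back to the first page.
function decodeCursor(raw: string | undefined): Cursor | null {
  if (!raw) return null;
  try {
    const p = JSON.parse(raw);
    if (p === null || typeof p !== "object" || p.v !== CURSOR_VERSION) return null;
    if (
      typeof p.triggeredTimestamp !== "number" ||
      typeof p.traceId !== "string" ||
      typeof p.spanId !== "string"
    ) {
      return null;
    }
    return p as Cursor;
  } catch {
    return null;
  }
}

// Newest first; span_id is the final tie-breaker, which makes the order total.
function compareKeys(a: SpanKey, b: SpanKey): number {
  if (a.triggeredTimestamp !== b.triggeredTimestamp) {
    return b.triggeredTimestamp - a.triggeredTimestamp;
  }
  if (a.traceId !== b.traceId) return a.traceId < b.traceId ? 1 : -1;
  if (a.spanId !== b.spanId) return a.spanId < b.spanId ? 1 : -1;
  return 0;
}

function pageAfterCursor(
  rows: SpanKey[],
  rawCursor: string | undefined,
  pageSize: number
): { page: SpanKey[]; nextCursor: string | null } {
  const sorted = [...rows].sort(compareKeys);
  const cursor = decodeCursor(rawCursor);
  // Keep only rows that sort strictly after the cursor row.
  const remaining = cursor ? sorted.filter((r) => compareKeys(r, cursor) > 0) : sorted;
  const page = remaining.slice(0, pageSize);
  const last = page[page.length - 1];
  const nextCursor = last !== undefined && page.length === pageSize ? encodeCursor(last) : null;
  return { page, nextCursor };
}
```

Because the three-column key is unique per span, a strict "after the cursor" comparison can neither skip nor repeat a row when several spans share a timestamp and trace.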
Guards: the effective page size is capped, and the existing per-query
memory limit lets a pathological wide browse fail with an error instead
of taking the node down.
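The page-size cap amounts to a clamp on the requested value. A sketch, assuming hypothetical limits and a hypothetical helper name `effectivePageSize` (the actual cap and default are not stated here):

```typescript
// Hypothetical limits, for illustration only.
const MAX_PAGE_SIZE = 100;
const DEFAULT_PAGE_SIZE = 50;

// Clamp a caller-supplied page size into [1, MAX_PAGE_SIZE], falling
// back to the default when the value is missing or not a finite number.
function effectivePageSize(requested: number | undefined): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_PAGE_SIZE;
  }
  return Math.min(MAX_PAGE_SIZE, Math.max(1, Math.floor(requested)));
}
```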
## ClickHouse 26.2
The memory fix relies on lazy materialization deferring the read of the
wide attributes column until the output rows are selected, which only
holds on 26.x. Cloud
already runs 26.2, so this moves the dev stack, testcontainers, and CI
to match. The ClickHouse test suite passes on 26.2.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Keep logs search within bounded ClickHouse memory when browsing long time ranges, and fix pagination that could skip or duplicate entries sharing a timestamp.