Skip to content

Drop RoutingStorage and Touch: GETEX already handles TTL refresh#4294

Closed
yrobla wants to merge 2 commits intomainfrom
issue-4210
Closed

Drop RoutingStorage and Touch: GETEX already handles TTL refresh#4294
yrobla wants to merge 2 commits intomainfrom
issue-4210

Conversation

@yrobla
Copy link
Contributor

@yrobla yrobla commented Mar 20, 2026

Summary

The RFC for horizontal proxyrunner scaling (THV-0047) originally proposed an LRU cache in front of Redis to reduce hot-path round-trips. After reviewing the implementation, that approach was reconsidered.

RedisStorage.Load already uses GETEX, which atomically returns the session data and refreshes the Redis TTL in a single command. That makes every piece of the proposed design unnecessary:

  • The LRU cache never avoided a Redis RTT — every cache hit still called remote.Touch (Redis EXPIRE) to prevent stale reads after TTL expiry. The LRU only saved deserializing a small session payload, which is negligible compared to the RTT.
  • Touch on the Storage interface is a breaking change for all implementors and leaks a Redis-specific TTL concern onto a shared interface.
  • RoutingStorage added cross-replica staleness edge cases, LRU eviction/recovery logic, and a generated mock — all complexity with no measurable benefit.

This PR removes those additions. GETEX handles TTL refresh as an internal Redis concern, invisible to callers. No interface changes are needed.

Closes #4210

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

Does this introduce a user-facing change?

No.

Special notes for reviewers

If Redis RTT becomes a measurable bottleneck in the future, a local TTL cache is still the right approach — but the design would look different: no per-hit remote calls, bounded staleness via local expiry, GETEX on cache miss. That is a separate decision backed by profiling data.

@yrobla yrobla requested a review from Copilot March 20, 2026 15:29
@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Mar 20, 2026
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbae45d38a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 20, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new two-tier session.Storage implementation that fronts a remote storage backend (e.g., Redis) with a bounded in-memory LRU to reduce remote round-trips while preserving remote durability/authority.

Changes:

  • Introduce RoutingStorage implementing the session.Storage interface with local LRU + remote fallback.
  • Add a comprehensive unit test suite covering cache hits/misses, eviction, delete semantics, remote failure behavior, and constructor panics.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
pkg/transport/session/storage_routing.go New two-tier storage implementation using hashicorp/golang-lru for local caching and a delegated remote Storage.
pkg/transport/session/storage_routing_test.go Unit tests validating caching behavior, remote delegation, eviction, and concurrency sanity.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.10%. Comparing base (a685e89) to head (fb3e7df).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4294      +/-   ##
==========================================
+ Coverage   68.85%   69.10%   +0.25%     
==========================================
  Files         473      477       +4     
  Lines       47899    48135     +236     
==========================================
+ Hits        32979    33264     +285     
- Misses      12279    12287       +8     
+ Partials     2641     2584      -57     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 20, 2026
@yrobla yrobla requested a review from jerm-dro as a code owner March 20, 2026 16:07
@yrobla yrobla requested a review from Copilot March 20, 2026 16:07
@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 20, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@yrobla yrobla requested a review from Copilot March 20, 2026 16:24
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 20, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Adds RoutingStorage, a two-tier session storage that serves reads from a
bounded in-memory LRU cache and falls back to a remote Storage (Redis) on
cache misses. Store writes to remote first, only promoting to the local
cache on success. DeleteExpired delegates entirely to remote, which owns
the TTL source of truth; the local cache self-corrects on subsequent Loads.

Uses github.com/hashicorp/golang-lru/v2 (already a transitive dependency)
for a thread-safe generic LRU. NewRoutingStorage panics on maxLocalEntries
<= 0 so that callers apply a sensible default (e.g. 1000) before calling.
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Mar 20, 2026
Copy link
Contributor

@jerm-dro jerm-dro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the LRU cache?

The original RFC I proposed mentioned LRU caching, but after seeing this implementation I've reconsidered. The PR correctly identifies that we must refresh Redis TTLs on access — something the RFC didn't account for. That constraint changes the calculus.

Every Load cache hit still calls remote.Touch (Redis EXPIRE), so every request pays a Redis RTT regardless. The LRU only saves deserializing the session payload, but sessions are small and GET and EXPIRE are both O(1) — the RTT dominates.

That means we're taking on significant complexity for a marginal saving: a new Touch method on the Storage interface (breaking change for all implementations), cross-replica staleness edge cases, LRU eviction/recovery logic, and a generated mock.

Suggestion: drop the LRU and have RedisStorage.Load use GETEX instead of GET. GETEX atomically returns the data and refreshes the TTL — the TTL refresh becomes an internal Redis concern, invisible to callers. No Touch, no interface change, no cache invalidation bugs. The RoutingStorage, mock, and Touch additions can all be removed.

If we later find Redis RTT is a bottleneck, we can add a local TTL cache in front, but that would be a different design (no per-hit remote calls, bounded staleness via local expiry, GETEX on cache miss).

RedisStorage.Load already uses GETEX which atomically returns the
session data and refreshes the Redis TTL in a single round-trip.
That makes RoutingStorage and Touch redundant:

- Every LRU cache hit in RoutingStorage still called remote.Touch
  (Redis EXPIRE), paying a Redis RTT regardless. The LRU only saved
  deserializing a small session payload — negligible versus the RTT.
- Touch on the Storage interface is a breaking change for all
  implementors and leaks a Redis-specific concern onto the interface.

Remove: RoutingStorage, Touch from Storage interface, Touch from
LocalStorage and RedisStorage, the generated mock, and the
golang-lru/v2 direct dependency.

GETEX already handles TTL refresh as an internal Redis concern,
invisible to callers. No interface changes needed.

Closes #4210

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@yrobla yrobla changed the title Implement RoutingStorage LRU+remote two-tier backend Drop RoutingStorage and Touch: GETEX already handles TTL refresh Mar 23, 2026
@github-actions github-actions bot added size/XS Extra small PR: < 100 lines changed and removed size/L Large PR: 600-999 lines changed labels Mar 23, 2026
@yrobla
Copy link
Contributor Author

yrobla commented Mar 23, 2026

Closing — no changes needed. RedisStorage.Load already uses GETEX on main, which atomically returns session data and refreshes the Redis TTL in a single round-trip. The LRU cache, Touch interface method, and RoutingStorage were all predicated on TTL refresh being a separate concern, but it isn't. Nothing to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XS Extra small PR: < 100 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement RoutingStorage (LRU + Redis fallback)

4 participants