fix: bound TypeAdapter lru_cache to prevent memory leak in multi-threaded usage#2907

Open
giulio-leone wants to merge 1 commit into openai:main from giulio-leone:fix/responses-parse-memory-leak

Conversation

@giulio-leone

Summary

Fixes #2672

Problem

_CachedTypeAdapter uses lru_cache(maxsize=None) (unbounded), which causes a memory leak in multi-threaded environments. When responses.parse is called from multiple threads, generic types like ParsedResponseOutputMessage[MyClass] are re-created by pydantic in each thread, producing distinct cache keys. As a result the cache never hits and grows without bound.
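The growth pattern can be reproduced outside the SDK with a stand-in cached factory (illustrative only, not the SDK's actual code): each thread builds its own class object for "the same" generic, so every call is a cache miss and the cache grows linearly with the number of threads.

```python
import threading
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded, mirroring the reported behavior
def cached_adapter(tp):
    return object()  # stand-in for constructing a TypeAdapter


def worker():
    # Each thread re-creates the parameterized type, yielding a new class
    # object and therefore a brand-new cache key.
    generic = type("ParsedResponseOutputMessage[MyClass]", (), {})
    cached_adapter(generic)


threads = [threading.Thread(target=worker) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

info = cached_adapter.cache_info()
print(info.currsize, info.hits)  # 50 entries, 0 hits: one entry per thread
```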

Root Cause

In src/openai/_models.py line 802:

_CachedTypeAdapter = cast('TypeAdapter[object]', lru_cache(maxsize=None)(_TypeAdapter))

Python's typing machinery can create new generic type objects in different threads, so ParsedResponseOutputMessage[Fact] in thread A has a different identity than ParsedResponseOutputMessage[Fact] in thread B. Since lru_cache keys on argument equality, and type objects compare and hash by identity, each thread creates a new cache entry.
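The identity-keying behavior is easy to see in isolation: two dynamically created classes that look identical are still distinct objects, so they land in separate lru_cache slots. A minimal sketch (the function name is illustrative):

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded, like the original _CachedTypeAdapter
def adapter_for(tp):
    # Stand-in for building a pydantic TypeAdapter for `tp`.
    return object()


# Two classes with the same name are still different objects, so each one
# is a fresh cache key (type objects hash by identity, not by name).
a = type("ParsedResponseOutputMessage", (), {})
b = type("ParsedResponseOutputMessage", (), {})

adapter_for(a)
adapter_for(b)
info = adapter_for.cache_info()
print(info.misses, info.hits)  # 2 misses, 0 hits: the cache never deduplicates
```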

Fix

Set maxsize=4096 to cap memory usage while still providing effective caching for typical workloads. Bounding lru_cache is standard practice: the LRU eviction policy keeps the most-used types cached while preventing unbounded growth.
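With a bounded maxsize, functools.lru_cache evicts the least-recently-used entry once the cache is full, so memory stays capped even under an endless stream of fresh keys. A small sketch with maxsize=4 to make eviction visible (4096 behaves the same way at scale):

```python
from functools import lru_cache


@lru_cache(maxsize=4)  # small bound so eviction is easy to observe
def adapter_for(tp):
    return object()  # stand-in for constructing a TypeAdapter


# Feed far more distinct keys than the cache can hold.
for _ in range(100):
    adapter_for(type("Generated", (), {}))

info = adapter_for.cache_info()
print(info.currsize)  # 4: never exceeds maxsize, older entries were evicted
```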

Testing

Verified that the cache is bounded:

info = _CachedTypeAdapter.cache_info()
assert info.maxsize == 4096

@giulio-leone giulio-leone requested a review from a team as a code owner February 28, 2026 16:56
@giulio-leone
Author

Friendly ping — CI is green and this is ready for review. Happy to address any feedback. Thanks!

@giulio-leone giulio-leone force-pushed the fix/responses-parse-memory-leak branch from 29a5d5a to ae5c65b on March 4, 2026 04:29
@giulio-leone
Copy link
Author

Intervention note for this PR:

Current blocker appears to be missing CI execution rather than failing jobs:

  • No checks reported on this branch.

Suggested unblock sequence:

  1. Trigger CI workflows for this PR (or confirm why workflows are not attached for this branch/fork).
  2. Once checks appear, fix any failing jobs and re-run.
  3. Request review only after a green check suite is visible.

If useful, I can run a follow-up status sweep as soon as checks are attached.



Development

Successfully merging this pull request may close these issues.

Unrestricted caching keyed by generated types causes memory leak in multi-threaded regimes
