Python: ADR: Unifying Context Management with ContextMiddleware (Python) #3609
Pull request overview
Adds a new ADR proposing a unified ContextMiddleware abstraction for Python to replace ContextProvider, ChatMessageStore, and AgentThread, using an onion/wrapper middleware pipeline pattern.
Changes:
- Introduces a proposed ContextMiddleware + SessionContext + pipeline design for composable context engineering.
- Describes a StorageContextMiddleware approach for loading/storing conversation history and optional auditing.
- Outlines migration impact and a phased implementation/testing plan.
| --- | ||
| # These are optional elements. Feel free to remove any of them. | ||
| status: proposed | ||
| contact: eavanvalkenburg | ||
| date: 2026-02-02 | ||
| deciders: eavanvalkenburg, markwallace-microsoft, sphenry, alliscode, johanst, brettcannon | ||
| consulted: taochenosu, moonbox3, dmytrostruk, giles17 | ||
| --- | ||
|
|
||
| # Unifying Context Management with ContextMiddleware |
Copilot
AI
Feb 2, 2026
This ADR is using the placeholder number “00XX” in the filename/path. Per docs/decisions/README.md (step 1), ADRs should be named with the next sequential number (currently 0015-…). Please rename the file accordingly and update any references to the old filename.
| | 2 | **Instance or Factory** | Middleware can be shared instances or `(session_id) -> Middleware` factories for per-session state. | | ||
| | 3 | **Default Storage at Runtime** | `InMemoryStorageMiddleware` auto-added when no service_session_id, store≠True, and no pipeline. Evaluated at runtime so users can modify pipeline first. | | ||
| | 4 | **Multiple Storage Allowed** | Warn if multiple have `load_messages=True` (likely misconfiguration). | | ||
| | 5 | **Single Storage Class** | One `StorageContextMiddleware` configured for memory/audit/evaluation - no separate classes. | |
Copilot
AI
Feb 2, 2026
Design decision #5 says “Single Storage Class… no separate classes”, but later examples and the implementation plan reference multiple storage middleware classes (e.g., InMemoryStorageMiddleware, RedisStorageMiddleware, CosmosStorageMiddleware). Please reconcile this wording (e.g., “single storage middleware abstraction/base class with multiple implementations”) so the ADR is internally consistent.
| | 5 | **Single Storage Class** | One `StorageContextMiddleware` configured for memory/audit/evaluation - no separate classes. | | |
| | 5 | **Single Storage Abstraction, Multiple Implementations** | One unified storage middleware abstraction (e.g., `StorageContextMiddleware`-style) with multiple concrete implementations (in-memory, Redis, Cosmos) configured for memory/audit/evaluation. | |
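One way to read the suggested wording, as a minimal sketch only: a single abstract base with concrete backends deriving from it. The method names `load` and `save`, the use of plain strings as messages, and the class bodies are assumptions made for illustration; the ADR's actual interface may differ.

```python
import asyncio
from abc import ABC, abstractmethod


class StorageContextMiddleware(ABC):
    """Hypothetical base: one abstraction shared by every storage backend."""

    def __init__(self, source_id: str, *, load_messages: bool = True) -> None:
        self.source_id = source_id
        self.load_messages = load_messages  # could be False for audit-only use

    @abstractmethod
    async def load(self, session_id: str) -> list[str]:
        """Return the stored messages for a session."""

    @abstractmethod
    async def save(self, session_id: str, messages: list[str]) -> None:
        """Persist new messages for a session."""


class InMemoryStorageMiddleware(StorageContextMiddleware):
    """One concrete implementation; Redis or Cosmos variants would mirror it."""

    def __init__(self, source_id: str, *, load_messages: bool = True) -> None:
        super().__init__(source_id, load_messages=load_messages)
        self._store: dict[str, list[str]] = {}

    async def load(self, session_id: str) -> list[str]:
        return list(self._store.get(session_id, []))

    async def save(self, session_id: str, messages: list[str]) -> None:
        self._store.setdefault(session_id, []).extend(messages)


async def demo() -> list[str]:
    memory = InMemoryStorageMiddleware("memory")
    await memory.save("s1", ["hello"])
    return await memory.load("s1")


result = asyncio.run(demo())
```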
| return session | ||
| ``` | ||
|
|
Copilot
AI
Feb 2, 2026
The SessionContext docstring refers to add_context_messages(), but the class defines add_messages() as the API for adding context messages. Please update the docstring to match the actual method name to avoid confusion.
| metadata: dict[str, Any] | None = None, | ||
| ): | ||
| self.session_id = session_id | ||
| self.service_session_id = service_session_id | ||
| self.input_messages = input_messages | ||
| self.context_messages: dict[str, list[ChatMessage]] = context_messages or {} | ||
| self.instructions: list[str] = instructions or [] | ||
| self.tools: list[ToolProtocol] = tools or [] |
Copilot
AI
Feb 2, 2026
In the code sample, ContextMiddlewareFactory / ContextMiddlewareConfig reference ContextMiddleware before the ContextMiddleware class is defined. As written, this would raise a NameError at runtime unless you use from __future__ import annotations or quote the type / move these aliases below the class definition.
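A minimal sketch of the quoting fix, assuming the aliases are plain module-level assignments as in the ADR sample. Because an alias assignment is an ordinary expression rather than an annotation, deferred annotation evaluation on its own would likely not cover it; quoting the name (or moving the aliases below the class) does. The `ContextMiddleware` body and `make_rag` are hypothetical stand-ins.

```python
from typing import Callable, Optional, Union

# The class name is quoted, so these aliases can be evaluated before the
# class exists. typing.Union is used because "str | type" is not valid.
ContextMiddlewareFactory = Callable[[Optional[str]], "ContextMiddleware"]
ContextMiddlewareConfig = Union["ContextMiddleware", ContextMiddlewareFactory]


class ContextMiddleware:
    """Hypothetical stand-in for the ADR's base class."""

    def __init__(self, source_id: str) -> None:
        self.source_id = source_id


def make_rag(session_id: Optional[str]) -> ContextMiddleware:
    """Hypothetical per-session factory matching ContextMiddlewareFactory."""
    return ContextMiddleware(f"rag-{session_id}")


factory: ContextMiddlewareFactory = make_rag
```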
| tools: The tools to add | ||
| """ | ||
| for tool in tools: | ||
| # Add source attribution to tool metadata | ||
| if hasattr(tool, 'metadata') and isinstance(tool.metadata, dict): | ||
| tool.metadata["context_source"] = source_id | ||
| self.tools.extend(tools) | ||
|
|
||
| # --- Methods for reading context --- |
Copilot
AI
Feb 2, 2026
The example in the ContextMiddleware docstring has incorrect indentation/structure: the “POST-PROCESSING” block appears nested under the factory function, not inside process(), which makes the example invalid and hard to follow. Please fix the snippet structure so the post-processing example is shown within process() after await next(context).
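The structure being asked for can be sketched as follows, with both phases inside `process()` and the post-processing placed after `await next(context)`. The `SessionContext` stand-in, the string messages, and `fake_agent` are assumptions made so the snippet runs on its own; they are not the ADR's definitions.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """Hypothetical, trimmed-down stand-in for the ADR's SessionContext."""

    input_messages: list[str]
    context_messages: dict[str, list[str]] = field(default_factory=dict)
    response_messages: list[str] = field(default_factory=list)


class NoteMiddleware:
    def __init__(self, source_id: str) -> None:
        self.source_id = source_id
        self.seen_responses: list[str] = []

    async def process(self, context: SessionContext, next) -> None:
        # PRE-PROCESSING: runs before the rest of the pipeline and the agent
        context.context_messages.setdefault(self.source_id, []).append("note")

        await next(context)

        # POST-PROCESSING: still inside process(), after next() has returned
        self.seen_responses = list(context.response_messages)


async def fake_agent(context: SessionContext) -> None:
    """Stands in for the innermost call that produces the response."""
    context.response_messages.append("reply")


middleware = NoteMiddleware("notes")
ctx = SessionContext(input_messages=["hi"])
asyncio.run(middleware.process(ctx, fake_agent))
```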
| 3. Response messages (if include_response=True) | ||
|
|
||
| Args: | ||
| include_input: If True, append input_messages after context | ||
| include_response: If True, append response_messages at the end |
Copilot
AI
Feb 2, 2026
The ContextMiddleware.process() docstring references context.history_messages, but SessionContext does not define a history_messages attribute (history appears to be stored under context_messages keyed by source_id). Please update the docstring to reference the correct API/field so implementers know where to read loaded history from.
| UserWarning | ||
| ) | ||
|
|
||
| async def session_created(self, session_id: str | None) -> None: | ||
| """Notify all middleware that a session was created.""" | ||
| for middleware in self._middleware: | ||
| await middleware.session_created(session_id) | ||
|
|
||
| async def execute(self, context: SessionContext) -> None: | ||
| """Execute the middleware pipeline.""" |
Copilot
AI
Feb 2, 2026
In the AgentSession sample, _ensure_default_storage() uses len(self._context_pipeline) and calls self._context_pipeline.prepend(...), but the provided ContextMiddlewarePipeline sample does not define __len__ or prepend. Please either add these APIs to the pipeline sample or adjust the sample logic (e.g., check self._context_pipeline is None / expose middleware length) to keep the ADR’s code consistent.
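If the first route is taken, the two missing members could look roughly like this. It is a sketch under the assumption that the pipeline keeps its middleware in a list named `_middleware`, as the `session_created` loop in the sample suggests; the string placeholder stands in for a real middleware instance.

```python
class ContextMiddlewarePipeline:
    """Trimmed-down sketch; only the members discussed here are shown."""

    def __init__(self, middleware=None) -> None:
        self._middleware = list(middleware or [])

    def __len__(self) -> int:
        # Lets callers write len(pipeline) when checking for an empty pipeline
        return len(self._middleware)

    def prepend(self, middleware) -> None:
        # Inserts at the front so the new middleware becomes the outermost layer
        self._middleware.insert(0, middleware)


pipeline = ContextMiddlewarePipeline()
if len(pipeline) == 0:
    pipeline.prepend("in-memory-storage")  # placeholder for a middleware instance
```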
| Default storage behavior (applied at runtime, not init): | ||
| - If service_session_id is set: service handles storage, no default added | ||
| - If options.store=True: user expects service storage, no default added | ||
| - If no service_session_id AND store is not True AND no pipeline: | ||
| InMemoryStorageMiddleware is automatically added | ||
|
|
Copilot
AI
Feb 2, 2026
In the ChatAgent sample, context_middleware is typed as Sequence[ContextMiddleware], but earlier in the ADR the configuration type is ContextMiddlewareConfig = ContextMiddleware | ContextMiddlewareFactory and ContextMiddlewarePipeline.from_config() expects configs (instances or factories). Please update the sample signature/type to Sequence[ContextMiddlewareConfig] (or similar) so it matches the proposed API.
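To illustrate why the wider type matters, here is a rough sketch of how a sequence of configs (instances or factories) might be resolved per session. `resolve_configs` is a hypothetical helper standing in for whatever `ContextMiddlewarePipeline.from_config()` does internally, and the `ContextMiddleware` class is a stub.

```python
from typing import Callable, Optional, Sequence, Union


class ContextMiddleware:
    """Hypothetical stand-in for the ADR's base class."""

    def __init__(self, source_id: str) -> None:
        self.source_id = source_id


ContextMiddlewareFactory = Callable[[Optional[str]], ContextMiddleware]
ContextMiddlewareConfig = Union[ContextMiddleware, ContextMiddlewareFactory]


def resolve_configs(
    configs: Sequence[ContextMiddlewareConfig],
    session_id: Optional[str],
) -> list[ContextMiddleware]:
    """Keep shared instances as-is; call factories to get per-session ones."""
    resolved: list[ContextMiddleware] = []
    for config in configs:
        if isinstance(config, ContextMiddleware):
            resolved.append(config)
        else:
            resolved.append(config(session_id))
    return resolved


shared = ContextMiddleware("memory")
resolved = resolve_configs(
    [shared, lambda sid: ContextMiddleware(f"rag-{sid}")],
    "s1",
)
```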
| await next(context) | ||
|
|
||
| # Post-processing | ||
| await self.store_interaction(context.input_messages, context.response_messages) |
Can you modify response_messages here? If you do, do those modifications get returned to the callers, both middleware higher in the stack and the user?
I think the intent will be (and I'll clarify) that this is not the place to alter the responses, only to read them and act on them; AgentMiddleware can be used for modifying responses.
| InMemoryStorageMiddleware("memory"), | ||
| RAGContextMiddleware("rag"), |
If the user wanted to use only the user input for a RAG search, rather than the input plus the chat history, how would they filter to only the user input?
Each middleware's setup has controls over which messages it wants to use, and since messages from other context providers are stored separately, this is fully configurable.
Is there an example of that?
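A minimal sketch of what input-only filtering could look like, assuming history loaded by a storage middleware lives in `context_messages` under its own source id. `SimpleRAGMiddleware`, its `retrieve` method, and the trimmed-down `SessionContext` are hypothetical stand-ins, not the ADR's code.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """Hypothetical, trimmed-down stand-in for the ADR's SessionContext."""

    input_messages: list[str]
    context_messages: dict[str, list[str]] = field(default_factory=dict)


class SimpleRAGMiddleware:
    def __init__(self, source_id: str) -> None:
        self.source_id = source_id

    def retrieve(self, query: str) -> str:
        # Placeholder for a real search call
        return f"docs for: {query}"

    async def process(self, context: SessionContext, next) -> None:
        # Build the query from input_messages only; history stored under
        # other source ids in context_messages is deliberately not read.
        query = " ".join(context.input_messages)
        context.context_messages[self.source_id] = [self.retrieve(query)]
        await next(context)


async def terminal(context: SessionContext) -> None:
    """Stands in for the rest of the pipeline."""


ctx = SessionContext(
    input_messages=["What is an ADR?"],
    context_messages={"memory": ["earlier turn"]},
)
asyncio.run(SimpleRAGMiddleware("rag").process(ctx, terminal))
```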
| - No `service_session_id` (service not managing storage) | ||
| - `options.store` is not `True` (user not expecting service storage) | ||
| - Pipeline is empty or None |
If the user only supplies regular context middleware and no storage context middleware, does the user get no chat history storage?
So we can go two routes here, both only for when there is no service_session_id and store==False:
- If the pipeline is present but does not have StorageMiddleware -> Add InMemoryStorage
- If there is NO middleware(pipeline) -> Add pipeline with InMemoryStorage
The first might be easier for getting started, but the second is clearer: if you already have a pipeline with middleware, the order suddenly matters, so we should clarify that in that case the user should add storage themselves, because we do not want to make assumptions about ordering. This makes sure the simple case always works, but once people start adding their own middleware, they should know what they want, and thus I think approach two is the way to go.
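Approach two can be expressed as a small predicate. This is a sketch only; `needs_default_storage` and its parameter names are hypothetical, and it follows the conditions quoted from the ADR (no `service_session_id`, `store` is not `True`, and no middleware configured).

```python
from typing import Optional, Sequence


def needs_default_storage(
    service_session_id: Optional[str],
    store: Optional[bool],
    middleware: Optional[Sequence[object]],
) -> bool:
    """Return True only when in-memory storage should be added automatically."""
    if service_session_id is not None:
        return False  # the service manages storage
    if store is True:
        return False  # the user expects service-side storage
    # Approach two: add the default only when no middleware is configured at all
    return not middleware
```

Under this rule, a user who configures only a RAG middleware gets no automatic history storage and has to add a storage middleware explicitly, at the position they choose.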
| - No composability for context providers | ||
| - Inconsistent with middleware pattern used elsewhere | ||
|
|
||
| ### Option 2: ContextMiddleware (Chosen) |
Thanks for this ADR! The unified ContextMiddleware pattern is a nice improvement for context organization and attribution.
One question: would this design support implementing context compaction strategies?
For example, a common need is letting agents run for arbitrarily long periods by automatically compacting/replacing the message history when a max context budget is hit. E.g., an agent run makes 10s or 100s of tool calls and model calls in succession and may need to compact context between those calls.
This is challenging to do today because:
- ChatMessageStore.list_messages() is only called once at the start of agent.run(), not during the tool loop
- ChatMiddleware operates on a copy of messages, so modifications don't persist across tool loop iterations
see some related notes here.
With the new ContextMiddleware, would it be possible to:
- Have middleware run during tool loop iterations (not just at the invocation boundary)?
- Allow a compaction strategy to truly replace the context mid-execution?
If this is out of scope for this ADR, it might be worth noting as a future consideration, or confirming whether the new architecture would make such a feature easier to add later (or whether there is some other recommended pattern for this).
…dback
- Add Option 3: ContextHooks with before_run/after_run pattern
- Add detailed pros/cons for both wrapper and hooks approaches
- Add Open Discussion section on context compaction strategies
- Clarify response_messages is read-only (use AgentMiddleware for modifications)
- Add SimpleRAG examples showing input-only filtering
- Clarify default storage only added when NO middleware configured
- Add RAGWithBuffer examples for self-managed history
- Rename hook methods to before_run/after_run
e005fc3 to
761f2bf
Summary
This ADR proposes unifying `ContextProvider`, `ChatMessageStore`, and `AgentThread` into a single `ContextMiddleware` concept for the Python SDK.
Problem
Currently, developers doing 'Context Engineering' must understand multiple abstractions:
Proposed Solution
A unified `ContextMiddleware` using the onion/wrapper pattern (like existing `AgentMiddleware`):
```python
class RAGMiddleware(ContextMiddleware):
    async def process(self, context: SessionContext, next) -> None:
        # Pre-processing: add retrieved documents as context for this run
        docs = await self.retrieve_documents(context.input_messages[-1].text)
        context.add_messages(self.source_id, [ChatMessage.system(f'Context: {docs}')])
        # Hand off to the rest of the pipeline (and ultimately the agent)
        await next(context)


agent = ChatAgent(
    chat_client=client,
    context_middleware=[
        InMemoryStorageMiddleware('memory'),
        RAGMiddleware('rag'),
    ]
)
```
Key Decisions
Migration Impact
See the full ADR in `docs/decisions/00XX-python-context-middleware.md` for detailed design, code examples, and implementation workplan.
Related Issues
This ADR addresses the following issues: