Member

@eavanvalkenburg eavanvalkenburg commented Feb 2, 2026

Summary

This ADR proposes unifying `ContextProvider`, `ChatMessageStore`, and `AgentThread` into a single `ContextMiddleware` concept for the Python SDK.

Problem

Currently, developers doing 'Context Engineering' must understand multiple abstractions:

  • `ContextProvider` - Injects instructions, messages, and tools
  • `ChatMessageStore` - Stores and retrieves conversation history
  • `AgentThread` - Coordinates between them

Proposed Solution

A unified `ContextMiddleware` using the onion/wrapper pattern (like the existing `AgentMiddleware`):

```python
class RAGMiddleware(ContextMiddleware):
    async def process(self, context: SessionContext, next) -> None:
        # Pre-processing: add context
        docs = await self.retrieve_documents(context.input_messages[-1].text)
        context.add_messages(self.source_id, [ChatMessage.system(f'Context: {docs}')])

        await next(context)

        # Post-processing: store/audit
        await self.store_interaction(context.input_messages, context.response_messages)

agent = ChatAgent(
    chat_client=client,
    context_middleware=[
        InMemoryStorageMiddleware('memory'),
        RAGMiddleware('rag'),
    ]
)
```
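
To make the onion/wrapper ordering concrete, here is a minimal self-contained sketch. `TraceContext`, `Tagger`, and `run_pipeline` are hypothetical stand-ins written for illustration, not the SDK's classes; the sketch only assumes that each middleware receives the context plus a `next` callable, as in the snippet above.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class TraceContext:
    """Hypothetical stand-in for SessionContext; it only records call order."""
    trace: list[str] = field(default_factory=list)


NextFn = Callable[[TraceContext], Awaitable[None]]


class Tagger:
    """Hypothetical middleware that logs before and after calling next."""

    def __init__(self, source_id: str) -> None:
        self.source_id = source_id

    async def process(self, context: TraceContext, next: NextFn) -> None:
        context.trace.append(f"{self.source_id}:pre")
        await next(context)
        context.trace.append(f"{self.source_id}:post")


async def run_pipeline(middleware: list[Tagger], context: TraceContext) -> None:
    async def terminal(ctx: TraceContext) -> None:
        ctx.trace.append("invoke")  # where the model call would happen

    handler: NextFn = terminal
    # Wrap from the inside out so the first middleware ends up outermost.
    for mw in reversed(middleware):
        def make(mw: Tagger = mw, inner: NextFn = handler) -> NextFn:
            async def call(ctx: TraceContext) -> None:
                await mw.process(ctx, inner)
            return call
        handler = make()
    await handler(context)


ctx = TraceContext()
asyncio.run(run_pipeline([Tagger("memory"), Tagger("rag")], ctx))
print(ctx.trace)
# ['memory:pre', 'rag:pre', 'invoke', 'rag:post', 'memory:post']
```

Pre-processing runs in list order and post-processing in reverse, which is why the position of a storage middleware in the list matters.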

Key Decisions

| Decision | Rationale |
|---|---|
| Onion/wrapper pattern | Familiar from existing middleware, natural pre/post processing |
| Agent owns config, Session owns pipeline | Enables per-session factories |
| Mandatory `source_id` | Attribution in `context_messages` dict |
| Default `InMemoryStorageMiddleware` | Zero-config conversation history |
| Single `StorageContextMiddleware` class | Configure for memory/audit/evaluation |
| Clean break (no shims) | We're in preview |

Migration Impact

| Current | New |
|---|---|
| `ContextProvider` | `ContextMiddleware` |
| `ChatMessageStore` | `StorageContextMiddleware` |
| `AgentThread` | `AgentSession` |

See the full ADR in `docs/decisions/00XX-python-context-middleware.md` for detailed design, code examples, and implementation workplan.

Related Issues

This ADR addresses the following issues:

Copilot AI review requested due to automatic review settings February 2, 2026 09:24
@markwallace-microsoft markwallace-microsoft added the documentation Improvements or additions to documentation label Feb 2, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a new ADR proposing a unified ContextMiddleware abstraction for Python to replace ContextProvider, ChatMessageStore, and AgentThread, using an onion/wrapper middleware pipeline pattern.

Changes:

  • Introduces a proposed ContextMiddleware + SessionContext + pipeline design for composable context engineering.
  • Describes a StorageContextMiddleware approach for loading/storing conversation history and optional auditing.
  • Outlines migration impact and a phased implementation/testing plan.

Comment on lines +1 to +10
---
# These are optional elements. Feel free to remove any of them.
status: proposed
contact: eavanvalkenburg
date: 2026-02-02
deciders: eavanvalkenburg, markwallace-microsoft, sphenry, alliscode, johanst, brettcannon
consulted: taochenosu, moonbox3, dmytrostruk, giles17
---

# Unifying Context Management with ContextMiddleware

Copilot AI Feb 2, 2026

This ADR is using the placeholder number “00XX” in the filename/path. Per docs/decisions/README.md (step 1), ADRs should be named with the next sequential number (currently 0015-…). Please rename the file accordingly and update any references to the old filename.

Copilot uses AI. Check for mistakes.
| 2 | **Instance or Factory** | Middleware can be shared instances or `(session_id) -> Middleware` factories for per-session state. |
| 3 | **Default Storage at Runtime** | `InMemoryStorageMiddleware` auto-added when no service_session_id, store≠True, and no pipeline. Evaluated at runtime so users can modify pipeline first. |
| 4 | **Multiple Storage Allowed** | Warn if multiple have `load_messages=True` (likely misconfiguration). |
| 5 | **Single Storage Class** | One `StorageContextMiddleware` configured for memory/audit/evaluation - no separate classes. |
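
Row 4 above (warn when several storage middleware have `load_messages=True`) could be checked roughly like this. This is a sketch under assumptions: `StorageStub` and `warn_on_multiple_loaders` are hypothetical names, and only the `load_messages` flag and the use of `UserWarning` come from the ADR text.

```python
import warnings
from dataclasses import dataclass


@dataclass
class StorageStub:
    """Hypothetical stand-in for a storage middleware's relevant settings."""
    source_id: str
    load_messages: bool = True


def warn_on_multiple_loaders(middleware: list[StorageStub]) -> list[str]:
    """Return the source_ids that load history, warning if there is more than one."""
    loaders = [m.source_id for m in middleware if m.load_messages]
    if len(loaders) > 1:
        warnings.warn(
            f"Multiple storage middleware load messages: {loaders}",
            UserWarning,
            stacklevel=2,
        )
    return loaders
```

A write-only audit store (`load_messages=False`) next to one loading store would pass without a warning.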

Copilot AI Feb 2, 2026

Design decision #5 says “Single Storage Class… no separate classes”, but later examples and the implementation plan reference multiple storage middleware classes (e.g., InMemoryStorageMiddleware, RedisStorageMiddleware, CosmosStorageMiddleware). Please reconcile this wording (e.g., “single storage middleware abstraction/base class with multiple implementations”) so the ADR is internally consistent.

Suggested change
| 5 | **Single Storage Class** | One `StorageContextMiddleware` configured for memory/audit/evaluation - no separate classes. |
| 5 | **Single Storage Abstraction, Multiple Implementations** | One unified storage middleware abstraction (e.g., `StorageContextMiddleware`-style) with multiple concrete implementations (in-memory, Redis, Cosmos) configured for memory/audit/evaluation. |
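
One way to read "single abstraction, multiple implementations" is sketched below. `StorageSketch` and `InMemorySketch` are hypothetical names, the synchronous `load`/`save` methods are an assumption made to keep the example short, and none of this is the ADR's actual class design.

```python
from abc import ABC, abstractmethod


class StorageSketch(ABC):
    """Hypothetical base: one abstraction that every backend implements."""

    def __init__(self, source_id: str, *, load_messages: bool = True) -> None:
        self.source_id = source_id
        self.load_messages = load_messages

    @abstractmethod
    def load(self, session_id: str) -> list[str]:
        """Return stored messages for a session."""

    @abstractmethod
    def save(self, session_id: str, messages: list[str]) -> None:
        """Append messages for a session."""


class InMemorySketch(StorageSketch):
    """Hypothetical in-memory backend; Redis or Cosmos would follow the same shape."""

    def __init__(self, source_id: str, *, load_messages: bool = True) -> None:
        super().__init__(source_id, load_messages=load_messages)
        self._store: dict[str, list[str]] = {}

    def load(self, session_id: str) -> list[str]:
        return list(self._store.get(session_id, []))

    def save(self, session_id: str, messages: list[str]) -> None:
        self._store.setdefault(session_id, []).extend(messages)


memory = InMemorySketch("memory")
audit = InMemorySketch("audit", load_messages=False)  # write-only, e.g. for auditing
```

The memory/audit/evaluation distinction then lives in configuration flags, while the backend choice lives in the subclass.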

Comment on lines +406 to +1082
return session
```

Copilot AI Feb 2, 2026

The SessionContext docstring refers to add_context_messages(), but the class defines add_messages() as the API for adding context messages. Please update the docstring to match the actual method name to avoid confusion.

Comment on lines +553 to +1457
        metadata: dict[str, Any] | None = None,
    ):
        self.session_id = session_id
        self.service_session_id = service_session_id
        self.input_messages = input_messages
        self.context_messages: dict[str, list[ChatMessage]] = context_messages or {}
        self.instructions: list[str] = instructions or []
        self.tools: list[ToolProtocol] = tools or []

Copilot AI Feb 2, 2026

In the code sample, ContextMiddlewareFactory / ContextMiddlewareConfig reference ContextMiddleware before the ContextMiddleware class is defined. As written, this would raise a NameError at runtime unless you use from __future__ import annotations or quote the type / move these aliases below the class definition.
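
A small sketch of the forward-reference problem and two of the fixes mentioned, using the hypothetical names `LateMiddleware`, `EarlyFactory`, `QuotedFactory`, `Factory`, and `Config`. One caveat, stated as my understanding rather than something the ADR says: a type alias is an ordinary assignment evaluated at import time, and `from __future__ import annotations` only defers annotations, so on its own it would not rescue an alias placed above the class.

```python
from typing import Callable, Optional, Union, get_args

try:
    # Evaluated immediately, so the not-yet-defined class name fails here.
    EarlyFactory = Callable[[Optional[str]], LateMiddleware]
    early_alias_failed = False
except NameError:
    early_alias_failed = True


class LateMiddleware:
    """Hypothetical stand-in for ContextMiddleware."""


# Fix 1: quote the name so typing stores a forward reference instead.
QuotedFactory = Callable[[Optional[str]], "LateMiddleware"]

# Fix 2: define the aliases below the class they refer to.
Factory = Callable[[Optional[str]], LateMiddleware]
Config = Union[LateMiddleware, Factory]

print(early_alias_failed)               # True
print(get_args(Factory)[-1].__name__)   # LateMiddleware
```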

Comment on lines +602 to +1507
            tools: The tools to add
        """
        for tool in tools:
            # Add source attribution to tool metadata
            if hasattr(tool, 'metadata') and isinstance(tool.metadata, dict):
                tool.metadata["context_source"] = source_id
        self.tools.extend(tools)

    # --- Methods for reading context ---

Copilot AI Feb 2, 2026

The example in the ContextMiddleware docstring has incorrect indentation/structure: the “POST-PROCESSING” block appears nested under the factory function, not inside process(), which makes the example invalid and hard to follow. Please fix the snippet structure so the post-processing example is shown within process() after await next(context).

Comment on lines +648 to +1549
        3. Response messages (if include_response=True)

        Args:
            include_input: If True, append input_messages after context
            include_response: If True, append response_messages at the end

Copilot AI Feb 2, 2026

The ContextMiddleware.process() docstring references context.history_messages, but SessionContext does not define a history_messages attribute (history appears to be stored under context_messages keyed by source_id). Please update the docstring to reference the correct API/field so implementers know where to read loaded history from.
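
To illustrate where loaded history would be read from if there is no `history_messages` attribute, here is a sketch in which history sits under `context_messages[source_id]` and a `get_messages()` helper returns context, then input, then response messages. `ContextSketch` is a hypothetical stand-in; the `False` defaults for both flags and the use of plain strings as messages are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ContextSketch:
    """Hypothetical stand-in for SessionContext, with strings as messages."""
    input_messages: list[str]
    context_messages: dict[str, list[str]] = field(default_factory=dict)
    response_messages: list[str] = field(default_factory=list)

    def add_messages(self, source_id: str, messages: list[str]) -> None:
        # Messages are attributed to the middleware that added them.
        self.context_messages.setdefault(source_id, []).extend(messages)

    def get_messages(
        self, *, include_input: bool = False, include_response: bool = False
    ) -> list[str]:
        # 1. context messages in insertion order, 2. input, 3. response
        result = [m for msgs in self.context_messages.values() for m in msgs]
        if include_input:
            result.extend(self.input_messages)
        if include_response:
            result.extend(self.response_messages)
        return result


ctx = ContextSketch(input_messages=["What about Cosmos?"])
ctx.add_messages("memory", ["Earlier question", "Earlier answer"])
ctx.add_messages("rag", ["Context: Cosmos doc"])

history = ctx.context_messages["memory"]  # loaded history, keyed by source_id
print(ctx.get_messages(include_input=True))
```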

Comment on lines +1011 to +1917
                UserWarning
            )

    async def session_created(self, session_id: str | None) -> None:
        """Notify all middleware that a session was created."""
        for middleware in self._middleware:
            await middleware.session_created(session_id)

    async def execute(self, context: SessionContext) -> None:
        """Execute the middleware pipeline."""

Copilot AI Feb 2, 2026

In the AgentSession sample, _ensure_default_storage() uses len(self._context_pipeline) and calls self._context_pipeline.prepend(...), but the provided ContextMiddlewarePipeline sample does not define __len__ or prepend. Please either add these APIs to the pipeline sample or adjust the sample logic (e.g., check self._context_pipeline is None / expose middleware length) to keep the ADR’s code consistent.
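
A sketch of the first option (adding the missing APIs), assuming the pipeline keeps its middleware in a private list. `PipelineSketch` and `ensure_default_storage` are hypothetical names, not the ADR's code.

```python
from typing import Callable, Iterable, Optional


class PipelineSketch:
    """Hypothetical pipeline exposing the two members the session sample relies on."""

    def __init__(self, middleware: Optional[Iterable[object]] = None) -> None:
        self._middleware: list[object] = list(middleware or [])

    def __len__(self) -> int:
        return len(self._middleware)

    def prepend(self, middleware: object) -> None:
        # Put the middleware first so it becomes the outermost layer.
        self._middleware.insert(0, middleware)


def ensure_default_storage(
    pipeline: Optional[PipelineSketch],
    default_factory: Callable[[], object],
) -> PipelineSketch:
    """Add a default storage middleware only when nothing is configured."""
    if pipeline is None:
        pipeline = PipelineSketch()
    if len(pipeline) == 0:
        pipeline.prepend(default_factory())
    return pipeline
```

One side effect to keep in mind: once `__len__` is defined, an empty pipeline is falsy, so `if not pipeline` no longer distinguishes "no pipeline" from "empty pipeline". The sketch checks `is None` explicitly for that reason.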

Comment on lines +1068 to +1970
Default storage behavior (applied at runtime, not init):
- If service_session_id is set: service handles storage, no default added
- If options.store=True: user expects service storage, no default added
- If no service_session_id AND store is not True AND no pipeline:
InMemoryStorageMiddleware is automatically added

Copilot AI Feb 2, 2026

In the ChatAgent sample, context_middleware is typed as Sequence[ContextMiddleware], but earlier in the ADR the configuration type is ContextMiddlewareConfig = ContextMiddleware | ContextMiddlewareFactory and ContextMiddlewarePipeline.from_config() expects configs (instances or factories). Please update the sample signature/type to Sequence[ContextMiddlewareConfig] (or similar) so it matches the proposed API.
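
For reference, accepting "instances or factories" could be resolved along these lines. `MiddlewareSketch`, `FactorySketch`, `ConfigSketch`, and `resolve_configs` are hypothetical names; the only thing taken from the ADR is that a factory has the shape `(session_id) -> Middleware`.

```python
from typing import Callable, Optional, Sequence, Union


class MiddlewareSketch:
    """Hypothetical stand-in for ContextMiddleware."""

    def __init__(self, source_id: str, session_id: Optional[str] = None) -> None:
        self.source_id = source_id
        self.session_id = session_id


FactorySketch = Callable[[Optional[str]], MiddlewareSketch]
ConfigSketch = Union[MiddlewareSketch, FactorySketch]


def resolve_configs(
    configs: Sequence[ConfigSketch], session_id: Optional[str]
) -> list[MiddlewareSketch]:
    """Turn a mixed list of instances and factories into instances for one session."""
    resolved: list[MiddlewareSketch] = []
    for config in configs:
        # Check for instances first, in case a middleware class is itself callable.
        if isinstance(config, MiddlewareSketch):
            resolved.append(config)  # shared across sessions
        elif callable(config):
            resolved.append(config(session_id))  # fresh per-session state
        else:
            raise TypeError(f"Unsupported middleware config: {config!r}")
    return resolved


shared = MiddlewareSketch("memory")
per_session = lambda sid: MiddlewareSketch("rag", session_id=sid)
first = resolve_configs([shared, per_session], "session-1")
second = resolve_configs([shared, per_session], "session-2")
```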

        await next(context)

        # Post-processing
        await self.store_interaction(context.input_messages, context.response_messages)
Contributor

Can you modify response_messages here? If you do, do those modifications get returned to the callers, both middleware higher in the stack and the user?

Member Author

I think the intent will be (and I'll clarify this) that this is not the place to alter the responses, only to read them and act on them; AgentMiddleware can be used for altering responses.

Comment on lines +207 to +838
        InMemoryStorageMiddleware("memory"),
        RAGContextMiddleware("rag"),
Contributor

If the user only wanted to use the user input to do a rag search, rather than that plus the chat history, how do they filter to only user input?

Member Author

In the setup of each middleware there are controls over which messages it uses, and since messages from other context providers are stored separately, that is fully configurable.

Contributor

Is there an example of that?
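
A rough sketch of what input-only filtering might look like, assuming history added by other middleware lives under their own `source_id` keys in `context_messages`. `Msg`, `RagContext`, `InputOnlyRAG`, and the keyword-lookup "retrieval" are hypothetical and exist only to keep the example runnable.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class Msg:
    role: str
    text: str


@dataclass
class RagContext:
    """Hypothetical stand-in for SessionContext."""
    input_messages: list[Msg]
    context_messages: dict[str, list[Msg]] = field(default_factory=dict)


class InputOnlyRAG:
    def __init__(self, source_id: str, corpus: dict[str, str]) -> None:
        self.source_id = source_id
        self.corpus = corpus  # keyword -> document text

    async def process(
        self, context: RagContext, next: Callable[[RagContext], Awaitable[None]]
    ) -> None:
        # Build the query from this turn's user input only; history stored by
        # other middleware under their own source_id is deliberately ignored.
        query = " ".join(
            m.text for m in context.input_messages if m.role == "user"
        ).lower()
        hits = [doc for keyword, doc in self.corpus.items() if keyword in query]
        if hits:
            context.context_messages.setdefault(self.source_id, []).append(
                Msg("system", "Context: " + " ".join(hits))
            )
        await next(context)


async def _noop(context: RagContext) -> None:
    """Stands in for the rest of the pipeline."""


ctx = RagContext(
    input_messages=[Msg("user", "How do I configure Cosmos?")],
    context_messages={"memory": [Msg("user", "Earlier we talked about Redis.")]},
)
rag = InputOnlyRAG("rag", {"redis": "Redis doc", "cosmos": "Cosmos doc"})
asyncio.run(rag.process(ctx, _noop))
print(ctx.context_messages["rag"][0].text)  # Context: Cosmos doc
```

The earlier mention of Redis in the `memory` history does not influence the search, because the query is built from `input_messages` alone.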

Comment on lines 295 to 297
- No `service_session_id` (service not managing storage)
- `options.store` is not `True` (user not expecting service storage)
- Pipeline is empty or None
Contributor

If the user only supplies regular context middleware and no storage context middleware, does the user get no chat history storage?

Member Author

So we can go two routes here, both only for when there is no service_session_id and store==False:

  1. If the pipeline is present but does not have StorageMiddleware -> Add InMemoryStorage
  2. If there is NO middleware(pipeline) -> Add pipeline with InMemoryStorage

The first might be easier for getting started, but the second is clearer: if you already have a pipeline with middleware, the order suddenly matters, so we should clarify that in that case the user should add storage themselves, because we do not want to make assumptions about ordering. This makes sure that the simple case always works, but once people start adding their own middleware, they should know what they want to do, so I think approach two is the way to go.
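
Approach two boils down to a small predicate. This is a sketch with a hypothetical function name; the three conditions are the ones listed in the ADR excerpt above.

```python
from typing import Optional, Sequence


def should_add_default_storage(
    service_session_id: Optional[str],
    store: Optional[bool],
    pipeline: Optional[Sequence[object]],
) -> bool:
    """Approach two: add in-memory storage only when no middleware is configured."""
    if service_session_id is not None:
        return False  # the service manages storage
    if store is True:
        return False  # the user expects service-side storage
    return pipeline is None or len(pipeline) == 0


print(should_add_default_storage(None, None, None))     # True
print(should_add_default_storage(None, None, ["rag"]))  # False
```

With a user-supplied pipeline the function returns `False`, so a user who adds only a RAG middleware gets no history storage unless they add a storage middleware themselves.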

- No composability for context providers
- Inconsistent with middleware pattern used elsewhere

### Option 2: ContextMiddleware (Chosen)
Contributor

@victordibia victordibia Feb 4, 2026

Thanks for this ADR! The unified ContextMiddleware pattern is a nice improvement for context organization and attribution.

One question: would this design support implementing context compaction strategies?

For example, a common need is letting agents run for arbitrarily long periods by automatically compacting/replacing the message history when a max context budget is hit. E.g., an agent run makes 10s or 100s of tool calls and model calls in succession and may need to compact context between those calls.

This is challenging to do today because:

  • ChatMessageStore.list_messages() is only called once at the start of agent.run(), not during the tool loop
  • ChatMiddleware operates on a copy of messages, so modifications don't persist across tool loop iterations

see some related notes here.

With the new ContextMiddleware, would it be possible to:

  • Have middleware run during tool loop iterations (not just at the invocation boundary)?
  • Allow a compaction strategy to truly replace the context mid-execution?

If this is out of scope for this ADR, it might be worth noting as a future consideration, or confirming whether the new architecture would make such a feature easier to add later (or whether there is some other recommended pattern for this).
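
For concreteness, the kind of compaction step meant here could be as simple as the sketch below. It is purely illustrative: `compact` is a hypothetical helper, message size is approximated by character count, and whether ContextMiddleware could invoke something like this between tool-loop iterations is exactly the open question.

```python
def compact(messages: list[str], max_chars: int) -> list[str]:
    """Keep the newest messages that fit in max_chars; replace the rest with a marker."""
    if sum(len(m) for m in messages) <= max_chars:
        return list(messages)

    kept: list[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that would not fit.
    for message in reversed(messages):
        if used + len(message) > max_chars:
            break
        kept.append(message)
        used += len(message)
    kept.reverse()

    dropped = len(messages) - len(kept)
    # The marker itself is not counted against the budget in this sketch.
    return [f"[{dropped} earlier messages compacted]"] + kept


history = ["a" * 50, "b" * 30, "c" * 20]
print(compact(history, max_chars=60))
```

A real strategy would more likely summarize the dropped messages with a model call and count tokens rather than characters.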

@dmytrostruk @alliscode @markwallace-microsoft

…dback

- Add Option 3: ContextHooks with before_run/after_run pattern
- Add detailed pros/cons for both wrapper and hooks approaches
- Add Open Discussion section on context compaction strategies
- Clarify response_messages is read-only (use AgentMiddleware for modifications)
- Add SimpleRAG examples showing input-only filtering
- Clarify default storage only added when NO middleware configured
- Add RAGWithBuffer examples for self-managed history
- Rename hook methods to before_run/after_run
@eavanvalkenburg eavanvalkenburg force-pushed the adr-python-context-middleware branch from e005fc3 to 761f2bf Compare February 4, 2026 11:12
@github-actions github-actions bot changed the title ADR: Unifying Context Management with ContextMiddleware (Python) Python: ADR: Unifying Context Management with ContextMiddleware (Python) Feb 4, 2026

Labels

documentation Improvements or additions to documentation python

4 participants