eavanvalkenburg commented Jan 22, 2026

Motivation and Context

Summary

  • Migrate chat/agent telemetry to mixin-based usage and remove legacy decorators, with streaming telemetry now using finalizers/teardown hooks instead of consuming streams.
    • This makes the code a lot simpler to understand, because we can set attributes on the chat client in the __init__ of those mixins (which technically makes them not mixins)
    • Added those parameters to the constructors, making it easier to configure things like function calling
  • Replace function invocation decorators with FunctionInvokingChatClient/FunctionInvokingMixin across clients, tests, and samples; update docs/comments accordingly.
  • Introduce a ResponseStream object that unifies the streaming and non-streaming APIs.
    • It is generic over TUpdate and TFinal, which in our case are usually ChatResponseUpdate and ChatResponse, or the agent equivalents.
    • It features an update_hook mechanism that lets you run code while the internal stream is being unpacked; this is mostly useful for middleware.
    • It features a teardown hook mechanism that runs when the stream is exhausted; telemetry now uses it to record the duration.
    • It features a finalizer mechanism (one or more finalizers) that runs after the end of the stream and turns the list of updates into a final object. Middleware can use it, and function calling and telemetry already do.
    • In principle the ResponseStream is created by the lowest-level object, the actual chat client implementation, and ideally all the layers in between only use the hooks. Function calling does not work that way, because it makes multiple calls to the underlying chat client that all have to be combined into a single stream at runtime. Agent also creates a new stream, because it goes from ResponseStream[ChatResponseUpdate, ChatResponse] to ResponseStream[AgentResponseUpdate, AgentResponse]; the class has a classmethod called wrap that wraps the chat client's ResponseStream in the agent's new ResponseStream.
  • Overall, this change reduces the number of times we iterate the stream and return a new AsyncGenerator, so it should also improve performance a bit. The new hooks also make it simpler to create middleware that alters the stream (as the sample shows).
  • Removed use_instrumentation/use_agent_instrumentation and use_function_invocation decorators; mixins are now the supported path.
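
To make the hook, teardown, and finalizer roles described above concrete, here is a minimal sketch. It is not the framework's ResponseStream: the class name SketchResponseStream and the methods add_update_hook, add_teardown_hook, and get_final_response are hypothetical names chosen for illustration, and the real class may differ in both API and behavior.

```python
import asyncio
from collections.abc import AsyncIterator, Callable
from typing import Generic, TypeVar

TUpdate = TypeVar("TUpdate")
TFinal = TypeVar("TFinal")


class SketchResponseStream(Generic[TUpdate, TFinal]):
    """Hypothetical stand-in for the PR's ResponseStream (illustration only)."""

    def __init__(
        self,
        source: AsyncIterator[TUpdate],
        finalizer: Callable[[list[TUpdate]], TFinal],
    ) -> None:
        self._source = source
        self._finalizer = finalizer
        self._update_hooks: list[Callable[[TUpdate], TUpdate]] = []
        self._teardown_hooks: list[Callable[[], None]] = []
        self._updates: list[TUpdate] = []
        self._exhausted = False

    def add_update_hook(self, hook: Callable[[TUpdate], TUpdate]) -> None:
        self._update_hooks.append(hook)

    def add_teardown_hook(self, hook: Callable[[], None]) -> None:
        self._teardown_hooks.append(hook)

    def __aiter__(self) -> AsyncIterator[TUpdate]:
        return self._iterate()

    async def _iterate(self) -> AsyncIterator[TUpdate]:
        async for update in self._source:
            # Update hooks see (and may replace) each update as it passes through.
            for hook in self._update_hooks:
                update = hook(update)
            self._updates.append(update)
            yield update
        # Teardown hooks run once, when the underlying stream is exhausted.
        self._exhausted = True
        for teardown in self._teardown_hooks:
            teardown()

    async def get_final_response(self) -> TFinal:
        # Drain whatever is left, then fold the collected updates into one object.
        if not self._exhausted:
            async for _ in self:
                pass
        return self._finalizer(self._updates)


async def fake_chunks() -> AsyncIterator[str]:
    for text in ["Hel", "lo"]:
        yield text


async def demo() -> tuple[list[str], str, list[str]]:
    events: list[str] = []
    stream = SketchResponseStream(fake_chunks(), finalizer="".join)
    stream.add_update_hook(str.upper)
    stream.add_teardown_hook(lambda: events.append("teardown"))
    seen = [update async for update in stream]
    final = await stream.get_final_response()
    return seen, final, events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the stream is iterated only once: the hook transforms updates in flight, the teardown fires at exhaustion, and the finalizer reuses the updates that were already collected.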

Description

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings January 22, 2026 17:34
@markwallace-microsoft added the documentation (Improvements or additions to documentation) and python labels Jan 22, 2026
Copilot AI left a comment

Pull request overview

This PR consolidates the Python Agent Framework's streaming and non-streaming APIs into a unified interface.

Changes:

  • Unified run() and get_response() methods with stream parameter replacing separate run_stream() and get_streaming_response() methods
  • Migration from decorator-based (@use_instrumentation, @use_function_invocation) to mixin-based architecture for telemetry and function invocation
  • Introduction of ResponseStream class for unified stream handling with hooks, finalizers, and teardown support
  • Renamed AgentExecutionException to AgentRunException
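
As a rough picture of what the unified call shape means at a call site, the sketch below uses a hypothetical FakeAgent rather than any real framework class. It assumes that the non-streaming call is awaited and the streaming call is consumed with async for; the actual return types in the PR (for example ResponseStream) are not modeled here.

```python
import asyncio
from collections.abc import AsyncIterator


class FakeAgent:
    """Hypothetical agent that only mimics the run(stream=...) call shape."""

    def run(self, message: str, *, stream: bool = False):
        # One entry point; the flag selects an awaitable or an async iterator.
        if stream:
            return self._stream(message)
        return self._complete(message)

    async def _complete(self, message: str) -> str:
        return f"echo: {message}"

    async def _stream(self, message: str) -> AsyncIterator[str]:
        for word in f"echo: {message}".split():
            yield word


async def main() -> tuple[str, list[str]]:
    agent = FakeAgent()
    # Previously a separate run_stream() call; now the same method with stream=True.
    full = await agent.run("hi there")
    parts = [chunk async for chunk in agent.run("hi there", stream=True)]
    return full, parts


if __name__ == "__main__":
    print(asyncio.run(main()))
```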

Reviewed changes

Copilot reviewed 84 out of 85 changed files in this pull request and generated 28 comments.

File | Description
_types.py | Added ResponseStream class for unified streaming, updated prepare_messages to handle None
_clients.py | Refactored BaseChatClient with unified get_response() method, introduced FunctionInvokingChatClient mixin
openai/_responses_client.py | Consolidated streaming/non-streaming into single _inner_get_response() method
openai/_chat_client.py | Similar consolidation for chat completions API
openai/_assistants_client.py | Unified assistants API with stream parameter
_workflows/_workflow.py | Consolidated run() and run_stream() into single run(stream=bool) method
_workflows/_agent.py | Updated WorkflowAgent.run() to use stream parameter
Test files (multiple) | Updated all tests to use run(stream=True) and get_response(stream=True)
Sample files (multiple) | Updated samples to demonstrate new unified API
Provider clients | Updated all provider implementations (Azure, Anthropic, Bedrock, Ollama, etc.) to use mixins

@eavanvalkenburg force-pushed the python_single_response branch 3 times, most recently from 07afd46 to dd65afa January 23, 2026 10:46
markwallace-microsoft commented Jan 23, 2026

Python Test Coverage

Python Test Coverage Report
File | Stmts | Miss | Cover | Missing
packages/a2a/agent_framework_a2a
   _agent.py | 148 | 8 | 94% | 262, 400–401, 438–439, 468–470
packages/ag-ui/agent_framework_ag_ui
   _client.py | 155 | 21 | 86% | 83–84, 88–92, 96–100, 263, 295, 464–466, 481–484
   _run.py | 441 | 124 | 71% | 154–161, 304, 323–324, 339–340, 351, 354–355, 357–358, 360–362, 372, 382–385, 389–391, 393, 403, 406–409, 411–412, 415–421, 424–426, 429, 445–447, 454, 460–461, 463–464, 478–484, 495, 508, 510–511, 545–546, 603–605, 617–619, 643, 648–650, 766, 777–778, 785, 803–805, 839–841, 856, 862, 870, 872, 908–914, 917–920, 922–931, 934, 941–942, 947, 953–955, 968–970
   _types.py | 36 | 0 | 100% |
   _utils.py | 101 | 2 | 98% | 257, 262
packages/ag-ui/agent_framework_ag_ui/_orchestration
   _tooling.py | 57 | 0 | 100% |
packages/anthropic/agent_framework_anthropic
   _chat_client.py | 361 | 150 | 58% | 371, 403, 405, 420, 442–445, 454, 456, 487–491, 493, 495–496, 498, 503–504, 506, 539–540, 549, 551–552, 557, 574–575, 617, 632, 636–637, 653, 662, 664, 668–669, 712–714, 716, 729–730, 737–739, 743–745, 749–752, 763, 765, 787, 797, 819–825, 832–833, 841–842, 850–853, 860–861, 867–868, 874–875, 881, 889–891, 895, 902–903, 909–910, 916–917, 923, 931–934, 941–942, 961, 968–969, 988, 1010, 1012, 1021–1022, 1028, 1050–1051, 1057–1058, 1067–1077, 1084–1090, 1097–1103, 1110–1119, 1126–1129
packages/azure-ai/agent_framework_azure_ai
   _chat_client.py | 483 | 75 | 84% | 382, 387–388, 390–391, 394, 397, 399, 404, 665–666, 668, 671, 674, 677–682, 685, 687, 695, 707–709, 713, 716–717, 725–728, 738, 746–749, 751–752, 754–755, 762, 770–771, 779–780, 785–786, 790–797, 802, 805, 813, 819, 827–829, 832, 854–855, 984, 1012, 1027, 1148, 1174, 1183, 1192, 1325
   _client.py | 193 | 11 | 94% | 360, 362, 410, 438–441, 484, 519, 521, 597
packages/copilotstudio/agent_framework_copilotstudio
   _agent.py | 83 | 5 | 93% | 155–156, 191, 199, 316
packages/core/agent_framework
   _agents.py | 332 | 52 | 84% | 471, 897, 955, 960, 977–978, 980–981, 984, 987–991, 994, 997–999, 1002, 1009, 1053–1055, 1168, 1209, 1211, 1220–1225, 1231, 1233, 1243–1244, 1251, 1253–1254, 1262–1266, 1274–1275, 1277, 1282, 1284, 1318, 1358, 1378
   _clients.py | 52 | 5 | 90% | 294, 318–319, 495, 497
   _middleware.py | 383 | 12 | 96% | 761, 763, 845, 861, 891, 910, 1043, 1058, 1060, 1299, 1448, 1527
   _serialization.py | 105 | 4 | 96% | 516, 532, 542, 610
   _tools.py | 781 | 78 | 90% | 229, 275, 326, 328, 356, 526, 561–562, 664, 666, 686, 704, 718, 730, 735, 737, 744, 777, 833–835, 876, 901–910, 916–925, 961, 969, 1210, 1415, 1503, 1507, 1590–1594, 1614, 1616–1617, 1722, 1759, 1761, 1773, 1775, 1840, 1867, 1921, 2000, 2189, 2211–2212, 2274–2275, 2297–2298, 2320–2325
   _types.py | 1113 | 103 | 90% | 86, 109–110, 164, 169, 188, 190, 194, 198, 200, 202, 204, 222, 226, 252, 274, 279, 284, 288, 314, 318, 664–665, 1036, 1098, 1115, 1133, 1138, 1156, 1166, 1183–1184, 1186, 1204–1205, 1207, 1214–1215, 1217, 1252, 1263–1264, 1266, 1304, 1549, 1554, 1558, 1562, 1748, 1757, 1767, 1812, 1855–1860, 1882, 1887, 2182, 2288, 2297, 2445, 2658, 2662, 2673, 2675, 2680, 2735, 2832–2834, 2873, 2965, 2992, 3001, 3247–3249, 3252–3254, 3258, 3263, 3267, 3379–3381, 3409, 3463, 3467–3469, 3471, 3482–3483, 3486–3490, 3496
   exceptions.py | 48 | 0 | 100% |
   observability.py | 594 | 82 | 86% | 331, 333–335, 338–340, 345–346, 352–353, 359–360, 367, 369–371, 374–376, 381–382, 388–389, 395–396, 403, 659, 662, 670–671, 674–677, 679, 682–684, 687–688, 716, 718, 729–731, 733–736, 740, 748, 849, 851, 1000, 1002, 1006–1011, 1013, 1016–1020, 1022, 1135–1136, 1138, 1156, 1303, 1321, 1454–1456, 1515, 1687, 1841, 1843
packages/core/agent_framework/_workflows
   _agent.py | 284 | 45 | 84% | 62, 70–76, 104–105, 297, 355, 369, 382, 431–434, 440, 446, 450–451, 454–460, 464–465, 534, 541, 547–548, 559, 591, 598, 619, 628, 632, 634–636, 643
   _agent_executor.py | 170 | 23 | 86% | 94, 116, 150, 166–167, 218–219, 221–222, 254–256, 264–266, 276–278, 280, 284, 288, 292–293
   _handoff.py | 382 | 57 | 85% | 110–111, 113, 142–143, 163–173, 175, 177, 179, 184, 284, 338, 363, 389, 397–398, 412, 461–462, 492, 539–541, 724, 731, 736, 823, 826, 835–838, 848, 853, 860, 866–869, 904, 909, 1106, 1109, 1117, 1135, 1142, 1217
   _workflow.py | 251 | 17 | 93% | 88, 258–260, 262–263, 281, 309, 410, 678, 712, 717, 720, 739–741, 806
packages/core/agent_framework/azure
   _chat_client.py | 79 | 4 | 94% | 301, 303, 316–317
   _responses_client.py | 37 | 6 | 83% | 146, 169, 198–201
packages/core/agent_framework/openai
   _assistants_client.py | 275 | 35 | 87% | 359, 361, 363, 366, 370–371, 374, 377, 382–383, 385, 388–390, 395, 406, 431, 433, 435, 437, 439, 444, 447, 450, 454, 465, 550, 635, 670, 707–710, 762, 779
   _chat_client.py | 267 | 21 | 92% | 180–181, 185, 298, 305, 386–393, 395–398, 408, 493, 530, 546
   _responses_client.py | 562 | 62 | 88% | 277–278, 283, 314, 322, 345, 407, 439, 464, 470, 488–489, 511, 516, 572, 586, 603, 616, 671, 750, 755, 759–761, 765–766, 789, 858, 880–881, 896–897, 915–916, 1047–1048, 1064, 1066, 1141–1149, 1197, 1252, 1267, 1303–1304, 1306–1308, 1322–1324, 1334–1335, 1341, 1356
   _shared.py | 135 | 16 | 88% | 63, 69–72, 151, 153, 155, 162, 164, 177, 253, 277, 341–342, 344
TOTAL | 16441 | 2087 | 87% |

Python Unit Test Overview

Tests | Skipped | Failures | Errors | Time
3702 | 225 💤 | 0 ❌ | 0 🔥 | 1m 8s ⏱️

@eavanvalkenburg changed the title from "Python: [BREAKING} Python single response" to "Python: [BREAKING] Moved to a single get_response and run API" Jan 23, 2026
@eavanvalkenburg force-pushed the python_single_response branch 4 times, most recently from 32f0473 to 5c78d91 January 30, 2026 05:03
@eavanvalkenburg eavanvalkenburg requested a review from a team as a code owner January 30, 2026 16:25
self,
messages: str | ChatMessage | Sequence[str | ChatMessage] | None = None,
*,
stream: Literal[False] = ...,

Is the ... required?
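
For context on this question: in @overload stubs, `= ...` is the usual way to say "this parameter has a default" without repeating its value, which lives on the implementation. Without it, type checkers treat stream as required in that overload, so a call that omits stream would not match it. A minimal sketch of the pattern follows, using a hypothetical SketchChatClient that returns plain strings instead of the framework's response types.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator, Awaitable
from typing import Literal, overload


class SketchChatClient:
    """Hypothetical client; only the overload pattern is the point here."""

    @overload
    def get_response(self, prompt: str, *, stream: Literal[False] = ...) -> Awaitable[str]: ...

    @overload
    def get_response(self, prompt: str, *, stream: Literal[True]) -> AsyncIterator[str]: ...

    def get_response(self, prompt: str, *, stream: bool = False) -> Awaitable[str] | AsyncIterator[str]:
        # The real default value (False) is declared only on the implementation.
        return self._stream(prompt) if stream else self._complete(prompt)

    async def _complete(self, prompt: str) -> str:
        return prompt.upper()

    async def _stream(self, prompt: str) -> AsyncIterator[str]:
        for char in prompt.upper():
            yield char


async def demo() -> tuple[str, list[str]]:
    client = SketchChatClient()
    whole = await client.get_response("ok")  # matches the Literal[False] overload
    pieces = [c async for c in client.get_response("ok", stream=True)]
    return whole, pieces


if __name__ == "__main__":
    print(asyncio.run(demo()))
```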

return None

return response
return response # type: ignore[return-value,no-any-return]

Question: is this because of the generic parameter?

response = ChatResponse.from_chat_response_updates(all_updates)
attributes = _get_response_attributes(attributes, response, duration=duration)
_capture_response(
duration = perf_counter() - start_time_stamp

We don't need to capture the duration in the span? The gen_ai.client.operation.duration is a metrics attribute.

Comment on lines +1154 to +1158
def _close_span() -> None:
    if span_state["closed"]:
        return
    span_state["closed"] = True
    span_cm.__exit__(None, None, None)

Is this needed?
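
One way to read the snippet under review: the flag makes closing idempotent, which matters only if more than one code path (for example a teardown hook and an error handler) can call _close_span. The sketch below illustrates that behavior with a hypothetical CountingSpan context manager and a make_close_span helper; neither is part of the PR or of OpenTelemetry.

```python
class CountingSpan:
    """Hypothetical context manager that counts how often it is exited."""

    def __init__(self) -> None:
        self.exits = 0

    def __enter__(self) -> "CountingSpan":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.exits += 1
        return False


def make_close_span(span_cm: CountingSpan):
    span_state = {"closed": False}

    def _close_span() -> None:
        # Second and later calls are no-ops, so the span is exited exactly once.
        if span_state["closed"]:
            return
        span_state["closed"] = True
        span_cm.__exit__(None, None, None)

    return _close_span


def demo() -> int:
    span = CountingSpan()
    span.__enter__()
    close = make_close_span(span)
    close()  # e.g. called from a teardown hook
    close()  # e.g. called again from an error path
    return span.exits


if __name__ == "__main__":
    print(demo())
```

If only a single path can ever reach the close call, the guard is redundant and could be dropped.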

@markwallace-microsoft added the .NET, workflows (Related to Workflows in agent-framework), and lab (Agent Framework Lab) labels Feb 1, 2026
@github-actions bot changed the title from "Python: [BREAKING] Moved to a single get_response and run API" to ".NET: Python: [BREAKING] Moved to a single get_response and run API" Feb 1, 2026
eavanvalkenburg and others added 26 commits February 1, 2026 15:54
- Add @overload decorators to AgentProtocol.run() for type compatibility
- Add missing docstring params (middleware, function_invocation_configuration)
- Fix TODO format (TD002) by adding author tags
- Fix broken observability tests from upstream:
  - Replace non-existent use_instrumentation with direct instantiation
  - Replace non-existent use_agent_instrumentation with AgentTelemetryLayer mixin
  - Fix get_streaming_response to use get_response(stream=True)
  - Add AgentInitializationError import
  - Update streaming exception tests to match actual behavior
- Replace non-existent AgentExecutionException with AgentRunException
- Add 'tests' to pythonpath in ag-ui pyproject.toml for utils_test_ag_ui import
- Replace deprecated asyncio.get_event_loop().run_until_complete with asyncio.run
- Update _prepare_options patching to use correct class path
- Fix test_to_azure_ai_agent_tools_web_search_missing_connection to clear env vars
- Move test utilities to conftest.py for proper pytest discovery
- Update all test imports to use conftest instead of utils_test_ag_ui
- Remove old utils_test_ag_ui.py file
- Revert pythonpath change in pyproject.toml
- Renamed BareChatClient to BaseChatClient (abstract base class)
- Renamed BareOpenAIChatClient to RawOpenAIChatClient
- Renamed BareOpenAIResponsesClient to RawOpenAIResponsesClient
- Renamed BareAzureAIClient to RawAzureAIClient
- Added warning docstrings to Raw* classes about layer ordering
- Updated README in samples/getting_started/agents/custom with layer docs
- Added test for span ordering with function calling
This ensures each inner LLM call gets its own telemetry span, resulting in
the correct span sequence: chat -> execute_tool -> chat

Updated all production clients and test mocks to use correct ordering:
- ChatMiddlewareLayer (first)
- FunctionInvocationLayer (second)
- ChatTelemetryLayer (third)
- BaseChatClient/Raw...Client (fourth)
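
The effect of this ordering on span sequence can be sketched with cooperative super() calls. All class names below are hypothetical stand-ins, the "model" is a stub that asks for a tool once, and spans are recorded as plain strings rather than real telemetry.

```python
class SketchBaseClient:
    """Stub model: requests a tool until a tool result is in the messages."""

    def __init__(self) -> None:
        self.spans: list[str] = []

    def get_response(self, messages: list[str]) -> str:
        if not any(m.startswith("tool:") for m in messages):
            return "call_tool"
        return "final answer"


class SketchTelemetryLayer:
    def get_response(self, messages: list[str]) -> str:
        # One "chat" span per call that reaches this layer.
        self.spans.append("chat")
        return super().get_response(messages)


class SketchFunctionInvocationLayer:
    def get_response(self, messages: list[str]) -> str:
        messages = list(messages)
        while True:
            response = super().get_response(messages)
            if response != "call_tool":
                return response
            self.spans.append("execute_tool")
            messages.append("tool: 42")


class SketchMiddlewareLayer:
    def get_response(self, messages: list[str]) -> str:
        return super().get_response(messages)


# Ordering from the commit message: middleware, function invocation, telemetry, base.
class OrderedClient(
    SketchMiddlewareLayer, SketchFunctionInvocationLayer, SketchTelemetryLayer, SketchBaseClient
):
    pass


# Telemetry placed outside function invocation, for contrast.
class MisorderedClient(
    SketchMiddlewareLayer, SketchTelemetryLayer, SketchFunctionInvocationLayer, SketchBaseClient
):
    pass


def span_order(client_cls: type) -> list[str]:
    client = client_cls()
    client.get_response(["hello"])
    return client.spans


if __name__ == "__main__":
    print(span_order(OrderedClient))
    print(span_order(MisorderedClient))
```

With telemetry below function invocation, every inner model call passes through the telemetry layer, so the sketch records chat, execute_tool, chat. With telemetry above it, the whole tool loop is covered by a single chat span.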
…ages

- Updated declarative workflows to use agent.run(stream=True)
- Updated devui executor and discovery to use run() method
- Updated durabletask entities to use run(stream=True)
- Fixed lint errors and test updates
- Updated _update_conversation_id to also update options dict
- Use mutable_options copy for proper propagation between loop iterations
- Fixes Assistants client thread_id not found during function invocation
…3509)

* Added ClaudeAgent implementation

* Updated streaming logic

* Small updates

* Small update

* Fixes

* Small fix

* Naming improvements

* Updated imports

* Addressed comments

* Updated package versions
@eavanvalkenburg changed the title from ".NET: Python: [BREAKING] Moved to a single get_response and run API" to "Python: [BREAKING] Moved to a single get_response and run API" Feb 1, 2026
Labels

documentation (Improvements or additions to documentation), lab (Agent Framework Lab), python, workflows (Related to Workflows in agent-framework)

4 participants