Python: [BREAKING] Moved to a single get_response and run API #3379
Conversation
Pull request overview
This PR consolidates the Python Agent Framework's streaming and non-streaming APIs into a unified interface. The primary changes include:
Changes:
- Unified `run()` and `get_response()` methods with a `stream` parameter, replacing the separate `run_stream()` and `get_streaming_response()` methods
- Migration from decorator-based (`@use_instrumentation`, `@use_function_invocation`) to mixin-based architecture for telemetry and function invocation
- Introduction of a `ResponseStream` class for unified stream handling with hooks, finalizers, and teardown support
- Renamed `AgentExecutionException` to `AgentRunException`
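A minimal sketch of the consolidated call shape (illustrative names and bodies, not the framework's actual implementation):

```python
import asyncio
from collections.abc import AsyncIterator, Coroutine
from typing import Any, Literal, overload


class EchoAgent:
    """Toy stand-in for the unified run(stream=...) API shape."""

    @overload
    def run(self, message: str, *, stream: Literal[False] = ...) -> Coroutine[Any, Any, str]: ...

    @overload
    def run(self, message: str, *, stream: Literal[True]) -> AsyncIterator[str]: ...

    def run(self, message: str, *, stream: bool = False) -> Any:
        # One entry point: the stream flag selects the return shape.
        return self._stream(message) if stream else self._respond(message)

    async def _respond(self, message: str) -> str:
        return f"echo: {message}"

    async def _stream(self, message: str) -> AsyncIterator[str]:
        for token in message.split():
            yield token
```

Callers then write `await agent.run(msg)` or `async for update in agent.run(msg, stream=True)`, which matches the migration applied across the tests and samples below.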
Reviewed changes
Copilot reviewed 84 out of 85 changed files in this pull request and generated 28 comments.
| File | Description |
|---|---|
| `_types.py` | Added `ResponseStream` class for unified streaming, updated `prepare_messages` to handle `None` |
| `_clients.py` | Refactored `BaseChatClient` with unified `get_response()` method, introduced `FunctionInvokingChatClient` mixin |
| `openai/_responses_client.py` | Consolidated streaming/non-streaming into single `_inner_get_response()` method |
| `openai/_chat_client.py` | Similar consolidation for chat completions API |
| `openai/_assistants_client.py` | Unified assistants API with `stream` parameter |
| `_workflows/_workflow.py` | Consolidated `run()` and `run_stream()` into single `run(stream=bool)` method |
| `_workflows/_agent.py` | Updated `WorkflowAgent.run()` to use `stream` parameter |
| Test files (multiple) | Updated all tests to use `run(stream=True)` and `get_response(stream=True)` |
| Sample files (multiple) | Updated samples to demonstrate new unified API |
| Provider clients | Updated all provider implementations (Azure, Anthropic, Bedrock, Ollama, etc.) to use mixins |
```python
self,
messages: str | ChatMessage | Sequence[str | ChatMessage] | None = None,
*,
stream: Literal[False] = ...,
```
Is the ... required?
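For context on the question: in a `typing.overload` stub, `= ...` is the conventional way to mark a parameter as having a default without restating its value; the actual default lives on the implementation. A self-contained illustration with made-up names:

```python
from typing import Literal, overload


@overload
def get_text(*, stream: Literal[False] = ...) -> str: ...
@overload
def get_text(*, stream: Literal[True]) -> list[str]: ...
def get_text(*, stream: bool = False) -> "str | list[str]":
    # The implementation carries the real default (False); the `...` in the
    # first overload only tells the type checker the argument may be omitted.
    return ["chunk-1", "chunk-2"] if stream else "full text"
```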
```diff
  return None

- return response
+ return response  # type: ignore[return-value,no-any-return]
```
Question: is this because of the generic parameter?
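One plausible answer, sketched with invented names: when a value flows through `Any` on its way back out of a generic method, mypy cannot prove it matches the type parameter, so `return-value`/`no-any-return` ignores show up even though the runtime behavior is correct.

```python
from typing import Any, TypeVar

T = TypeVar("T")


def finalize(raw: Any, into: type[T]) -> T:
    # Statically, `raw` is Any, so mypy cannot verify it is a T
    # (no-any-return / return-value); at runtime we trust the producer.
    return raw  # type: ignore[no-any-return]
```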
```python
response = ChatResponse.from_chat_response_updates(all_updates)
attributes = _get_response_attributes(attributes, response, duration=duration)
_capture_response(
duration = perf_counter() - start_time_stamp
```
Do we need to capture the duration in the span? `gen_ai.client.operation.duration` is a metrics attribute.
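A stand-in sketch of the metrics-side recording the comment refers to (pure Python, no OpenTelemetry dependency; real code would record to a `gen_ai.client.operation.duration` histogram via a meter rather than a list):

```python
from time import perf_counter

# Toy histogram: a real implementation would use an OpenTelemetry meter.
recorded_durations: list[float] = []


def record_operation_duration(start_time_stamp: float) -> float:
    duration = perf_counter() - start_time_stamp
    # Emit as a metric data point instead of duplicating it on the span.
    recorded_durations.append(duration)
    return duration
```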
```python
def _close_span() -> None:
    if span_state["closed"]:
        return
    span_state["closed"] = True
    span_cm.__exit__(None, None, None)
```
Is this needed?
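One plausible reason for the guard, shown runnably with a fake span: if both a finalizer and a teardown path can call the close, the flag makes the second call a no-op so the span context manager exits exactly once.

```python
class FakeSpanCM:
    """Stand-in for the real span context manager."""

    def __init__(self) -> None:
        self.exit_count = 0

    def __enter__(self) -> "FakeSpanCM":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.exit_count += 1


span_cm = FakeSpanCM()
span_cm.__enter__()
span_state = {"closed": False}


def _close_span() -> None:
    if span_state["closed"]:
        return
    span_state["closed"] = True
    span_cm.__exit__(None, None, None)


_close_span()
_close_span()  # second call is a no-op thanks to the guard
```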
- Add `@overload` decorators to `AgentProtocol.run()` for type compatibility
- Add missing docstring params (`middleware`, `function_invocation_configuration`)
- Fix TODO format (TD002) by adding author tags
- Fix broken observability tests from upstream:
  - Replace non-existent `use_instrumentation` with direct instantiation
  - Replace non-existent `use_agent_instrumentation` with the `AgentTelemetryLayer` mixin
  - Fix `get_streaming_response` to use `get_response(stream=True)`
  - Add `AgentInitializationError` import
  - Update streaming exception tests to match actual behavior
- Replace non-existent `AgentExecutionException` with `AgentRunException`
- Add `tests` to pythonpath in ag-ui `pyproject.toml` for the `utils_test_ag_ui` import
- Replace deprecated `asyncio.get_event_loop().run_until_complete` with `asyncio.run`
- Update `_prepare_options` patching to use the correct class path
- Fix `test_to_azure_ai_agent_tools_web_search_missing_connection` to clear env vars
- Move test utilities to `conftest.py` for proper pytest discovery
- Update all test imports to use conftest instead of `utils_test_ag_ui`
- Remove old `utils_test_ag_ui.py` file
- Revert pythonpath change in `pyproject.toml`
- Renamed `BareChatClient` to `BaseChatClient` (abstract base class)
- Renamed `BareOpenAIChatClient` to `RawOpenAIChatClient`
- Renamed `BareOpenAIResponsesClient` to `RawOpenAIResponsesClient`
- Renamed `BareAzureAIClient` to `RawAzureAIClient`
- Added warning docstrings to `Raw*` classes about layer ordering
- Updated README in `samples/getting_started/agents/custom` with layer docs
- Added a test for span ordering with function calling

This ensures each inner LLM call gets its own telemetry span, resulting in the correct span sequence: chat -> execute_tool -> chat. Updated all production clients and test mocks to use the correct ordering:

- `ChatMiddlewareLayer` (first)
- `FunctionInvocationLayer` (second)
- `ChatTelemetryLayer` (third)
- `BaseChatClient`/`Raw...Client` (fourth)
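That layer ordering falls out of Python's MRO for cooperative mixins. A runnable sketch with stubbed-out layers (the real classes do actual work; these only record the call order):

```python
calls: list[str] = []


class BaseChatClient:
    def get_response(self) -> None:
        calls.append("base")  # innermost: the raw LLM call


class ChatTelemetryLayer(BaseChatClient):
    def get_response(self) -> None:
        calls.append("telemetry")  # wraps each inner call in a span
        super().get_response()


class FunctionInvocationLayer(BaseChatClient):
    def get_response(self) -> None:
        calls.append("function_invocation")  # may loop over tool calls
        super().get_response()


class ChatMiddlewareLayer(BaseChatClient):
    def get_response(self) -> None:
        calls.append("middleware")  # outermost user hooks
        super().get_response()


# Base order: middleware -> function invocation -> telemetry -> base client.
class ChatClient(ChatMiddlewareLayer, FunctionInvocationLayer, ChatTelemetryLayer):
    pass
```

With this base order, telemetry sits inside the function-invocation loop, which is what gives each inner LLM call its own span.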
…ages

- Updated declarative workflows to use `agent.run(stream=True)`
- Updated devui executor and discovery to use the `run()` method
- Updated durabletask entities to use `run(stream=True)`
- Fixed lint errors and test updates

- Updated `_update_conversation_id` to also update the options dict
- Use a `mutable_options` copy for proper propagation between loop iterations
- Fixes Assistants client `thread_id` not found during function invocation

…3509)

* Added ClaudeAgent implementation
* Updated streaming logic
* Small updates
* Small update
* Fixes
* Small fix
* Naming improvements
* Updated imports
* Addressed comments
* Updated package versions
Motivation and Context
Summary
`ResponseStream[ChatResponseUpdate, ChatResponse]` to `ResponseStream[AgentResponseUpdate, AgentResponse]`, but the object has a classmethod called `wrap` that is used to wrap the `ResponseStream` from the chat client into the new `ResponseStream` in the Agent.

Description
Contribution Checklist