-
Notifications
You must be signed in to change notification settings - Fork 3.2k
[Cosmos] Hotfix azure cosmos 4.14.3 #44226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* change workloads based on feedback * add staging yml file * add staging yml file * test_none_options * test_none_options * update changelog * revert unrelated file * react to copilot comments
|
/azp run python - cosmos - tests |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This hotfix release (version 4.14.3) backports critical bugfixes from a beta release to stable. The PR addresses two main issues: improper timeout enforcement across client operations and errors when passing None values for optional parameters in various container methods. The timeout fix introduces operation-scoped timeout tracking to distinguish between cumulative timeouts (for multi-request operations like read_items) and per-page timeouts (for paginated queries), ensuring timeouts are enforced correctly throughout the request lifecycle. The None options fix modifies the valid_key_value_exist utility function to automatically remove None-valued parameters from kwargs, preventing them from causing errors downstream.
Key changes include:
- Comprehensive timeout tracking with
OperationStartTimeandTimeoutScopeconstants to properly track elapsed time across retry attempts and multiple requests - Modified exception handling to chain inner exceptions when raising
CosmosClientTimeoutErrorfor better error diagnostics - Enhanced test coverage with new test files
test_none_options.pyandtest_none_options_async.pyto validate None parameter handling across all container methods
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 21 comments.
Show a summary per file
| File | Description |
|---|---|
azure/cosmos/_version.py |
Version bump from 4.14.2 to 4.14.3 |
azure/cosmos/exceptions.py |
Added optional message parameter to CosmosClientTimeoutError constructor |
azure/cosmos/_constants.py |
Added TimeoutScope enum and OperationStartTime constant for timeout tracking |
azure/cosmos/_utils.py |
Enhanced valid_key_value_exist to remove None values from kwargs |
azure/cosmos/_base.py |
Added OperationStartTime initialization in build_options |
azure/cosmos/_retry_utility.py |
Refactored timeout checking to use operation start time and chain inner exceptions |
azure/cosmos/aio/_retry_utility_async.py |
Async version of retry utility timeout refactoring |
azure/cosmos/_query_iterable.py |
Added timeout checking before fetching next page with TimeoutScope awareness |
azure/cosmos/aio/_query_iterable_async.py |
Async version of query iterable timeout checking |
azure/cosmos/container.py |
Enhanced _get_properties_with_options to pass timeout and OperationStartTime; added TimeoutScope.OPERATION for read_items |
azure/cosmos/aio/_container.py |
Async version of container timeout propagation changes |
azure/cosmos/_execution_context/* |
Updated callback signatures to accept kwargs for timeout propagation |
azure/cosmos/_synchronized_request.py |
Added OperationStartTime cleanup from kwargs |
azure/cosmos/aio/_asynchronous_request.py |
Async version of OperationStartTime cleanup |
tests/test_none_options.py |
New comprehensive test suite for None parameter handling (sync) |
tests/test_none_options_async.py |
New comprehensive test suite for None parameter handling (async) |
tests/test_crud.py |
Added extensive timeout tests including connection errors, throttling, and inner exception validation |
tests/test_crud_async.py |
Async version of timeout tests with transport mocking |
tests/test_read_items.py |
Removed unused imports (CosmosDict, CaseInsensitiveDict) |
tests/test_changefeed_partition_key_variation*.py |
Fixed function signature to accept **kwargs |
tests/workloads/run_workloads.sh |
Removed blocking wait command |
docs/TimeoutAndRetriesConfig.md |
Updated documentation to clarify timeout applies per-operation not per-request |
CHANGELOG.md |
Added 4.14.3 release notes documenting both bugfixes |
| populate_index_metrics=None, populate_query_metrics=None, priority=None, | ||
| response_hook=None, session_token=None, throughput_bucket=None) | ||
| _items = [doc async for doc in pager] | ||
| assert _items == _items |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion assert _items == _items is a tautology that always passes and provides no actual test validation. Consider replacing with assert isinstance(_items, list) to at least verify the result type.
| assert _items == _items | |
| assert isinstance(_items, list) |
|
|
||
| # Check timeout before fetching next block | ||
| if timeout: | ||
| elapsed = time.time() - self._options.get(_Constants.OperationStartTime) |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If OperationStartTime is not set in _options, self._options.get(_Constants.OperationStartTime) will return None, causing a TypeError when subtracting from time.time(). This can happen if TimeoutScope is OPERATION and the operation start time was never initialized. Add a default value: self._options.get(_Constants.OperationStartTime, time.time()).
| elapsed = time.time() - self._options.get(_Constants.OperationStartTime) | |
| elapsed = time.time() - self._options.get(_Constants.OperationStartTime, time.time()) |
| single_partition_key = 0 | ||
|
|
||
| large_string = 'a' * 1000 # 1KB string | ||
| for i in range(200): # Insert 500 documents |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment incorrectly states "Insert 500 documents" but the loop range is 200. Should be "Insert 200 documents".
| for i in range(200): # Insert 500 documents | |
| for i in range(200): # Insert 200 documents |
| pre_trigger_include=None, post_trigger_include=None, | ||
| session_token=None, priority=None, | ||
| throughput_bucket=None) | ||
| assert any(r.get("resourceBody").get("id") == id1 for r in batch_result) or any(r.get("resourceBody").get("id") == id2 for r in batch_result) |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion logic is overly complex and hard to understand. Consider simplifying to: assert any(r.get("resourceBody", {}).get("id") in [id1, id2] for r in batch_result) which is clearer and also includes a safe default for missing "resourceBody" keys.
| assert any(r.get("resourceBody").get("id") == id1 for r in batch_result) or any(r.get("resourceBody").get("id") == id2 for r in batch_result) | |
| assert any(r.get("resourceBody", {}).get("id") in [id1, id2] for r in batch_result) |
| async def test_list_conflicts_none_options_async(self): | ||
| pager = self.container.list_conflicts(max_item_count=None, response_hook=None) | ||
| conflicts = [c async for c in pager] | ||
| assert conflicts == conflicts # simple sanity (may be empty) |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion assert conflicts == conflicts is a tautology that always passes and provides no actual test validation. Consider replacing with assert isinstance(conflicts, list) to at least verify the result type.
| assert conflicts == conflicts # simple sanity (may be empty) | |
| assert isinstance(conflicts, list) # simple sanity (may be empty) |
| pager = self.container.query_conflicts("SELECT * FROM c", parameters=None, partition_key=None, | ||
| max_item_count=None, response_hook=None, enable_cross_partition_query=True) | ||
| conflicts = list(pager) | ||
| assert conflicts == conflicts |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion assert conflicts == conflicts is a tautology that always passes and provides no actual test validation. Consider replacing with assert isinstance(conflicts, list) to at least verify the result type.
| assert conflicts == conflicts | |
| assert isinstance(conflicts, list) |
| async def test_timeout_for_point_operation_async(self): | ||
| """Test that point operations respect client timeout""" | ||
|
|
||
| # Create a container for testing | ||
| created_container = await self.database_for_test.create_container( | ||
| id='point_op_timeout_container_' + str(uuid.uuid4()), | ||
| partition_key=PartitionKey(path="/pk") | ||
| ) | ||
|
|
||
| # Create a test item | ||
| test_item = { | ||
| 'id': 'test_item_1', | ||
| 'pk': 'partition1', | ||
| 'data': 'test_data' | ||
| } | ||
| await created_container.create_item(test_item) | ||
|
|
||
| # Long timeout should succeed | ||
| result = await created_container.read_item( | ||
| item='test_item_1', | ||
| partition_key='partition1', | ||
| timeout=1.0 # 1 second timeout | ||
| ) | ||
| self.assertEqual(result['id'], 'test_item_1') |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test creates a container but never cleans it up. Consider adding cleanup logic (e.g., await self.database_for_test.delete_container(created_container.id) at the end of the test or in a finally block) to avoid resource leaks in the test environment.
| def test_timeout_for_point_operation(self): | ||
| """Test that point operations respect client timeout""" | ||
|
|
||
| # Create a container for testing | ||
| created_container = self.databaseForTest.create_container( | ||
| id='point_op_timeout_container_' + str(uuid.uuid4()), | ||
| partition_key=PartitionKey(path="/pk") | ||
| ) | ||
|
|
||
| # Create a test item | ||
| test_item = { | ||
| 'id': 'test_item_1', | ||
| 'pk': 'partition1', | ||
| 'data': 'test_data' | ||
| } | ||
| created_container.create_item(test_item) | ||
|
|
||
| # Test 1: Short timeout should fail | ||
| with self.assertRaises(exceptions.CosmosClientTimeoutError): | ||
| created_container.read_item( | ||
| item='test_item_1', | ||
| partition_key='partition1', | ||
| timeout=0.00000002 # very small timeout to force failure | ||
| ) | ||
|
|
||
| # Test 2: Long timeout should succeed | ||
| result = created_container.read_item( | ||
| item='test_item_1', | ||
| partition_key='partition1', | ||
| timeout=3.0 | ||
| ) | ||
| self.assertEqual(result['id'], 'test_item_1') |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test creates a container but never cleans it up. Consider adding cleanup logic (e.g., self.databaseForTest.delete_container(created_container.id) at the end of the test or in a finally block) to avoid resource leaks in the test environment.
| # Test 2: Timeout too short should fail | ||
| item_pages_short_timeout = container_with_delay.query_items( | ||
| query="SELECT * FROM c", | ||
| #enable_cross_partition_query=True, |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commented out enable_cross_partition_query=True should either be removed or uncommented with an explanation. Leaving commented code without context makes the code harder to maintain.
|
/azp run python - cosmos - tests |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
Our most recent beta release has some fixes we should include as a stable version release. This PR serves to show the difference between the previous version (4.14.2) and this one with the bugfixes we want to release to a stable version.