Skip to content

Conversation

@simorenoh
Copy link
Member

Our most recent beta release has some fixes we should include as a stable version release. This PR serves to show the difference between the previous version (4.14.2) and this one with the bugfixes we want to release to a stable version.

dibahlfi and others added 3 commits December 2, 2025 14:23
* change workloads based on feedback

* add staging yml file

* add staging yml file

* test_none_options

* test_none_options

* update changelog

* revert unrelated file

* react to copilot comments
@simorenoh simorenoh requested a review from a team as a code owner December 2, 2025 19:47
Copilot AI review requested due to automatic review settings December 2, 2025 19:47
@simorenoh simorenoh marked this pull request as draft December 2, 2025 19:48
@github-actions github-actions bot added the Cosmos label Dec 2, 2025
@simorenoh
Copy link
Member Author

/azp run python - cosmos - tests

@azure-pipelines
Copy link

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

Copilot finished reviewing on behalf of simorenoh December 2, 2025 19:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This hotfix release (version 4.14.3) backports critical bugfixes from a beta release to stable. The PR addresses two main issues: improper timeout enforcement across client operations and errors when passing None values for optional parameters in various container methods. The timeout fix introduces operation-scoped timeout tracking to distinguish between cumulative timeouts (for multi-request operations like read_items) and per-page timeouts (for paginated queries), ensuring timeouts are enforced correctly throughout the request lifecycle. The None options fix modifies the valid_key_value_exist utility function to automatically remove None-valued parameters from kwargs, preventing them from causing errors downstream.

Key changes include:

  • Comprehensive timeout tracking with OperationStartTime and TimeoutScope constants to properly track elapsed time across retry attempts and multiple requests
  • Modified exception handling to chain inner exceptions when raising CosmosClientTimeoutError for better error diagnostics
  • Enhanced test coverage with new test files test_none_options.py and test_none_options_async.py to validate None parameter handling across all container methods

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 21 comments.

Show a summary per file
File Description
azure/cosmos/_version.py Version bump from 4.14.2 to 4.14.3
azure/cosmos/exceptions.py Added optional message parameter to CosmosClientTimeoutError constructor
azure/cosmos/_constants.py Added TimeoutScope enum and OperationStartTime constant for timeout tracking
azure/cosmos/_utils.py Enhanced valid_key_value_exist to remove None values from kwargs
azure/cosmos/_base.py Added OperationStartTime initialization in build_options
azure/cosmos/_retry_utility.py Refactored timeout checking to use operation start time and chain inner exceptions
azure/cosmos/aio/_retry_utility_async.py Async version of retry utility timeout refactoring
azure/cosmos/_query_iterable.py Added timeout checking before fetching next page with TimeoutScope awareness
azure/cosmos/aio/_query_iterable_async.py Async version of query iterable timeout checking
azure/cosmos/container.py Enhanced _get_properties_with_options to pass timeout and OperationStartTime; added TimeoutScope.OPERATION for read_items
azure/cosmos/aio/_container.py Async version of container timeout propagation changes
azure/cosmos/_execution_context/* Updated callback signatures to accept kwargs for timeout propagation
azure/cosmos/_synchronized_request.py Added OperationStartTime cleanup from kwargs
azure/cosmos/aio/_asynchronous_request.py Async version of OperationStartTime cleanup
tests/test_none_options.py New comprehensive test suite for None parameter handling (sync)
tests/test_none_options_async.py New comprehensive test suite for None parameter handling (async)
tests/test_crud.py Added extensive timeout tests including connection errors, throttling, and inner exception validation
tests/test_crud_async.py Async version of timeout tests with transport mocking
tests/test_read_items.py Removed unused imports (CosmosDict, CaseInsensitiveDict)
tests/test_changefeed_partition_key_variation*.py Fixed function signature to accept **kwargs
tests/workloads/run_workloads.sh Removed blocking wait command
docs/TimeoutAndRetriesConfig.md Updated documentation to clarify timeout applies per-operation not per-request
CHANGELOG.md Added 4.14.3 release notes documenting both bugfixes

populate_index_metrics=None, populate_query_metrics=None, priority=None,
response_hook=None, session_token=None, throughput_bucket=None)
_items = [doc async for doc in pager]
assert _items == _items
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion assert _items == _items is a tautology that always passes and provides no actual test validation. Consider replacing with assert isinstance(_items, list) to at least verify the result type.

Suggested change
assert _items == _items
assert isinstance(_items, list)

Copilot uses AI. Check for mistakes.

# Check timeout before fetching next block
if timeout:
elapsed = time.time() - self._options.get(_Constants.OperationStartTime)
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If OperationStartTime is not set in _options, self._options.get(_Constants.OperationStartTime) will return None, causing a TypeError when subtracting from time.time(). This can happen if TimeoutScope is OPERATION and the operation start time was never initialized. Add a default value: self._options.get(_Constants.OperationStartTime, time.time()).

Suggested change
elapsed = time.time() - self._options.get(_Constants.OperationStartTime)
elapsed = time.time() - self._options.get(_Constants.OperationStartTime, time.time())

Copilot uses AI. Check for mistakes.
single_partition_key = 0

large_string = 'a' * 1000 # 1KB string
for i in range(200): # Insert 500 documents
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment incorrectly states "Insert 500 documents" but the loop range is 200. Should be "Insert 200 documents".

Suggested change
for i in range(200): # Insert 500 documents
for i in range(200): # Insert 200 documents

Copilot uses AI. Check for mistakes.
pre_trigger_include=None, post_trigger_include=None,
session_token=None, priority=None,
throughput_bucket=None)
assert any(r.get("resourceBody").get("id") == id1 for r in batch_result) or any(r.get("resourceBody").get("id") == id2 for r in batch_result)
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion logic is overly complex and hard to understand. Consider simplifying to: assert any(r.get("resourceBody", {}).get("id") in [id1, id2] for r in batch_result) which is clearer and also includes a safe default for missing "resourceBody" keys.

Suggested change
assert any(r.get("resourceBody").get("id") == id1 for r in batch_result) or any(r.get("resourceBody").get("id") == id2 for r in batch_result)
assert any(r.get("resourceBody", {}).get("id") in [id1, id2] for r in batch_result)

Copilot uses AI. Check for mistakes.
async def test_list_conflicts_none_options_async(self):
pager = self.container.list_conflicts(max_item_count=None, response_hook=None)
conflicts = [c async for c in pager]
assert conflicts == conflicts # simple sanity (may be empty)
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion assert conflicts == conflicts is a tautology that always passes and provides no actual test validation. Consider replacing with assert isinstance(conflicts, list) to at least verify the result type.

Suggested change
assert conflicts == conflicts # simple sanity (may be empty)
assert isinstance(conflicts, list) # simple sanity (may be empty)

Copilot uses AI. Check for mistakes.
pager = self.container.query_conflicts("SELECT * FROM c", parameters=None, partition_key=None,
max_item_count=None, response_hook=None, enable_cross_partition_query=True)
conflicts = list(pager)
assert conflicts == conflicts
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion assert conflicts == conflicts is a tautology that always passes and provides no actual test validation. Consider replacing with assert isinstance(conflicts, list) to at least verify the result type.

Suggested change
assert conflicts == conflicts
assert isinstance(conflicts, list)

Copilot uses AI. Check for mistakes.
Comment on lines +1116 to +1139
async def test_timeout_for_point_operation_async(self):
"""Test that point operations respect client timeout"""

# Create a container for testing
created_container = await self.database_for_test.create_container(
id='point_op_timeout_container_' + str(uuid.uuid4()),
partition_key=PartitionKey(path="/pk")
)

# Create a test item
test_item = {
'id': 'test_item_1',
'pk': 'partition1',
'data': 'test_data'
}
await created_container.create_item(test_item)

# Long timeout should succeed
result = await created_container.read_item(
item='test_item_1',
partition_key='partition1',
timeout=1.0 # 1 second timeout
)
self.assertEqual(result['id'], 'test_item_1')
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test creates a container but never cleans it up. Consider adding cleanup logic (e.g., await self.database_for_test.delete_container(created_container.id) at the end of the test or in a finally block) to avoid resource leaks in the test environment.

Copilot uses AI. Check for mistakes.
Comment on lines +1436 to +1467
def test_timeout_for_point_operation(self):
"""Test that point operations respect client timeout"""

# Create a container for testing
created_container = self.databaseForTest.create_container(
id='point_op_timeout_container_' + str(uuid.uuid4()),
partition_key=PartitionKey(path="/pk")
)

# Create a test item
test_item = {
'id': 'test_item_1',
'pk': 'partition1',
'data': 'test_data'
}
created_container.create_item(test_item)

# Test 1: Short timeout should fail
with self.assertRaises(exceptions.CosmosClientTimeoutError):
created_container.read_item(
item='test_item_1',
partition_key='partition1',
timeout=0.00000002 # very small timeout to force failure
)

# Test 2: Long timeout should succeed
result = created_container.read_item(
item='test_item_1',
partition_key='partition1',
timeout=3.0
)
self.assertEqual(result['id'], 'test_item_1')
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test creates a container but never cleans it up. Consider adding cleanup logic (e.g., self.databaseForTest.delete_container(created_container.id) at the end of the test or in a finally block) to avoid resource leaks in the test environment.

Copilot uses AI. Check for mistakes.
# Test 2: Timeout too short should fail
item_pages_short_timeout = container_with_delay.query_items(
query="SELECT * FROM c",
#enable_cross_partition_query=True,
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented out enable_cross_partition_query=True should either be removed or uncommented with an explanation. Leaving commented code without context makes the code harder to maintain.

Copilot uses AI. Check for mistakes.
@simorenoh
Copy link
Member Author

/azp run python - cosmos - tests

@azure-pipelines
Copy link

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants