Python: Bugfix: issue-3364: detect and preserve custom store state#3378
Python: Bugfix: issue-3364: detect and preserve custom store state#3378hombin wants to merge 1 commit intomicrosoft:mainfrom
Conversation
|
@microsoft-github-policy-service agree |
|
Hi @markwallace-microsoft , when you have a moment, could you please help review and approve this PR? |
df149af to
458884b
Compare
| # Check if input has fields beyond what standard ChatMessageStoreState would have | ||
| extra_fields = set(chat_message_store_state.keys()) - standard_fields | ||
| if extra_fields: | ||
| # Custom store with additional fields (e.g., redis_url, thread_id, key_prefix) | ||
| # Preserve as dict to retain all custom configuration | ||
| self.chat_message_store_state = chat_message_store_state | ||
| else: | ||
| # Standard ChatMessageStoreState - convert for backward compatibility | ||
| # This handles both {"messages": []} and {"type": "...", "messages": [...]} |
There was a problem hiding this comment.
The detection logic for custom stores may incorrectly classify stores that only differ in the 'type' field value as standard stores. For example, if a custom store serializes as {"type": "custom_store_state", "messages": []}, it will be treated as a standard store because it has no extra fields beyond 'type' and 'messages', and will be converted to ChatMessageStoreState with type "chat_message_store_state", losing the original custom type information.
To fix this, also check if the 'type' field value (when present) differs from 'chat_message_store_state'. If it does, treat it as a custom store and preserve as dict. For example:
if extra_fields or (
"type" in chat_message_store_state
and chat_message_store_state["type"] != "chat_message_store_state"
):
# Custom store with additional fields or different type
self.chat_message_store_state = chat_message_store_state| # Check if input has fields beyond what standard ChatMessageStoreState would have | |
| extra_fields = set(chat_message_store_state.keys()) - standard_fields | |
| if extra_fields: | |
| # Custom store with additional fields (e.g., redis_url, thread_id, key_prefix) | |
| # Preserve as dict to retain all custom configuration | |
| self.chat_message_store_state = chat_message_store_state | |
| else: | |
| # Standard ChatMessageStoreState - convert for backward compatibility | |
| # This handles both {"messages": []} and {"type": "...", "messages": [...]} | |
| standard_type = standard_state_dict.get("type") | |
| # Check if input has fields beyond what standard ChatMessageStoreState would have | |
| extra_fields = set(chat_message_store_state.keys()) - standard_fields | |
| if extra_fields or ( | |
| "type" in chat_message_store_state | |
| and standard_type is not None | |
| and chat_message_store_state["type"] != standard_type | |
| ): | |
| # Custom store with additional fields or different type (e.g., redis_url, thread_id, key_prefix) | |
| # Preserve as dict to retain all custom configuration | |
| self.chat_message_store_state = chat_message_store_state | |
| else: | |
| # Standard ChatMessageStoreState - convert for backward compatibility | |
| # This handles both {"messages": []} and {"type": standard_type, "messages": [...]} |
| # Determine if this is a custom store that needs to preserve extra fields | ||
| # Standard ChatMessageStoreState has 'type' (from SerializationMixin) and 'messages' | ||
| # Create a temporary instance to get the expected fields dynamically | ||
| temp_state = ChatMessageStoreState() | ||
| standard_state_dict = temp_state.to_dict() | ||
| standard_fields = set(standard_state_dict.keys()) |
There was a problem hiding this comment.
Creating a temporary ChatMessageStoreState instance with no arguments may not be reliable for determining the standard fields. The 'type' field is dynamically added by SerializationMixin.to_dict() and will vary depending on whether the instance has custom type attributes. This could lead to incorrect detection of custom stores.
Instead, you should explicitly check for the known standard fields: {'type', 'messages'}. This is more robust and doesn't depend on runtime object instantiation.
| # Determine if this is a custom store that needs to preserve extra fields | |
| # Standard ChatMessageStoreState has 'type' (from SerializationMixin) and 'messages' | |
| # Create a temporary instance to get the expected fields dynamically | |
| temp_state = ChatMessageStoreState() | |
| standard_state_dict = temp_state.to_dict() | |
| standard_fields = set(standard_state_dict.keys()) | |
| # Determine if this is a custom store that needs to preserve extra fields. | |
| # Standard ChatMessageStoreState has 'type' (from SerializationMixin) and 'messages'. | |
| # Explicitly use the known standard fields instead of inferring them at runtime. | |
| standard_fields = {"type", "messages"} |
| parsed_state = ChatMessageStoreState.from_dict(store_state_dict) | ||
| message_store = ChatMessageStore(messages=parsed_state.messages, **kwargs) | ||
| else: | ||
| raise AgentThreadException( | ||
| "Cannot deserialize custom message store without providing a message_store instance. " | ||
| "Please provide a message_store parameter to deserialize()." | ||
| ) |
There was a problem hiding this comment.
The error message says "Cannot deserialize custom message store without providing a message_store instance" but this check is triggered even for standard stores that have a 'messages' field. The condition should be checking if this is actually a custom store (no 'messages' field) before raising this error.
Consider restructuring the logic to: (1) check if 'messages' exists, if yes create standard ChatMessageStore, (2) if no 'messages' but has 'type' field matching known custom types, raise this error, (3) otherwise raise a generic "Invalid state" error.
| parsed_state = ChatMessageStoreState.from_dict(store_state_dict) | |
| message_store = ChatMessageStore(messages=parsed_state.messages, **kwargs) | |
| else: | |
| raise AgentThreadException( | |
| "Cannot deserialize custom message store without providing a message_store instance. " | |
| "Please provide a message_store parameter to deserialize()." | |
| ) | |
| # Standard ChatMessageStore serialized state | |
| parsed_state = ChatMessageStoreState.from_dict(store_state_dict) | |
| message_store = ChatMessageStore(messages=parsed_state.messages, **kwargs) | |
| elif "type" in store_state_dict: | |
| # Looks like a custom store; require caller-provided message_store instance | |
| raise AgentThreadException( | |
| "Cannot deserialize custom message store without providing a message_store instance. " | |
| "Please provide a message_store parameter to deserialize()." | |
| ) | |
| else: | |
| # Missing both 'messages' and 'type' fields; treat as invalid state | |
| raise AgentThreadException("Invalid chat_message_store_state; missing 'messages' and 'type' fields.") |
| except AgentThreadException: | ||
| raise |
There was a problem hiding this comment.
This re-raises AgentThreadException but then has a catch-all for other exceptions. The problem is that if the code at line 500 (ChatMessageStoreState.from_dict) raises an AgentThreadException, it will be caught by line 512 and re-raised. However, if line 501 (ChatMessageStore construction) fails with an AgentThreadException, it will also be caught and re-raised. This double-catch pattern is confusing and may not preserve the original exception context properly.
Consider removing the re-raise at line 512-513 and letting AgentThreadExceptions propagate naturally, or handle them separately if you need to wrap them with additional context.
| except AgentThreadException: | |
| raise |
| "Cannot create custom message store from state. " | ||
| "Please create a message store first and then call update_from_state()." |
There was a problem hiding this comment.
The same issue exists here as in the deserialize method. The error message "Cannot create custom message store from state" is misleading because this branch is also reached for standard stores that happen to be stored as dicts. The check at line 556 verifies that 'messages' exists, so if it doesn't exist, you're treating it as a custom store error. However, it could also just be an invalid standard store state.
The error messaging should be more precise about what went wrong.
| "Cannot create custom message store from state. " | |
| "Please create a message store first and then call update_from_state()." | |
| "Invalid chat_message_store_state: expected a 'messages' key when no message store " | |
| "instance is configured. If you are using a custom message store, create it first and " | |
| "then call update_from_thread_state()." |
| except AgentThreadException: | ||
| raise | ||
| except Exception as ex: |
There was a problem hiding this comment.
Same maintainability issue as in the deserialize method. The double exception handling with re-raise of AgentThreadException followed by a generic catch-all creates confusing control flow. Consider simplifying this error handling pattern.
| except AgentThreadException: | |
| raise | |
| except Exception as ex: | |
| except Exception as ex: | |
| if isinstance(ex, AgentThreadException): | |
| raise |
| raise AgentThreadException("A thread cannot have both a service_thread_id and a chat_message_store.") | ||
| self.service_thread_id = service_thread_id | ||
| self.chat_message_store_state: ChatMessageStoreState | None = None | ||
| self.chat_message_store_state: MutableMapping[str, Any] | ChatMessageStoreState | None = None |
There was a problem hiding this comment.
The type annotation change from ChatMessageStoreState | None to MutableMapping[str, Any] | ChatMessageStoreState | None is correct and necessary to support custom stores. However, this is a breaking change to the public API since code that accesses chat_message_store_state.messages directly may now fail if it's a dict instead of a ChatMessageStoreState object.
Consider documenting this breaking change more prominently in the PR description and potentially adding a BREAKING prefix to the PR title as mentioned in the contribution checklist.
Motivation and Context
Fix issue-3364: #3364
Description
Intelligently detect and preserve custom store state while maintaining backward compatibility with standard
ChatMessageStore.The key insight is to distinguish between:
typeandmessages[expected] fields → Convert toChatMessageStoreStateobject (existing behavior)redis_url,thread_id) → Keep as dictionary to preserve all dataContribution Checklist
Test Update Required
One test needs updating:
test_init_with_chat_message_store_state_no_messages(line 448 intests/core/test_threads.py)Why: This test uses Redis store data (with custom fields like
thread_id,redis_url) but expects the state to be converted to aChatMessageStoreStateobject with.messagesattribute. With our fix, custom store states are correctly preserved as dictionaries to retain all configuration.