Python: Fix Cosmos DB for MongoDB vector index kind #14105
Open
EsraaKamel11 wants to merge 1 commit into
CosmosMongoCollection._get_index_definitions set cosmosSearchOptions["kind"] from DISTANCE_FUNCTION_MAP_MONGODB (a similarity code, e.g. "COS") instead of INDEX_KIND_MAP_MONGODB (the index kind, e.g. "vector-hnsw"). As a result the created vector index used an invalid kind, "kind" equalled "similarity", and the HNSW/IVF/DiskANN tuning options (m, efConstruction, numList, maxDegree, lBuild) were silently dropped because the `match index_kind` block could never match. Use INDEX_KIND_MAP_MONGODB[field.index_kind] for the index kind and keep the distance-function map for "similarity". Update the existing index test (which asserted the buggy "COS" value) and add a test covering the tuning options.
Pull request overview
Fixes Azure Cosmos DB for MongoDB vector index creation by correctly populating cosmosSearchOptions["kind"] from the index-kind mapping (e.g., vector-hnsw) while keeping cosmosSearchOptions["similarity"] sourced from the distance-function mapping (e.g., COS). This aligns the MongoDB vCore path with the existing NoSQL path and ensures vector index tuning options are applied.
Changes:
- Corrected `_get_index_definitions` to use `INDEX_KIND_MAP_MONGODB[field.index_kind]` for `cosmosSearchOptions["kind"]`.
- Updated the existing unit test to assert the correct `kind`/`similarity` values.
- Added a unit test verifying HNSW tuning options (`m`, `efConstruction`) are forwarded into `cosmosSearchOptions`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| python/semantic_kernel/connectors/azure_cosmos_db.py | Fixes MongoDB vector index kind mapping so index creation uses valid vector-* kinds and tuning options apply. |
| python/tests/unit/connectors/memory/azure_cosmos_db/test_azure_cosmos_db_mongodb_collection.py | Updates/extends tests to validate correct kind vs similarity and verify HNSW tuning options propagation. |
Author
@microsoft-github-policy-service agree
Motivation and Context
`CosmosMongoCollection._get_index_definitions` populated `cosmosSearchOptions["kind"]` from the distance-function map (a similarity code like `"COS"`) instead of the index-kind map. As a result the created vector index used an invalid `kind`, `kind` equalled `similarity`, and all HNSW/IVF/DiskANN tuning options (`m`, `efConstruction`, `numList`, `maxDegree`, `lBuild`) were silently dropped because the `match index_kind` block could never match a `vector-*` case.
Cosmos DB for MongoDB vCore requires `cosmosSearchOptions.kind ∈ {vector-ivf, vector-hnsw, vector-diskann}`, so index creation also fails against a live account.
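The `kind` constraint stated above can be expressed as a small pre-flight check. This is a hedged sketch only: `VALID_VECTOR_KINDS` and `check_search_options` are hypothetical names, and neither the connector nor this PR is known to include such a guard. It simply shows how the buggy `"COS"` value would be rejected before reaching a live account.

```python
# The three index kinds named in this PR as accepted by the service.
VALID_VECTOR_KINDS = frozenset({"vector-ivf", "vector-hnsw", "vector-diskann"})


def check_search_options(options: dict) -> None:
    """Raise ValueError if options["kind"] is not a recognised vector index kind.

    Hypothetical helper for illustration; not part of the connector.
    """
    kind = options.get("kind")
    if kind not in VALID_VECTOR_KINDS:
        raise ValueError(
            f"invalid cosmosSearchOptions kind {kind!r}; "
            f"expected one of {sorted(VALID_VECTOR_KINDS)}"
        )
```

Passing `{"kind": "COS", "similarity": "COS"}` raises `ValueError`, while `{"kind": "vector-hnsw", "similarity": "COS"}` passes silently.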
Fixes #14104
Description
One-line fix: use `INDEX_KIND_MAP_MONGODB[field.index_kind]` for the index `kind`, keeping the distance-function map for `similarity` (mirrors the NoSQL path, which already does this correctly). `INDEX_KIND_MAP_MONGODB`'s value was previously unused.
| | `kind` | `similarity` | Tuning options |
|---|---|---|---|
| Before | `"COS"` | `"COS"` | dropped |
| After | `"vector-hnsw"` | `"COS"` | forwarded |
Tests: updated the existing index test (which asserted the buggy `"COS"` value) and added a test verifying HNSW tuning options (`m`, `efConstruction`) reach `cosmosSearchOptions`.
Contribution Checklist