-
Notifications
You must be signed in to change notification settings - Fork 430
Feat: add volcengine embedding support #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for VolcEngine embeddings by introducing an ArkEmbedder implementation, integrating it into the existing factory and configuration system, and providing basic unit tests.
- Implemented
ArkEmbedderwith text, image, and multimodal embedding support. - Registered
ArkEmbedderin the factory and added its config class. - Added initial tests for single and batch text embedding and updated dependencies.
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/embedders/test_ark.py | Added tests for single and batch text embedding |
| src/memos/memories/textual/general.py | Updated imports and type annotation to include ArkEmbedder |
| src/memos/embedders/factory.py | Registered ark backend in EmbedderFactory |
| src/memos/embedders/ark.py | New ArkEmbedder implementation |
| src/memos/configs/embedder.py | Added ArkEmbedderConfig and mapped it to "ark" |
| pyproject.toml | Added volcengine-python-sdk dependency |
Comments suppressed due to low confidence (4)
src/memos/embedders/ark.py:38
- This comment refers to initializing the Ollama client but you are creating an Ark client here. Please update the comment accordingly.
# Initialize ollama client
src/memos/embedders/ark.py:59
- [nitpick] There are no unit tests covering
embed_images. Consider adding tests to verify image embedding functionality.
def embed_images(self, urls: list[str], chunk_size: int | None = None) -> list[list[float]]:
src/memos/embedders/ark.py:56
- [nitpick] The
embed_querymethod is not exercised by existing tests. Consider adding a test to confirm its behavior.
def embed_query(self, text: str) -> list[float]:
src/memos/embedders/ark.py:57
- The method
embed_documentsis not defined onArkEmbedder. You likely meant to callself.embed([text])or implementembed_documents.
return self.embed_documents([text])[0]
src/memos/embedders/ark.py
Outdated
|
|
||
|
|
||
| class ArkEmbedder(BaseEmbedder): | ||
| """Arl Embedder class.""" |
Copilot
AI
Jul 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring on the next line spells "Arl Embedder" instead of "Ark Embedder". Please correct the typo.
| """Arl Embedder class.""" | |
| """Ark Embedder class.""" |
Ki-Seki
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the project continues to expand, its dependencies have become increasingly cumbersome, leading to numerous dependency conflicts. Would you consider implementing VolcEngine's Embedding support by directly using their API documentation (https://www.volcengine.com/docs/84313/1254554) instead of their Python SDK? This approach could significantly reduce dependencies.
Additionally, GitHub Copilot suggested a minor typo fix.
Lastly, please merge the dev branch or perform a rebase, as the current branch has fallen behind.
随着项目规模不断扩大,其依赖项变得日益臃肿,导致大量依赖冲突。您是否考虑通过直接调用火山引擎提供的API文档 https://www.volcengine.com/docs/84313/1254554 来替代他们的Python SDK以实现Embedding功能?这种方式可以大幅降低依赖复杂度。
另外GitHub Copilot提示了一个拼写错误需要修正。
最后请合并dev分支或进行变基操作,当前分支已落后主分支。
ef6f854 to
45d8fea
Compare
您好,我已按照要求修改拼写错误,并且添加image embedding test case。
|
|
ok。我可以同意使用Python SDK。 这个PR还需要修复下CI checks的bug; 另外,希望能先删除image和多模态embedding的部分,因为这个地方需要我们先定义好base类中的接口才可以。并且目前仓库还使用不到多模态的内容。 |
非常感谢您的回复,我预计今晚更新PR |
|
已合并,建议您可以同时在https://github.com/MemTensor/MemOS-Docs中增加相应的文档。 |
* feat: add volcengine embedding support * test: add volcengine embedding test * feat(embedder): add support for ArkEmbedder * fixed: minor typo fix and remove unnecessary function * test: add image embedding test * fixed: remove images embed support and test * fixed: change chunk_size default value
Description
Summary: This PR introduces support for VolcEngine embeddings. The main changes include:
Added a new embedding option to integrate with the VolcEngine service.
Implemented the necessary client logic to handle API requests and responses from VolcEngine.
Included a comprehensive test suite to ensure the functionality and reliability of the new VolcEngine embedding integration.
Checklist: