fix&refactor(provider): 集成 MIME 三级回退检测与 client 生命周期防御#7040
fix&refactor(provider): 集成 MIME 三级回退检测与 client 生命周期防御#7040Tz-WIND wants to merge 8 commits intoAstrBotDevs:masterfrom
Conversation
Refactor OpenAI source code by removing unused methods and improving image handling logic. Ensure client initialization is more robust and consistent.
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The base64 sampling logic and constants for header inspection (
_HEADER_READ_SIZE/_BASE64_SAMPLE_CHARS) are now duplicated betweenimage_utils.encode_image_to_base64_urlandOpenAIProvider._image_ref_to_data_url; consider centralizing this inimage_utilsto avoid drift and keep MIME detection consistent. encode_image_to_base64_urlis declared asasyncbut only performs synchronous file I/O; either make it synchronous (and call it accordingly) or switch to non-blocking file access to avoid blocking the event loop in hot paths.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The base64 sampling logic and constants for header inspection (`_HEADER_READ_SIZE` / `_BASE64_SAMPLE_CHARS`) are now duplicated between `image_utils.encode_image_to_base64_url` and `OpenAIProvider._image_ref_to_data_url`; consider centralizing this in `image_utils` to avoid drift and keep MIME detection consistent.
- `encode_image_to_base64_url` is declared as `async` but only performs synchronous file I/O; either make it synchronous (and call it accordingly) or switch to non-blocking file access to avoid blocking the event loop in hot paths.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="476-485" />
<code_context>
+ def _ensure_client(self) -> None:
</code_context>
<issue_to_address>
**issue (bug_risk):** Reliance on the OpenAI client’s private `_client.is_closed` is brittle and may silently stop working on SDK upgrades.
Accessing `_client.is_closed` remains fragile: other SDK versions (or Azure) may not expose `_client`/`is_closed`, and the `AttributeError` path currently fails silently, hiding subtle issues. Please at least log when this attribute access fails, and consider tracking the closed state yourself (e.g., a wrapper or flag updated in `terminate`) instead of relying on private internals.
</issue_to_address>
### Comment 2
<location path="astrbot/core/utils/image_utils.py" line_range="67-76" />
<code_context>
+async def encode_image_to_base64_url(image_url: str) -> str:
</code_context>
<issue_to_address>
**suggestion (performance):** This async API does only blocking file I/O; either make it sync or use non-blocking I/O to match the signature.
Because this function is `async` but only does synchronous file I/O (via `open()` / `read()`), it will block the event loop under load. Please either make it a regular synchronous function and call it as such, or switch to a truly async file I/O library (e.g., `aiofiles`) so the async signature is accurate and non-blocking.
Suggested implementation:
```python
def encode_image_to_base64_url(image_url: str) -> str:
```
To fully apply this change, you'll also need to:
1. Update all call sites to `encode_image_to_base64_url` to call it as a regular synchronous function (remove any `await` and, if needed, wrap in `run_in_executor` or similar if you must keep callers async and avoid blocking their event loop).
2. If this function is part of a public async API, consider documenting that it is now synchronous, or provide an async wrapper that runs it in a thread pool for compatibility.
</issue_to_address>
### Comment 3
<location path="astrbot/core/provider/sources/openai_source.py" line_range="43" />
<code_context>
from ..register import register_provider_adapter
+# SVG 检测所需的最小头部字节数。
+# 其他二进制格式的魔术字节最多 16 字节,SVG 需要更多以跳过可能的 XML 声明。
+_HEADER_READ_SIZE = 256
</code_context>
<issue_to_address>
**issue (complexity):** Consider centralizing the base64/MIME detection logic and refactoring `_ensure_client` into a smaller, helper-based design to reduce duplication and branching complexity while preserving behavior.
Two areas look worth simplifying without losing any behavior: the base64/MIME handling and `_ensure_client`.
---
### 1. Deduplicate base64 header sampling & MIME detection
You now have:
- `_BASE64_SAMPLE_CHARS` defined here.
- Very similar base64 sampling/decoding + `detect_image_mime_type` logic that already exists in `image_utils.encode_image_to_base64_url` for `base64://` inputs.
You can centralize this in `image_utils` and make `_image_ref_to_data_url` a thin wrapper. For example:
**In `image_utils.py` (or wherever appropriate):**
```python
# image_utils.py
_HEADER_READ_SIZE = 256
_BASE64_SAMPLE_CHARS = 344 # keep single source of truth
def detect_mime_type_from_base64_str(raw_b64: str, source_hint: str = "") -> str:
sample = raw_b64[:_BASE64_SAMPLE_CHARS]
missing_padding = len(sample) % 4
if missing_padding:
sample += "=" * (4 - missing_padding)
try:
header_bytes = base64.b64decode(sample)
except Exception:
logger.debug(
"[%s] base64 解码失败,硬编码回退为 image/jpeg",
source_hint or "base64",
)
return "image/jpeg"
mime_type = detect_image_mime_type(header_bytes)
logger.debug(
"[%s] 魔术字节检测命中,识别为 %s 格式",
source_hint or "base64",
mime_type,
)
return mime_type
```
**Then in your adapter:**
```python
from astrbot.core.utils.image_utils import detect_mime_type_from_base64_str
async def _image_ref_to_data_url(
self,
image_ref: str,
*,
mode: Literal["safe", "strict"] = "safe",
) -> str | None:
if image_ref.startswith("base64://"):
raw_b64 = image_ref[len("base64://"):]
mime_type = detect_mime_type_from_base64_str(raw_b64, source_hint="base64://")
return f"data:{mime_type};base64,{raw_b64}"
# other branches unchanged...
```
This removes the local `_BASE64_SAMPLE_CHARS`, removes duplicated try/except, and keeps the existing “magic bytes → default jpeg on decode error” behavior in a single place.
If `encode_image_to_base64_url` also handles `base64://` inputs, you can make it call `detect_mime_type_from_base64_str` too, so all base64 MIME detection flows are unified.
---
### 2. Simplify client lifecycle guarding & hide private-attribute access
`_ensure_client` mixes a flag, nested conditionals, and a try/except around a private attribute. You can keep all behavior but make the flow clearer by extracting the “is closed” check and using an early-return pattern.
For example:
```python
def _is_underlying_client_closed(self) -> bool:
"""集中处理对 openai 私有属性的访问,便于未来替换为公开 API。"""
try:
# 依赖 openai SDK 当前结构: client._client.is_closed
return bool(self.client and self.client._client.is_closed)
except AttributeError:
return False
```
Then `_ensure_client` becomes more linear:
```python
def _ensure_client(self) -> None:
if self.client is None or self._is_underlying_client_closed():
logger.warning("检测到 OpenAI client 已关闭或未初始化,正在重新创建...")
self.client = self._create_openai_client()
self.default_params = inspect.signature(
self.client.chat.completions.create,
).parameters.keys()
```
This keeps:
- The “auto-reinit if `client` is `None` or underlying HTTP client is closed” behavior.
- The warning log and `default_params` refresh.
But it moves the fragile `_client.is_closed` access into a single helper that’s easy to audit/replace when the OpenAI SDK exposes a public way to check closure, and reduces branching complexity in `_ensure_client` itself.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request centralizes image processing by introducing a new utility module for robust MIME type detection and base64 encoding, replacing previous hardcoded "image/jpeg" logic across the codebase. It also significantly improves the OpenAI provider's reliability by implementing a client re-initialization mechanism and safer termination logic. Review feedback identifies a performance concern regarding synchronous file I/O within an asynchronous utility function and suggests narrowing exception handling during base64 header decoding to prevent masking unrelated errors.
Refactor image utility functions to support async operations and improve MIME type detection. Enhance base64 encoding logic for better compatibility with various image formats.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- The
_HEADER_READ_SIZE/_BASE64_SAMPLE_CHARSconstants and base64-sampling logic are now duplicated betweenopenai_source.pyandimage_utils.py; consider centralizing these inimage_utilsand reusing them to avoid divergence in future MIME-detection behavior. - In
detect_mime_type_from_base64_str, the debug log always states "魔术字节检测命中" even whendetect_image_mime_typefalls back toimage/jpegas a default; you may want to mirror the adapter’s heuristic (distinguish real JPEG vs. fallback) so logs clearly indicate when detection actually failed and a default was used.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `_HEADER_READ_SIZE` / `_BASE64_SAMPLE_CHARS` constants and base64-sampling logic are now duplicated between `openai_source.py` and `image_utils.py`; consider centralizing these in `image_utils` and reusing them to avoid divergence in future MIME-detection behavior.
- In `detect_mime_type_from_base64_str`, the debug log always states "魔术字节检测命中" even when `detect_image_mime_type` falls back to `image/jpeg` as a default; you may want to mirror the adapter’s heuristic (distinguish real JPEG vs. fallback) so logs clearly indicate when detection actually failed and a default was used.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="186-195" />
<code_context>
+ def _detect_mime_type_with_fallback(
</code_context>
<issue_to_address>
**issue (bug_risk):** 严格模式下不再对无效图片抛错,导致 `mode="strict"` 语义被弱化
此前 `mode="strict"` 下,`_encode_image_file_to_data_url` 在 `PILImage.open(...); image.verify()` 失败时会抛出 `ValueError("Invalid image file: ...")`,方便调用方识别“文件不是合法图片”。现在无论检测成败都会回退到 `image/jpeg`,且 `mode` 不再参与分支:
- 非法/损坏图片会被编码成 `data:image/jpeg;base64,...`,直到后续 API 报错才暴露问题;
- 依赖 strict 模式做提前校验的调用方将无法区分 safe/strict 行为。
建议:
- strict 模式保留“失败即抛错”的语义(例如在魔数+PIL 都失败时抛异常,而不是静默回退);或者
- 若不再需要 strict 语义,则移除 `mode` 参数,避免接口签名与实际行为不符。
</issue_to_address>
### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="50-53" />
<code_context>
+# 其他二进制格式的魔术字节最多 16 字节,SVG 需要更多以跳过可能的 XML 声明。
+_HEADER_READ_SIZE = 256
+
+# 对应 _HEADER_READ_SIZE 字节所需的 base64 字符数。
+# base64 每 3 字节编码为 4 字符,向上取整后再补充少量余量保证填充对齐。
+# ceil(256 / 3) * 4 = 344
+_BASE64_SAMPLE_CHARS = 344
+
+
</code_context>
<issue_to_address>
**suggestion:** 本地定义的 `_BASE64_SAMPLE_CHARS` 未被使用且与 utils 中的常量重复
该常量在当前文件未被使用,且与 `astrbot/core/utils/image_utils.py` 中的同名逻辑重复,容易引发 lint 警告和后续维护不一致。建议:若不需要则直接删除;若需要类似逻辑,改为复用 `image_utils` 中统一维护的常量,避免重复“魔术数字”。
```suggestion
```
</issue_to_address>
### Comment 3
<location path="astrbot/core/provider/sources/openai_source.py" line_range="467-476" />
<code_context>
+ http_client=self._create_http_client(self.provider_config),
+ )
+
+ def _is_underlying_client_closed(self) -> bool:
+ """集中处理对 openai SDK 私有属性的访问,便于未来替换为公开 API。
+
+ 注意:此处直接访问了 openai 库的私有属性 `_client`,
+ 依赖其内部实现(httpx.AsyncClient 实例暴露的 is_closed 属性)。
+ 这一做法存在脆弱性——若 openai 库未来版本调整了内部结构,
+ 此处可能在没有任何报错的情况下静默失效。
+ 目前 openai SDK 尚未提供检查底层连接是否已关闭的公开 API。
+ 若未来 SDK 提供了类似 self.client.is_closed() 的公开方法,
+ 应及时将此处替换为对应的公开接口。
+ """
+ try:
+ return bool(self.client and self.client._client.is_closed)
+ except AttributeError:
+ return False
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** 当 OpenAI SDK 内部结构变更时静默返回 False 可能掩盖 client 已关闭的问题
当前 `AttributeError` 分支直接返回 `False`,一旦 SDK 私有结构变更,这里会悄悄失效,导致关闭的 client 被视为“未关闭”,影响重建时机且难以排查。建议在 `AttributeError` 分支中至少打一个 `logger.warning`/`debug`,提示“无法检测 client 是否关闭,可能是 SDK 结构变更”;也可以改为保守返回 `True`,在检测失败时强制重建 client(视重建成本再权衡)。
Suggested implementation:
```python
def _is_underlying_client_closed(self) -> bool:
"""集中处理对 openai SDK 私有属性的访问,便于未来替换为公开 API。
注意:此处直接访问了 openai 库的私有属性 `_client`,
依赖其内部实现(httpx.AsyncClient 实例暴露的 is_closed 属性)。
这一做法存在脆弱性——若 openai 库未来版本调整了内部结构,
此处可能在没有任何报错的情况下静默失效。
目前 openai SDK 尚未提供检查底层连接是否已关闭的公开 API。
若未来 SDK 提供了类似 self.client.is_closed() 的公开方法,
应及时将此处替换为对应的公开接口。
当检测逻辑因 SDK 内部结构变更而抛出 AttributeError 时,会:
1. 记录 warning 日志,提示可能的 SDK 变更;
2. 保守地视为“已关闭”,触发后续的 client 重建逻辑。
"""
try:
return bool(self.client and self.client._client.is_closed)
except AttributeError:
logger = logging.getLogger(__name__)
logger.warning(
"无法检测 OpenAI client 是否已关闭,可能是 SDK 内部结构变更导致;"
"将保守视为已关闭并触发 client 重建。"
)
# 保守策略:在无法可靠检测时,视为已关闭,避免持续使用潜在无效的 client。
return True
```
1. 如果本文件尚未导入 `logging`,需要在文件顶部增加 `import logging`,或复用项目中既有的 logger(例如模块级 `logger = logging.getLogger(__name__)`),并在这里直接使用该 logger。
2. 确认调用 `_is_underlying_client_closed()` 的地方在返回 `True` 时确实会重建 client,以充分利用保守策略带来的安全性。
</issue_to_address>
### Comment 4
<location path="astrbot/core/provider/sources/openai_source.py" line_range="48" />
<code_context>
+# SVG 检测所需的最小头部字节数。
+# 其他二进制格式的魔术字节最多 16 字节,SVG 需要更多以跳过可能的 XML 声明。
+_HEADER_READ_SIZE = 256
+
+# 对应 _HEADER_READ_SIZE 字节所需的 base64 字符数。
</code_context>
<issue_to_address>
**issue (complexity):** Consider removing the custom MIME-detection logic in this adapter and delegating image handling to the shared image_utils helpers to avoid duplication and keep behavior consistent.
You can drop most of the new image MIME-handling logic in this adapter and delegate to the shared utilities, which will reduce duplication and complexity while keeping behavior consistent across providers.
### 1. Remove `_detect_mime_type_with_fallback` and the local header constants
`_detect_mime_type_with_fallback` plus `_HEADER_READ_SIZE` / `_BASE64_SAMPLE_CHARS` re-implement the detection pipeline already centralized in `image_utils.py`. If no other code in this file uses these constants, you can delete:
```python
# at top of file
_HEADER_READ_SIZE = 256
_BASE64_SAMPLE_CHARS = 344
```
and the whole `_detect_mime_type_with_fallback` method.
Instead, rely on `encode_image_to_base64_url` (see below) which already uses `detect_image_mime_type` and other fallbacks.
### 2. Delegate `_encode_image_file_to_data_url` to `encode_image_to_base64_url`
Import the high-level helper:
```python
from astrbot.core.utils.image_utils import (
detect_image_mime_type,
detect_mime_type_from_base64_str,
encode_image_to_base64_url,
)
```
Then simplify `_encode_image_file_to_data_url` to delegate directly:
```python
@classmethod
def _encode_image_file_to_data_url(
cls,
image_path: str,
*,
mode: Literal["safe", "strict"],
) -> str | None:
# Delegate to shared utility for:
# - file reading
# - header sampling
# - MIME detection (incl. SVG/AVIF/HEIC)
# - data URL assembly
return encode_image_to_base64_url(image_path, mode=mode)
```
This preserves all functionality (and likely improves it, since `image_utils` supports more formats and is used by other providers) while removing:
- manual `Path.read_bytes`
- the custom MIME map
- the custom fallback chain.
Your `base64://` branch in `_image_ref_to_data_url` already uses `detect_mime_type_from_base64_str`, so after this change both `base64://` and file paths will go through the same shared detection logic, reducing divergence and making future fixes centralized.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Refactor image MIME type detection and encoding logic. Introduce new method to encode image files to base64 Data URLs with improved error handling and fallback mechanisms.
Modifications / 改动点
MIME 类型检测采用三级回退策略:
各检测阶段均输出 debug 级别日志,记录来源与识别结果。
strict 模式下保留"失败即抛错"语义:当魔术字节与 PIL均无法识别图片格式时,抛出 ValueError 而非静默回退,确保 mode="strict" 的调用方能及时感知无效图片。
base64 输入的 MIME 检测统一委托给detect_mime_type_from_base64_str(image_utils),移除适配器内本地的 _BASE64_SAMPLE_CHARS 常量和重复的采样/解码/异常处理代码,所有 base64 MIME 检测流程收敛到单一实现。
移除适配器内的 _detect_mime_type_with_fallback 方法,三级回退逻辑内联于 _encode_image_file_to_data_url 中,与 mode 参数的 safe/strict 分支自然融合。
新增 _create_openai_client() 将 client 初始化逻辑解耦,新增 _ensure_client() 在每次使用前检查 client 状态,若为 None 或底层连接已关闭则自动重建,提供多层防御。erminate() 通过 try/finally 确保 client 引用始终被清空,避免配置重载期间复用已关闭 client 导致 APIConnectionError。
提取 _is_underlying_client_closed() 辅助方法,将对openai SDK 私有属性 _client.is_closed 的访问隔离到单一审计点。当 SDK 内部结构变更导致 AttributeError 时,记录 warning 日志并保守返回 True 触发 client 重建,避免静默失效导致持续使用已关闭的 client。
base64 解码处的异常捕获收窄为(binascii.Error, ValueError),避免过于宽泛的except Exception 掩盖意料之外的错误。
encode_image_to_base64_url 中文件路径分支的同步文件 I/O 通过 run_in_executor 移至线程池执行,避免高并发场景下阻塞 asyncio 事件循环。
ProviderRequest.assemble_context 补充 debug 级别日志,记录图片下载来源、编码引用路径及最终 Data URL 前缀。
经排查发现,实际请求流程中图片消息的组装入口是ProviderRequest.assemble_context(entities.py),而非 openai_source.py 中的同名方法——后者仅在直接调用Provider 适配器时才会触发。因此之前仅针对openai_source 的 MIME 改进在主流程中未能实际生效,本次已将相同的检测逻辑同步覆盖至 entities.py,确保两条路径均使用统一的公共工具函数。
注:_is_underlying_client_closed 中通过self.client._client.is_closed 访问了 openai 库的私有属性,依赖其内部实现,已添加注释说明此依赖关系,待 SDK 提供公开 API 后应及时替换。
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Improve image MIME type detection and OpenAI client lifecycle handling in the provider layer.
New Features:
Bug Fixes:
Enhancements:
Summary by Sourcery
Improve image MIME type handling and OpenAI client lifecycle management in the provider layer.
New Features:
Bug Fixes:
Enhancements: