fix: improve multimodal image processing and message handling #545

dingyi222666 · 2025-09-15T15:36:00Z

Summary

Fixed multimodal image processing workflow by improving message handling
Removed unused base.ts export file and added proper content validation
Enhanced image service with better type compatibility and disposal handling
Bumped image-service version to 1.2.7

Changes

core: Added getMessageContent validation in read_chat_message middleware
image-service: Improved image processing by using separate message objects
image-service: Fixed image content type from 'image' to 'image_url' for compatibility
image-service: Added proper disposal handling for message transformer interceptor
image-service: Enhanced text content handling for both string and complex message types

…age handling - Remove unused base.ts export file - Add getMessageContent import and validation in read_chat_message middleware - Improve image processing workflow by using separate message object - Fix image content type from 'image' to 'image_url' for proper compatibility - Update image prompt templates for better descriptions - Add proper disposal handling for message transformer interceptor - Enhance text content handling for both string and complex message types

coderabbitai · 2025-09-15T15:36:07Z

Walkthrough

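Removes the wildcard re-export of in_memory from model/base.ts; the chat middleware adds a whitespace check based on getMessageContent to its empty-content test and updates the image notice text; the image service refactors its interception and processing flow, using a separate fake message to carry the images and appending text to the original message once a result is generated, and adjusts several prompts and logs.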

Changes

Cohort / File(s)	Summary
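Export cleanup (model base layer) `packages/core/src/llm-core/model/base.ts`	Removes the `export * from './in_memory'` re-export; base.ts no longer re-exports the symbols of `in_memory`.
Chat middleware validation and wording `packages/core/src/middlewares/chat/read_chat_message.ts`	Adds the `getMessageContent` import; adds a `getMessageContent(transformedMessage.content).trim().length < 1` check on top of the existing `content.length < 1` test; changes the image-input notice to reference `chatluna-image-service`; reformats the import across multiple lines.
Image service interception and processing refactor `packages/image-service/src/index.ts`	Removes the wait loop and registers the interceptor directly, with cleanup via `ctx.effect`; on interception, creates a separate `fakeMessage(content: [])` and fills it with `addImageToContent(fakeMessage, imageData.base64Source)` instead of modifying the original message; calls `processImageWithModel` with `fakeMessage` and, if there is a result, appends it to the original message via `addTextToContent(message, '\n\n' + result)`; removes `ensureContentArray`; `addImageToContent` now pushes `type: 'image_url'`; `addTextToContent` supports appending directly to a string; `processImageWithModel` adds debug logging and constructs `HumanMessage` inline; updates the `imagePrompt` (150–400 characters, handles singular and plural) and `imageInsertPrompt` wording; makes minor adjustments to log placement.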

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant MW as Chat middleware
  participant Bot as Downstream handling

  U->>MW: Send message
  MW->>MW: Empty check on transformedMessage.content
  MW->>MW: Empty check on getMessageContent(transformedMessage.content).trim()
  alt Content is empty
    MW-->>Bot: Stop further processing
  else Has content
    MW-->>Bot: Continue with the rest of the flow
  end

sequenceDiagram
  autonumber
  participant Ctx as Koishi context
  participant ISP as Image Service plugin
  participant Img as Image source
  participant Model as Model

  Ctx->>ISP: Register message interceptor (returns disposable)
  note right of ISP: ctx.effect manages cleanup

  Img-->>ISP: Image message matched
  ISP->>ISP: Create fakeMessage(content: [])
  ISP->>ISP: addImageToContent(fakeMessage, base64)
  ISP->>Model: processImageWithModel(fakeMessage)
  note over Model: Constructs HumanMessage inline; emits debug logs
  alt Result generated
    Model-->>ISP: Text result
    ISP->>ISP: addTextToContent(originalMessage, "\n\n" + result)
  else No result
    Model-->>ISP: No output
  end
  ISP-->>Ctx: Continue with later middleware/handling
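
The second diagram can be read as the following minimal sketch. It is not the plugin's actual code: `SimpleMessage`, `ContentPart`, `DescribeImages`, and `describeAndAppend` are hypothetical names, the content types are simplified stand-ins for the LangChain message types, and the model call is passed in as a plain function so the example runs without any dependencies.

```typescript
// Hypothetical, simplified content types (assumption: the real project uses LangChain types).
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image_url'; image_url: { url: string } }
type ContentPart = TextPart | ImagePart

interface SimpleMessage {
    content: string | ContentPart[]
}

// Stand-in for the model call; returns a description or undefined when there is no output.
type DescribeImages = (message: SimpleMessage) => Promise<string | undefined>

// Push an image part using the 'image_url' type that the PR switches to.
function addImageToContent(message: SimpleMessage, base64Source: string): void {
    const parts: ContentPart[] =
        typeof message.content === 'string'
            ? message.content.length > 0
                ? [{ type: 'text', text: message.content }]
                : []
            : message.content
    parts.push({ type: 'image_url', image_url: { url: base64Source } })
    message.content = parts
}

// Append text to either a string body or an array body.
function addTextToContent(message: SimpleMessage, text: string): void {
    if (typeof message.content === 'string') {
        message.content += text
        return
    }
    const last = message.content[message.content.length - 1]
    if (last !== undefined && last.type === 'text') {
        // Merge into the trailing text part instead of creating a new fragment.
        last.text += text
    } else {
        message.content.push({ type: 'text', text })
    }
}

// The flow from the diagram: images go into a separate fake message,
// and only the resulting text is appended to the original message.
async function describeAndAppend(
    original: SimpleMessage,
    base64Images: string[],
    describe: DescribeImages
): Promise<void> {
    const fakeMessage: SimpleMessage = { content: [] }
    for (const source of base64Images) {
        addImageToContent(fakeMessage, source)
    }
    const result = await describe(fakeMessage)
    if (result) {
        addTextToContent(original, '\n\n' + result)
    }
}
```

Because the images never touch `original`, a model that cannot read images downstream only ever sees the appended description.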

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

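I knead the moonlight into line after line of code,
ears pricked up, waiting for the images to tiptoe in.
A fake envelope holds the light and shadow, then is handed on to you,
two lines of text gently attached, murmuring like an evening breeze.
With a thump and a hop, even review turns adorable.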

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
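Title Check	✅ Passed	The title "fix: improve multimodal image processing and message handling" is concise and directly reflects the main goal of the change set: improving the multimodal image processing and message handling logic. It contains no file lists, emoji, or vague wording, so reviewers can quickly grasp the main intent.
Description Check	✅ Passed	The PR description is consistent with the change set and lists the key changes (improved multimodal image processing, removal of the re-export in base.ts, getMessageContent validation in read_chat_message, and the image-service processing and version changes), so it is relevant to the code changes and gives enough context for review.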


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f1616f and 6dc3f4e.

📒 Files selected for processing (1)

packages/core/src/middlewares/chat/read_chat_message.ts (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

packages/core/src/middlewares/chat/read_chat_message.ts

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

packages/core/src/middlewares/chat/read_chat_message.ts (2)

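120-124: Fix file name generation: redundant nullish coalescing, an extra closing brace, and a missing extension in most cases

Taking filename on both sides of ?? is pointless.
The template string has an extra }.
The inferred extension is not appended, which may affect downstream storage/recognition.

Apply the following patch: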

-            let fileName =
-                element.attrs['filename'] ?? element.attrs['filename']
+            let fileName = element.attrs['filename'] as string | undefined
@@
-            if (fileName == null || fileName.length > 50) {
-                fileName = `${await hashString(url, 8)}}`
-            }
+            if (!fileName || fileName.length > 50) {
+                fileName = `${await hashString(url, 8)}.${ext}`
+            }
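
As a runnable illustration of the patched logic, here is a minimal sketch. The `hashString` below is a stand-in written for this example (a small FNV-1a hash), not the project's helper; the only assumption taken from the patch is that the real helper is awaited and returns a short string. `resolveFileName` is a hypothetical name, and `ext` is assumed to have been inferred earlier in the surrounding function.

```typescript
// Stand-in for the project's hashString (assumption: async, returns `length` hex characters).
async function hashString(input: string, length: number): Promise<string> {
    let hash = 0x811c9dc5
    for (let i = 0; i < input.length; i++) {
        hash ^= input.charCodeAt(i)
        hash = Math.imul(hash, 0x01000193)
    }
    // Left-pad to 8 hex digits, then cut to the requested length.
    const hex = ('00000000' + (hash >>> 0).toString(16)).slice(-8)
    return hex.slice(0, length)
}

// Keep a usable file name; otherwise fall back to "<hash>.<ext>".
async function resolveFileName(
    fileName: string | undefined,
    url: string,
    ext: string
): Promise<string> {
    if (!fileName || fileName.length > 50) {
        return `${await hashString(url, 8)}.${ext}`
    }
    return fileName
}
```

Using `!fileName` rather than `fileName == null` also routes an empty string to the hashed fallback.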

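169-174: The audio interceptor breaks the structure when content is an array

message.content += content assumes a string; if content is an array it is implicitly converted to a string, which breaks the message structure. The existing append function should be reused.

Apply the following patch: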

-                    const content = await ctx.sst.audio2text(session)
-                    logger.debug(`audio2text: ${content}`)
-                    message.content += content
+                    const content = await ctx.sst.audio2text(session)
+                    logger.debug(`audio2text: ${content}`)
+                    addMessageContent(message, content)
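
To make the failure mode concrete, the sketch below contrasts `+=` with a structure-preserving append. `Msg`, `Part`, `naiveAppend`, and `appendText` are hypothetical names for this example; `appendText` only illustrates the idea behind a helper such as `addMessageContent`, whose real implementation is not shown in this thread.

```typescript
// Hypothetical, simplified content types for illustration only.
type Part =
    | { type: 'text'; text: string }
    | { type: 'image_url'; image_url: { url: string } }

interface Msg {
    content: string | Part[]
}

// What `+=` does: array content is stringified first, so the parts are lost.
function naiveAppend(message: Msg, text: string): void {
    message.content += text
}

// Structure-preserving alternative: branch on the content shape.
function appendText(message: Msg, text: string): void {
    if (typeof message.content === 'string') {
        message.content += text
    } else {
        message.content.push({ type: 'text', text })
    }
}

const broken: Msg = {
    content: [{ type: 'image_url', image_url: { url: 'data:image/png;base64,AAAA' } }]
}
naiveAppend(broken, 'hi')
// broken.content is now the string "[object Object]hi"

const intact: Msg = {
    content: [{ type: 'image_url', image_url: { url: 'data:image/png;base64,AAAA' } }]
}
appendText(intact, 'hi')
// intact.content is still an array: the image part followed by a text part
```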

🧹 Nitpick comments (5)

packages/core/src/middlewares/chat/read_chat_message.ts (2)
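188-195: Add a timeout to the external HTTP fetch to avoid hanging requests

Consider adding a reasonable timeout (for example, 10s). If ctx.http is compatible with axios options, timeout can be used directly.

Apply the following patch (please confirm that the timeout option is valid in the current Koishi version):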
     const response = await ctx.http(url, {
       responseType: 'arraybuffer',
       method: 'get',
       headers: {
         'User-Agent':
           'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
-      }
+      },
+      timeout: 10000
     })
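
If the `timeout` option turns out not to be supported, a framework-independent fallback is to race the request against a timer. The sketch below is an assumption-laden alternative, not part of the review's patch: `withTimeout` is a hypothetical helper, and it only stops waiting; it does not cancel the underlying request.

```typescript
// Reject if `task` has not settled within `ms` milliseconds.
// Note: this abandons the wait but does not abort the underlying work.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined
    const timeout = new Promise<never>((_resolve, reject) => {
        timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
    })
    return Promise.race([task, timeout]).then(
        (value) => {
            clearTimeout(timer)
            return value
        },
        (error) => {
            clearTimeout(timer)
            throw error
        }
    )
}
```

A call site could then wrap the fetch as `withTimeout(ctx.http(url, options), 10000)`, keeping the 10s budget suggested above.
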
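97-101: Minor wording fix (optional)

"have installed" is grammatically more natural and easier for users to understand.

Apply the following patch: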
-If you are install chatluna-image-service plugin, please ignore this warning.
+If you have installed the chatluna-image-service plugin, you can ignore this warning.
packages/image-service/src/index.ts (3)
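160-161: Replace console.log with the framework logger

This keeps it consistent with the other logs and makes level control and collection easier.

Apply the following patch: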
-    console.log(images)
+    logger.debug('extracted images', images)
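130-137: Add a timeout to the image fetch

Same as above: this avoids hanging external requests.

Apply the following patch (please confirm that the timeout option is valid):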
   const response = await ctx.http(url, {
     responseType: 'arraybuffer',
     method: 'get',
     headers: {
       'User-Agent':
         'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
-    }
+    },
+    timeout: 10000
   })
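122-152: (Optional) Infer the extension from the response Content-Type

Currently the extension is inferred only from the URL, so URLs without an extension or with a query string may fall back to jpeg. Prefer mapping response.headers['content-type'] to png/jpeg.

Let me know if you would like me to add a small MIME→extension mapping table.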
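
A small mapping of the kind offered here could look like the sketch below. The helper name `extFromContentType` and the list of MIME types are illustrative assumptions; the `jpeg` fallback mirrors the behaviour described in the comment above.

```typescript
// Illustrative MIME → extension table; extend as needed.
const MIME_TO_EXT = new Map<string, string>([
    ['image/png', 'png'],
    ['image/jpeg', 'jpeg'],
    ['image/gif', 'gif'],
    ['image/webp', 'webp']
])

// Map a Content-Type header value to a file extension, ignoring parameters
// such as "; charset=..." and falling back when the type is unknown.
function extFromContentType(
    contentType: string | undefined,
    fallback: string = 'jpeg'
): string {
    if (!contentType) {
        return fallback
    }
    const mime = contentType.split(';')[0].trim().toLowerCase()
    return MIME_TO_EXT.get(mime) ?? fallback
}
```

The URL-based inference can stay as a second choice when the header is missing or unrecognised.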

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef80715 and 9f1616f.

⛔ Files ignored due to path filters (1)

packages/image-service/package.json is excluded by !**/*.json

📒 Files selected for processing (3)

packages/core/src/llm-core/model/base.ts (0 hunks)
packages/core/src/middlewares/chat/read_chat_message.ts (3 hunks)
packages/image-service/src/index.ts (6 hunks)

💤 Files with no reviewable changes (1)

packages/core/src/llm-core/model/base.ts

🧰 Additional context used

🧬 Code graph analysis (2)

packages/core/src/middlewares/chat/read_chat_message.ts (1)

packages/core/src/utils/string.ts (1)

getMessageContent (91-107)

packages/image-service/src/index.ts (1)

packages/core/src/types.ts (1)

Message (53-67)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: build
GitHub Check: lint
GitHub Check: build

🔇 Additional comments (5)

packages/core/src/middlewares/chat/read_chat_message.ts (2)

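214-221: Switching to the image_url structure for pushing images LGTM

Aligned with the downstream isMessageContentImageUrl; the structure is correct.

238-252: Text append logic handles both string and composite messages LGTM

Appends text robustly in mixed-content scenarios.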

packages/image-service/src/index.ts (3)

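35-70: Using a "fake message" to carry the images and backfilling the text on completion is a sound design

It does not pollute the original message, and the interceptor is cleaned up correctly via ctx.effect(() => disposable).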
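
The cleanup pattern mentioned here can be illustrated without Koishi. In the sketch below, `InterceptorRegistry` and `Scope` are hypothetical stand-ins written for this example: `intercept` returns a disposer, and `Scope.effect` merely mimics the idea of handing that disposer to a scope that calls it on teardown. It makes no claim about how `ctx.effect` is implemented.

```typescript
type Disposer = () => void
type Interceptor = (message: { content: string }) => void

// Hypothetical registry: registering returns a function that unregisters.
class InterceptorRegistry {
    private interceptors = new Set<Interceptor>()

    intercept(fn: Interceptor): Disposer {
        this.interceptors.add(fn)
        return () => {
            this.interceptors.delete(fn)
        }
    }

    run(message: { content: string }): void {
        this.interceptors.forEach((fn) => fn(message))
    }
}

// Hypothetical scope: collects disposers and runs them in reverse order on dispose.
class Scope {
    private disposers: Disposer[] = []

    effect(setup: () => Disposer): void {
        this.disposers.push(setup())
    }

    dispose(): void {
        for (const dispose of this.disposers.splice(0).reverse()) {
            dispose()
        }
    }
}

const registry = new InterceptorRegistry()
const scope = new Scope()

// Registration and cleanup are declared in one place.
scope.effect(() =>
    registry.intercept((message) => {
        message.content += ' [seen]'
    })
)
```

Once `scope.dispose()` runs, the interceptor no longer fires, so a reloaded plugin does not leave a stale handler behind.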

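184-191: The image_url structure integrates with LangChain/downstream detection functions LGTM

Types and extraction logic are consistent; compatibility is good.

193-210: Text backfill strategy LGTM

Prefers merging into the trailing text block, which avoids fragmentation.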

packages/core/src/middlewares/chat/read_chat_message.ts

Change message content validation from OR to AND condition and add trim() to properly handle empty messages with whitespace content.

dingyi222666 added 2 commits September 15, 2025 23:31

chore(image-service): bump version to 1.2.7

9f1616f

coderabbitai bot reviewed Sep 15, 2025

View reviewed changes

packages/core/src/middlewares/chat/read_chat_message.ts Show resolved Hide resolved

fix(core): improve message validation logic with AND condition

6dc3f4e

Change message content validation from OR to AND condition and add trim() to properly handle empty messages with whitespace content.
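
The exact expression in read_chat_message.ts is not shown in this thread, so the sketch below does not reproduce the project's code. It only compares how `||` and `&&` combine the two checks named in the walkthrough on a few hypothetical inputs. `getTextContent` is a stand-in for `getMessageContent`, under the assumption that the real helper returns the concatenated text of a message.

```typescript
// Hypothetical, simplified content types for illustration only.
type Part =
    | { type: 'text'; text: string }
    | { type: 'image_url'; image_url: { url: string } }
type Content = string | Part[]

// Stand-in for getMessageContent (assumption: it extracts the text of a message).
function getTextContent(content: Content): string {
    if (typeof content === 'string') {
        return content
    }
    let text = ''
    for (const part of content) {
        if (part.type === 'text') {
            text += part.text
        }
    }
    return text
}

// "Empty" if either check fires.
function isEmptyWithOr(content: Content): boolean {
    return content.length < 1 || getTextContent(content).trim().length < 1
}

// "Empty" only if both checks fire.
function isEmptyWithAnd(content: Content): boolean {
    return content.length < 1 && getTextContent(content).trim().length < 1
}

const imageOnly: Content = [
    { type: 'image_url', image_url: { url: 'data:image/png;base64,AAAA' } }
]
```

For `imageOnly`, the `||` form reports empty (there is no text) while the `&&` form does not (the array has one part), which is the case that matters for multimodal input. A whitespace-only string is reported empty by the `||` form only.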

dingyi222666 merged commit 802499c into v1-dev Sep 15, 2025
8 checks passed

dingyi222666 deleted the fix/multimodal-image-read branch September 15, 2025 16:00

coderabbitai bot mentioned this pull request Sep 16, 2025

[Refactor] Reactive system implementation, better reload implementation, reduced async propagation #548

Merged

coderabbitai bot mentioned this pull request Oct 5, 2025

[Feature] Add GIF animation support and improve image handling #572

Merged

coderabbitai bot mentioned this pull request Oct 18, 2025

[Fix] Add error handling for image reading failures #603

Merged

This was referenced Nov 30, 2025

[Fix] Optimize GIF frame extraction with incremental decoding #637

Merged

[Feature] Add DeepSeek reasoning content support #641

Merged

[Feature] Support Dify image uploads with data URL normalizatio #653

Merged
