
Conversation

@dingyi222666
Member

This PR introduces model availability testing functionality and improves token usage tracking across multiple adapters.

New Features

  • Model Test Command: Add chatluna.model.test command for verifying model availability and connectivity

    • Test models by sending a simple "Hello" request
    • Validate platform adapter existence and availability
    • Report response time and sample output
    • Support both full model names (e.g., openai/gpt-3.5-turbo) and platform names
    • Bilingual support (English and Chinese) with comprehensive error messages
  • Enhanced Token Usage Tracking:

    • Add streaming token usage metadata support for Gemini, Qwen, and Zhipu adapters
    • Standardize token usage reporting across shared-adapter requester
    • Improve token usage logging in ChatLunaChatModel for both streaming and non-streaming modes
  • New Model Support:

    • Add Claude Sonnet 4.5 model (claude-sonnet-4-5-20250929)
    • Add GLM-4.6 model with 200k context window

Bug Fixes

  • Fix OpenAI tools agent output parser to prioritize additional_kwargs.tool_calls over regex parsing for more reliable tool call detection

Other Changes

  • Enhance agent executor debugging with better step logging
  • Update Claude max tokens from 2M to 200k for consistency
  • Improve error handling and user feedback across model testing workflows

These improvements provide better visibility into token consumption, enhance model compatibility, and make it easier for users to verify their model configurations and diagnose connectivity issues.
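For reviewers new to the streaming-usage change: the requesters now emit the provider's usage payload as a standalone chunk the moment it arrives, instead of attaching it after the message. A minimal sketch of the pattern, assuming an OpenAI-compatible shape for the parsed SSE chunk (the surrounding generator loop is simplified; the chunk construction mirrors the adapter code reviewed below):

import { ChatGenerationChunk } from '@langchain/core/outputs'
import { AIMessageChunk } from '@langchain/core/messages'

// Inside a requester's streaming loop (sketch; `data` is the parsed SSE chunk):
if (data.usage) {
    // Emit an empty-message chunk that carries only tokenUsage, then skip
    // delta parsing for this chunk so usage is not double-counted.
    yield new ChatGenerationChunk({
        message: new AIMessageChunk(''),
        text: '',
        generationInfo: {
            tokenUsage: {
                promptTokens: data.usage.prompt_tokens ?? 0,
                completionTokens: data.usage.completion_tokens ?? 0,
                totalTokens: data.usage.total_tokens ?? 0
            }
        }
    })
    continue
}

Downstream, ChatLunaChatModel aggregates these chunks, so usage reported mid-stream and usage reported at stream end land in the same place.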

Add a new `chatluna.model.test` command that allows users to test whether a specific model or adapter is available and working correctly. This command sends a simple test request to the model and reports the response time and result.

Features:
- Test model availability by sending a "Hello" test message
- Validate platform adapter existence and availability
- Report response time and sample output
- Support both full model names (e.g., openai/gpt-3.5-turbo) and platform names (e.g., openai)
- Comprehensive error handling with user-friendly messages
- Bilingual support (English and Chinese)

This enhancement helps users quickly verify their model configurations and diagnose connectivity issues.
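For example, both invocation forms look like this (hypothetical chat invocations; the actual prefix depends on the bot's Koishi command configuration):

chatluna.model.test openai/gpt-3.5-turbo
chatluna.model.test openai

The first form tests one specific model; the second picks a model from the named platform's model list.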
Enhance token usage tracking across multiple adapters and improve debugging capabilities for agent execution.

Changes:
- Add streaming token usage metadata support for Gemini, Qwen, and Zhipu adapters
- Add new Claude Sonnet 4.5 model (claude-sonnet-4-5-20250929)
- Add GLM-4.6 model support with 200k context window
- Improve token usage logging in ChatLunaChatModel for both streaming and non-streaming modes
- Fix OpenAI tools agent output parser to prioritize additional_kwargs.tool_calls
- Enhance agent executor debugging with better step logging
- Standardize token usage reporting across shared-adapter requester
- Update Claude max tokens from 2M to 200k for consistency

These improvements provide better visibility into token consumption and enhance model compatibility across the platform.
@gemini-code-assist
Contributor

Summary of Changes

Hello @dingyi222666, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and observability of the system by introducing a dedicated command for testing model availability and enhancing token usage tracking across various adapters. These changes aim to provide users with clearer insights into their model configurations and consumption, while also expanding model compatibility and refining debugging capabilities.

Highlights

  • Model Testing Command: Introduced a new chatluna.model.test command to verify model availability and connectivity, supporting both full model names (e.g., openai/gpt-3.5-turbo) and platform names (e.g., openai), with bilingual error messages.
  • Enhanced Token Usage Tracking: Implemented streaming token usage metadata support for Gemini, Qwen, and Zhipu adapters, and standardized token usage reporting across shared adapters and the ChatLunaChatModel for both streaming and non-streaming modes.
  • New Model Support: Added Claude Sonnet 4.5 (claude-sonnet-4-5-20250929) and GLM-4.6 (with a 200k context window) models, expanding the available model options.
  • OpenAI Tools Agent Fix: Corrected the OpenAI tools agent output parser to reliably detect tool calls by prioritizing additional_kwargs.tool_calls over regex parsing.
  • Improved Debugging & Error Handling: Enhanced agent executor debugging with better step logging and updated Claude's max tokens from 2M to 200k for consistency.

@coderabbitai
Contributor

coderabbitai bot commented Oct 10, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (26)
  • packages/adapter-azure-openai/package.json is excluded by !**/*.json
  • packages/adapter-claude/package.json is excluded by !**/*.json
  • packages/adapter-deepseek/package.json is excluded by !**/*.json
  • packages/adapter-dify/package.json is excluded by !**/*.json
  • packages/adapter-doubao/package.json is excluded by !**/*.json
  • packages/adapter-gemini/package.json is excluded by !**/*.json
  • packages/adapter-hunyuan/package.json is excluded by !**/*.json
  • packages/adapter-ollama/package.json is excluded by !**/*.json
  • packages/adapter-openai-like/package.json is excluded by !**/*.json
  • packages/adapter-openai/package.json is excluded by !**/*.json
  • packages/adapter-qwen/package.json is excluded by !**/*.json
  • packages/adapter-rwkv/package.json is excluded by !**/*.json
  • packages/adapter-spark/package.json is excluded by !**/*.json
  • packages/adapter-wenxin/package.json is excluded by !**/*.json
  • packages/adapter-zhipu/package.json is excluded by !**/*.json
  • packages/core/package.json is excluded by !**/*.json
  • packages/extension-long-memory/package.json is excluded by !**/*.json
  • packages/extension-mcp/package.json is excluded by !**/*.json
  • packages/extension-tools/package.json is excluded by !**/*.json
  • packages/extension-variable/package.json is excluded by !**/*.json
  • packages/renderer-image/package.json is excluded by !**/*.json
  • packages/service-embeddings/package.json is excluded by !**/*.json
  • packages/service-image/package.json is excluded by !**/*.json
  • packages/service-search/package.json is excluded by !**/*.json
  • packages/service-vector-store/package.json is excluded by !**/*.json
  • packages/shared-adapter/package.json is excluded by !**/*.json

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory by removing the pattern from both lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This change set: updates the model lists and maxTokens of several adapters, standardizes streaming responses to emit tokenUsage earlier, adds and exports a type for streaming usage, introduces the test_model command and middleware, refactors OpenAI tool-call parsing, and removes the knowledge feature from extension-tools (config, plugin, and implementation).

Changes

Cohort / File(s) Change Summary
Claude model list
packages/adapter-claude/src/client.ts
Adds the claude-sonnet-4-5-20250929 model; refreshed model list entries now include type: ModelType.llm; modelInfo construction now uses the current info; lowers maxTokens for all models from 2,000,000 to 200_000 and adjusts the related token-limit calculations.
Zhipu model list
packages/adapter-zhipu/src/client.ts
Adds GLM-4.6 (maxTokens 200_000) to rawModels; minor formatting tweaks.
Shared adapter: early streaming usage emission
packages/shared-adapter/src/requester.ts
When a response chunk contains data.usage, immediately yields an empty-message ChatGenerationChunk carrying tokenUsage and continues; removes the old logic that emitted the usage chunk after the message.
Zhipu requester: streaming usage emission
packages/adapter-zhipu/src/requester.ts
In completionStreamInternal, when data.usage is detected, first emits an empty-message tokenUsage chunk before processing the remaining choices; adds an import for AIMessageChunk.
Qwen requester: usage and logging
packages/adapter-qwen/src/requester.ts
Emits a ChatGenerationChunk containing tokenUsage (pushing an empty AIMessageChunk alongside) when the response carries usage, and switches reasoning-content logging from console.log to logger.debug (new import).
Gemini: streaming usage and exposed types
packages/adapter-gemini/src/requester.ts, packages/adapter-gemini/src/types.ts
Recognizes and parses usageMetadata in the streaming pipeline, computes prompt/completion/total tokens and emits them through the stream; adds and exports ChatUsageMetadataPart and merges it into the ChatPart union type.
Platform layer: collect and log token usage
packages/core/src/llm-core/platform/model.ts
Introduces and populates latestTokenUsage / generationInfo.tokenUsage on both the streaming and non-streaming generation paths; logs when usage is available at stream end or on the non-streaming path; when no usage is reported, attempts to compute and fill it in.
New command and middleware: test_model
packages/core/src/commands/model.ts, packages/core/src/middleware.ts, packages/core/src/middlewares/model/test_model.ts
Adds the chatluna.model.test <model> command and registers the test_model middleware: it parses the platform/model, lists and selects a model, waits for the client to be ready, creates a model instance, tests it with invoke("Hello", maxTokens=10, timeout=60s), times the call, and reports the result through the session; adds type declarations exposing the middleware name and an optional model field.
Agent executor logging condition
packages/core/src/llm-core/agent/executor.ts
Broadens the logging trigger (adds newSteps.length === 0); logs the full output object instead of only actions.
OpenAI output parsing (tool calls)
packages/core/src/llm-core/agent/openai/output_parser.ts
Reads tool calls preferentially from message.additional_kwargs.tool_calls, building the tool name and input from function.name and function.arguments (JSON-parsed); keeps the legacy-field fallback branch and strengthens error handling and log messages.
Gemini / other adapters: types and exports
packages/adapter-gemini/src/types.ts, ...adapter-*/src/requester.ts
Adds and exports ChatUsageMetadataPart in the Gemini types, and consumes/propagates this usage type across several adapters.
Remove the knowledge feature (extension-tools)
packages/extension-tools/src/config.ts, packages/extension-tools/src/plugin.ts, packages/extension-tools/src/plugins/knowledge.ts
Removes the knowledge-related fields from the config interface and schema; removes the knowledge import and registration from the plugin; deletes plugins/knowledge.ts entirely (KnowledgeTool, createSearchChain, apply, etc.).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant C as Command system
  participant MW as Middleware test_model
  participant P as Platform client
  participant M as Model instance

  U->>C: Trigger `chatluna.model.test <model>`
  C->>MW: Forward the test_model action
  MW->>P: Get/await the platform client and model list
  alt Only a platform name was given
    P-->>MW: Return the model list
    MW->>MW: Pick a model (randomly or by selection)
  end
  MW->>M: invoke("Hello", maxTokens=10, timeout=60s)
  M-->>MW: Return a response or an error
  MW-->>U: Report the test result (content / no content / error) and the elapsed time

sequenceDiagram
  autonumber
  participant Prov as Provider API
  participant Req as Requester (shared/adapter)
  participant C as Consumer (platform layer/upstream)

  Prov-->>Req: Send streaming chunks (may include usage)
  alt Chunk contains usage
    Req-->>C: First emit ChatGenerationChunk (tokenUsage, empty message)
    Req->>Req: Skip delta parsing for this chunk
  else Regular delta/content
    Req-->>C: Emit message delta / choice delta (text and packed info)
  end
  loop Until complete
    Prov-->>Req: More chunks
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

I'm the little code rabbit, sniffing changes in the air,
New models and usage reports, the stream tastes tokens with care.
A test call echoes right away; the old knowledge branch is set down there. 🐰✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Title Check: ✅ Passed. The title accurately highlights the two core improvements, model testing and token-usage tracking; it matches the change set's main goals and is concise and clear.
  • Description Check: ✅ Passed. The description details the new model test command, enhanced token-usage tracking, new model support, fixes, and other improvements, staying consistent with the change set and clearly presented.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several valuable enhancements, including a new model testing command and improved token usage tracking across various adapters. The addition of new models and the bug fix for the OpenAI tools agent parser are also great improvements.

My review focuses on improving code robustness and maintainability. I've identified a potential runtime error in the OpenAI output parser due to unsafe JSON parsing and suggested a safer approach. I've also pointed out opportunities for refactoring to reduce code duplication in the new model testing middleware and to strengthen a type guard in the Gemini adapter.

Overall, this is a solid contribution that significantly improves the project's functionality and observability. The changes are well-structured and the new features are thoughtfully implemented.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (9)
packages/core/src/llm-core/agent/openai/output_parser.ts (3)

154-176: Prefer parsing from additional_kwargs.tool_calls: further hardening (blank arguments, per-item errors)

The direction is right. A few small improvements would reduce parsing noise and make failures easier to pinpoint:

  • Trim arguments before JSON.parse, so a whitespace-only string does not throw;
  • Move the try/catch inside the map and include the failing index and function name, for faster diagnosis;
  • In the log message, fall back to "{}" whenever arguments is empty or whitespace-only.

The mapping body could be adjusted as follows:

-            return toolCalls.map((toolCall, i) => {
-                    const toolInput = toolCall.function.arguments
-                        ? JSON.parse(toolCall.function.arguments)
-                        : {}
-                    const messageLog = i === 0 ? [message] : []
-                    return {
-                        tool: toolCall.function.name as string,
-                        toolInput,
-                        toolCallId: toolCall.id,
-                        log:
-                            message.content?.length > 0
-                                ? (message.content as string)
-                                : `Invoking "${toolCall.function.name}" with ${
-                                      toolCall.function.arguments ?? '{}'
-                                  }`,
-                        messageLog
-                    }
-                })
+            return toolCalls.map((toolCall, i) => {
+                try {
+                    const rawArgs = toolCall.function?.arguments
+                    const toolInput =
+                        rawArgs && rawArgs.trim().length > 0
+                            ? JSON.parse(rawArgs)
+                            : {}
+                    const messageLog = i === 0 ? [message] : []
+                    return {
+                        tool: toolCall.function?.name as string,
+                        toolInput,
+                        toolCallId: toolCall.id,
+                        log:
+                            message.content?.length > 0
+                                ? (message.content as string)
+                                : `Invoking "${toolCall.function?.name}" with ${
+                                      rawArgs && rawArgs.trim().length > 0 ? rawArgs : '{}'
+                                  }`,
+                        messageLog
+                    }
+                } catch (e) {
+                    throw new OutputParserException(
+                        `Failed to parse tool arguments for call[${i}] "${toolCall?.function?.name}". ${e}`
+                    )
+                }
+            })

159-160: Redundant type annotation; can be simplified

When the left-hand side already declares an explicit type, the as ChatCompletionMessageToolCall[] assertion on the right is redundant. Keep one or the other to cut the duplication.

-            const toolCalls: ChatCompletionMessageToolCall[] = message
-                .additional_kwargs.tool_calls as ChatCompletionMessageToolCall[]
+            const toolCalls = message
+                .additional_kwargs.tool_calls as ChatCompletionMessageToolCall[]

201-209: Add a compatibility fallback for toolCallId

Currently only toolCall.id is used. Some legacy or streaming messages may carry only tool_call_id, or no ID at all, leaving downstream code unable to correlate tool-call results. Suggested change:

-    toolCallId: toolCall.id /* ?? `tool_call_${i}` */,
+    toolCallId:
+        (toolCall as any).id ??
+        (toolCall as any).tool_call_id ??
+        `tool_call_${i}`,
packages/core/src/llm-core/agent/executor.ts (1)

628-634: Avoid serializing very large objects in debug logs (performance/privacy)

While relaxing the condition here, the change also serializes the full output object. JSON.stringify runs before the debug-level check, which can add overhead and leak sensitive content. Consider truncating or deferring the serialization.

-            if (newSteps.length === 0 || lastStep == null) {
-                logger.debug(
-                    'last Step:',
-                    JSON.stringify(lastStep),
-                    'output',
-                    JSON.stringify(output)
-                )
-            }
+            if (newSteps.length === 0 || lastStep == null) {
+                const lastStepStr = JSON.stringify(lastStep)?.slice(0, 2000)
+                const outputStr = JSON.stringify(output)?.slice(0, 2000)
+                logger.debug('last Step:', lastStepStr, 'output', outputStr)
+            }
packages/adapter-qwen/src/requester.ts (2)

34-36: Standardize logger usage and drop the global logger dependency

This class already provides this.logger. Remove the global logger import so log prefixes stay consistent.

-import { AIMessageChunk } from '@langchain/core/messages'
-import { logger } from 'koishi-plugin-chatluna'
+import { AIMessageChunk } from '@langchain/core/messages'

203-210: Use the instance-level logger

For consistency with the rest of the file, switch to this.logger.debug.

-                logger.debug(
+                this.logger.debug(
                     'reasoningContent: ' +
                         reasoningContent +
                         ', reasoningTime: ' +
                         reasoningTime / 1000 +
                         's'
                 )
packages/core/src/llm-core/platform/model.ts (1)

253-261: Extract a shared token-usage logging method (deduplicates, more robust)

The log format here duplicates the non-streaming path (lines 296-301). Extract a private method and use optional chaining to avoid hard-coded key names and repeated code.

Example:

private logTokenUsage(tu?: { promptTokens?: number; completionTokens?: number; totalTokens?: number }) {
  if (!tu?.totalTokens) return
  logger.debug(
    'Token usage from API: Prompt Token = %d, Completion Token = %d, Total Token = %d',
    tu.promptTokens ?? 0, tu.completionTokens ?? 0, tu.totalTokens ?? 0
  )
}
packages/core/src/middlewares/model/test_model.ts (2)

24-53: Consider removing the leading semicolon.

The leading semicolon on line 52 is syntactically valid (it guards against automatic semicolon insertion) but unnecessary here, since the preceding statement is clearly terminated.

                 const selectedModel = platformModels.value[randomIndex]
                 platformName = model
                 modelName = selectedModel.name
             } else {
                 // Parse the full model name
-                ;[platformName, modelName] = parseRawModelName(model)
+                [platformName, modelName] = parseRawModelName(model)
             }

82-112: Consider raising the test's maxTokens limit.

The test call uses maxTokens: 10, which may be too low for some models to produce a meaningful response. That is probably enough for an availability test, but raising the value slightly (e.g., 20-50) could give more reliable results.

-                    const response = await chatModel.invoke('Hello', {
-                        maxTokens: 10,
+                    const response = await chatModel.invoke('Hello', {
+                        maxTokens: 20,
                         signal: AbortSignal.timeout(60000)
                     })
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c01b60 and 74c7b47.

⛔ Files ignored due to path filters (2)
  • packages/core/src/locales/en-US.yml is excluded by !**/*.yml
  • packages/core/src/locales/zh-CN.yml is excluded by !**/*.yml
📒 Files selected for processing (13)
  • packages/adapter-claude/src/client.ts (1 hunks)
  • packages/adapter-gemini/src/requester.ts (3 hunks)
  • packages/adapter-gemini/src/types.ts (1 hunks)
  • packages/adapter-qwen/src/requester.ts (3 hunks)
  • packages/adapter-zhipu/src/client.ts (1 hunks)
  • packages/adapter-zhipu/src/requester.ts (2 hunks)
  • packages/core/src/commands/model.ts (1 hunks)
  • packages/core/src/llm-core/agent/executor.ts (1 hunks)
  • packages/core/src/llm-core/agent/openai/output_parser.ts (2 hunks)
  • packages/core/src/llm-core/platform/model.ts (3 hunks)
  • packages/core/src/middleware.ts (2 hunks)
  • packages/core/src/middlewares/model/test_model.ts (1 hunks)
  • packages/shared-adapter/src/requester.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
packages/core/src/commands/model.ts (1)
packages/core/src/llm-core/chain/plugin_chat_chain.ts (1)
  • model (272-274)
packages/core/src/llm-core/platform/model.ts (1)
packages/core/src/index.ts (1)
  • logger (38-38)
packages/adapter-qwen/src/requester.ts (3)
packages/adapter-gemini/src/requester.ts (1)
  • logger (737-739)
packages/adapter-zhipu/src/client.ts (1)
  • logger (29-31)
packages/adapter-zhipu/src/requester.ts (1)
  • logger (236-238)
packages/core/src/llm-core/agent/openai/output_parser.ts (1)
packages/core/src/llm-core/agent/types.ts (1)
  • ChatCompletionMessageToolCall (6-21)
packages/core/src/middlewares/model/test_model.ts (3)
packages/core/src/commands/model.ts (1)
  • apply (5-38)
packages/core/src/chains/chain.ts (1)
  • ChatChain (14-366)
packages/core/src/llm-core/utils/count_tokens.ts (1)
  • parseRawModelName (195-205)
packages/adapter-gemini/src/requester.ts (1)
packages/adapter-gemini/src/types.ts (1)
  • ChatUsageMetadataPart (20-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (15)
packages/adapter-qwen/src/requester.ts (1)

142-155: Emitting the usage chunk early in the stream: implementation is correct

Each provider usage event independently yields an empty-message ChatGenerationChunk with tokenUsage populated, which makes upstream accounting straightforward. Consistent with the other adapters, LGTM.

packages/shared-adapter/src/requester.ts (1)

153-166: Emitting usage early and skipping that round's delta parsing: LGTM

The tokenUsage chunk is emitted up front, and continue skips parsing for that round, avoiding duplication. Consistent with how usage is returned on the non-streaming path.

packages/core/src/llm-core/platform/model.ts (3)

199-207: latestTokenUsage cache initialization: LGTM

Sets up the placeholder for aggregating streaming usage, used later for unified log output.


235-245: Aggregating tokenUsage per chunk: LGTM

Values are synced from generationInfo.tokenUsage, keeping usage events from multiple providers compatible.


296-301: Usage logging on the non-streaming path: LGTM

Logs directly when the API already returns usage, consistent with the streaming path.
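To make the aggregation concrete, here is a condensed sketch of the streaming path in ChatLunaChatModel as described above (latestTokenUsage and generationInfo.tokenUsage come from the change itself; the stream and logger wiring are simplified assumptions):

let latestTokenUsage = { promptTokens: 0, completionTokens: 0, totalTokens: 0 }

for await (const chunk of stream) {
    const usage = chunk.generationInfo?.tokenUsage
    if (usage) {
        // Providers may report usage once at the end or incrementally;
        // keep the most recent values either way.
        latestTokenUsage = {
            promptTokens: usage.promptTokens ?? latestTokenUsage.promptTokens,
            completionTokens:
                usage.completionTokens ?? latestTokenUsage.completionTokens,
            totalTokens: usage.totalTokens ?? latestTokenUsage.totalTokens
        }
    }
    // ...accumulate message content as before...
}

if (latestTokenUsage.totalTokens > 0) {
    logger.debug(
        'Token usage from API: Prompt Token = %d, Completion Token = %d, Total Token = %d',
        latestTokenUsage.promptTokens,
        latestTokenUsage.completionTokens,
        latestTokenUsage.totalTokens
    )
}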

packages/adapter-claude/src/client.ts (1)

44-45: Confirm 'claude-sonnet-4-5-20250929' is the official model ID
Confirmed: this ID is the vendor-specific identifier for the Sonnet 4.5 release, with a standard 200K-token context window (optionally extendable to 1M in beta).

packages/core/src/middleware.ts (1)

33-33: LGTM!

The new test_model middleware import and registration follow the existing pattern and are implemented correctly.

Also applies to: 98-98

packages/core/src/commands/model.ts (1)

31-37: LGTM!

The new chatluna.model.test command is implemented correctly, matches the existing command patterns, and properly forwards the model argument to the middleware chain.
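For orientation, the registration plausibly follows the pattern of the other commands in packages/core/src/commands/model.ts, roughly like this (a sketch: ctx.command is the standard Koishi API, but the chain-forwarding call and its signature are assumptions here):

ctx.command('chatluna.model.test <model:string>')
    .action(async ({ session }, model) => {
        // Forward into the middleware chain; test_model picks up the
        // `model` option from the chain context (hypothetical helper name).
        await chain.receiveCommand(session, 'test_model', { model })
    })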

packages/adapter-zhipu/src/requester.ts (1)

137-150: Validate that the token-usage fields exist.

The code accesses data.usage.prompt_tokens, data.usage.completion_tokens, and data.usage.total_tokens directly, without checking that the fields exist. If the API response omits these fields or returns null, this could fail at runtime.

Please confirm whether the Zhipu API guarantees these fields are always present and non-null whenever data.usage exists. If not, add optional chaining or field validation:

 if (data.usage) {
     yield new ChatGenerationChunk({
         message: new AIMessageChunk(''),
         text: '',
         generationInfo: {
             tokenUsage: {
-                promptTokens: data.usage.prompt_tokens,
-                completionTokens: data.usage.completion_tokens,
-                totalTokens: data.usage.total_tokens
+                promptTokens: data.usage.prompt_tokens ?? 0,
+                completionTokens: data.usage.completion_tokens ?? 0,
+                totalTokens: data.usage.total_tokens ?? 0
             }
         }
     })
     continue
 }
packages/adapter-gemini/src/types.ts (1)

12-13: LGTM!

The new ChatUsageMetadataPart type is clearly defined and correctly integrated into the ChatPart union type, giving token-usage metadata type-safe support.

Also applies to: 20-26

packages/adapter-gemini/src/requester.ts (2)

415-432: Verify edge cases in the token computation.

When candidatesTokenCount is undefined, the code falls back to totalTokens - promptTokens for completionTokens. If totalTokens <= promptTokens, this can yield zero or a negative number, which may break downstream logic that expects positive values.

Please confirm whether the Gemini API guarantees totalTokens >= promptTokens. If not, add a bounds check:

 const completionTokens =
     transformValue.usageMetadata.candidatesTokenCount ??
-    totalTokens - promptTokens
+    Math.max(0, totalTokens - promptTokens)
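For context, the computation under discussion roughly looks like this (a sketch; promptTokenCount, candidatesTokenCount, and totalTokenCount are the usageMetadata field names from the public Gemini API, and the Math.max guard is the suggested fix):

const meta = transformValue.usageMetadata
const promptTokens = meta.promptTokenCount ?? 0
const totalTokens = meta.totalTokenCount ?? 0
// Fall back to the difference when candidatesTokenCount is absent,
// clamped at 0 so a provider quirk cannot produce a negative count.
const completionTokens =
    meta.candidatesTokenCount ?? Math.max(0, totalTokens - promptTokens)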

492-508: LGTM!

The token-usage detection and generation-chunk emission are implemented correctly. The type guard ensures type safety, and the usage information is properly propagated into generationInfo.

packages/core/src/middlewares/model/test_model.ts (2)

8-133: Excellent overall implementation!

The test_model middleware has several strengths:

  • Thorough error handling, covering missing platforms, unavailable models, failed invocations, and more
  • Sensible timeouts (10s waiting for the platform to load, 60s for the model invocation)
  • Clear user feedback, including the response time and a content preview
  • Support for both platform-name and full-model-name input formats
  • Appropriate lifecycle hook configuration

135-143: LGTM!

The TypeScript module augmentation correctly declares the new middleware name and context option, ensuring type safety.

packages/adapter-zhipu/src/client.ts (1)

77-77: Confirm GLM-4.6's token limit is correct
GLM-4.6 supports a 200,000-token context length, consistent with the configuration in the code.

…nd remove knowledge plugin

- Fix Claude adapter to use correct model info instead of defaulting to first model
- Refactor test_model middleware to eliminate responseTime calculation duplication
- Remove deprecated knowledge plugin and related configuration
- Clean up plugin imports and configuration schema

Breaking Changes:
- Removed knowledge plugin functionality from extension-tools
- Removed knowledgeId, knowledgeSelector configuration options

The knowledge plugin has been removed as it's being replaced by dedicated
retriever implementations in other packages.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
packages/core/src/middlewares/model/test_model.ts (3)

14-17: Consider validating the model argument.

The code destructures model directly without verifying that it exists or is non-empty. The outer try-catch (line 122) will catch the resulting errors, but an explicit up-front check with a clearer error message would be better.

Validation could be added after line 17:

 const {
     command,
     options: { model }
 } = context

+if (!model || model.trim().length === 0) {
+    context.message = session.text('.model_required')
+    return ChainMiddlewareRunStatus.STOP
+}
+
 if (command !== 'test_model')
     return ChainMiddlewareRunStatus.SKIPPED

72-74: The type assertion may be risky.

The as ChatLunaChatModel assertion assumes the platform returns the correct model type. That is probably safe in this context, but an incompatible model type returned by the platform could cause runtime errors.

Consider adding a type check or a safer type guard:

 // Create the model
-const chatModel = client.value.createModel(
-    modelName
-) as ChatLunaChatModel
+const model = client.value.createModel(modelName)
+const chatModel = model instanceof ChatLunaChatModel ? model : null

-if (!chatModel) {
+if (!chatModel || !(chatModel instanceof ChatLunaChatModel)) {
     context.message = session.text('.model_not_found', [
         `${platformName}/${modelName}`
     ])

Alternatively, if the type assertion is the intended behavior, the current code can stay, but the assumption is worth documenting.


13-129: Consider refactoring to reduce the function's complexity.

The middleware function is long (117 lines) and contains several logical branches. It works correctly, but extracting helper functions would improve maintainability:

  1. Extract the model-name parsing logic (lines 27-54)
  2. Extract the client-acquisition logic (lines 57-69)
  3. Extract the model-testing logic (lines 83-121)

This lowers cognitive complexity and makes the code easier to test and maintain.

An example refactoring structure:

async function parseModelInput(
    model: string,
    services: PlatformService
): Promise<{ platformName: string; modelName: string } | null> {
    // extract the logic from lines 27-54
}

async function getOrWaitForClient(
    ctx: Context,
    services: PlatformService,
    platformName: string
): Promise<PlatformClient | null> {
    // extract the logic from lines 57-69
}

async function testModel(
    chatModel: ChatLunaChatModel,
    platformName: string,
    modelName: string
): Promise<TestResult> {
    // extract the logic from lines 83-121
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74c7b47 and d8b644b.

⛔ Files ignored due to path filters (2)
  • packages/extension-tools/src/locales/en-US.schema.yml is excluded by !**/*.yml
  • packages/extension-tools/src/locales/zh-CN.schema.yml is excluded by !**/*.yml
📒 Files selected for processing (5)
  • packages/adapter-claude/src/client.ts (2 hunks)
  • packages/core/src/middlewares/model/test_model.ts (1 hunks)
  • packages/extension-tools/src/config.ts (0 hunks)
  • packages/extension-tools/src/plugin.ts (1 hunks)
  • packages/extension-tools/src/plugins/knowledge.ts (0 hunks)
💤 Files with no reviewable changes (2)
  • packages/extension-tools/src/config.ts
  • packages/extension-tools/src/plugins/knowledge.ts
🧰 Additional context used
🧬 Code graph analysis (2)
packages/extension-tools/src/plugin.ts (1)
packages/core/src/command.ts (1)
  • command (13-27)
packages/core/src/middlewares/model/test_model.ts (3)
packages/core/src/commands/model.ts (1)
  • apply (5-38)
packages/core/src/chains/chain.ts (1)
  • ChatChain (14-366)
packages/core/src/llm-core/utils/count_tokens.ts (1)
  • parseRawModelName (195-205)
🪛 GitHub Check: CodeFactor
packages/core/src/middlewares/model/test_model.ts

[notice] 13-141: packages/core/src/middlewares/model/test_model.ts#L13-L141
Complex Method

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: lint
🔇 Additional comments (3)
packages/core/src/middlewares/model/test_model.ts (1)

92-121: Confirmed: the duplicated responseTime calculation is resolved.

The implementation now computes responseTime once, after the try-catch block (lines 101-102), avoiding the code duplication flagged in the earlier review. The logic is clear and sound.
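In outline, the endorsed pattern is (a sketch; variable names are illustrative):

const start = Date.now()
let response: Awaited<ReturnType<typeof chatModel.invoke>> | undefined
let error: unknown

try {
    response = await chatModel.invoke('Hello', {
        maxTokens: 10,
        signal: AbortSignal.timeout(60000)
    })
} catch (e) {
    error = e
}

// Computed exactly once, shared by the success and failure paths.
const responseTime = Date.now() - start
// ...report success (responseTime plus a content preview) or the error...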

packages/adapter-claude/src/client.ts (2)

61-76: The modelInfo usage issue is fixed correctly

Line 65 now uses the current model's info (obtained on line 62) rather than always using the first modelInfo entry, the problem flagged in the earlier review. This ensures:

  • modelInfo matches the model actually selected
  • maxTokenLimit and modelMaxContextSize on lines 67-70 are computed correctly from the current model's token limit

This change resolves the key issue from the previous review.


44-51: Confirm the Claude models' token limits and update the docs

  • Line 44: adding claude-sonnet-4-5-20250929 is reasonable and matches the PR's goals.
  • Line 51: maxTokens is now unified at 200,000 for all models, consistent with the official GA limit. Note that Claude Sonnet 4.5 supports up to 1,000,000 tokens in preview, but the GA limit is 200,000.
  • Consider calling out this breaking change in the changelog or migration guide and reminding users to adjust their context usage.

…nowledge plugin dependency

- Add validation for model format in test_model command
- Add error message for invalid model format (platform/model)
- Remove knowledge plugin dependency from extension-tools
- Update locales with new invalid_model_format message

This commit improves error handling in the model test command by validating
the parsed model name format and providing clear error messages when the
format is incorrect. Additionally, it cleans up the extension-tools package
by removing the optional knowledge plugin dependency.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/core/src/middlewares/model/test_model.ts (2)

119-124: Consider more defensive content handling.

Although toString() copes with many content types, response.content can be an array or a complex object for some models. Adding a type check would improve robustness.

This improvement could be applied:

                } else if (response && response.content) {
+                    const contentStr = typeof response.content === 'string' 
+                        ? response.content 
+                        : response.content.toString()
                    context.message = session.text('.test_success', [
                        `${platformName}/${modelName}`,
                        responseTime.toString(),
-                        response.content.toString().substring(0, 50)
+                        contentStr.substring(0, 50)
                    ])

13-141: Consider extracting helper functions to reduce complexity.

Static analysis flags this method as a "Complex Method" (128 lines). The current implementation is clear and maintainable, but splitting it into smaller helpers would improve testability and readability.

Helpers worth extracting:

  • resolvePlatformAndModel() - resolve the platform and model names (lines 25-63)
  • acquireClient() - acquire the platform client (lines 65-78)
  • testModelRequest() - run the model test and measure the response time (lines 92-130)

This keeps the main function lean while making the code easier to test.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8b644b and 7b4916d.

⛔ Files ignored due to path filters (3)
  • packages/core/src/locales/en-US.yml is excluded by !**/*.yml
  • packages/core/src/locales/zh-CN.yml is excluded by !**/*.yml
  • packages/extension-tools/package.json is excluded by !**/*.json
📒 Files selected for processing (2)
  • packages/core/src/middlewares/model/test_model.ts (1 hunks)
  • packages/extension-tools/src/config.ts (0 hunks)
💤 Files with no reviewable changes (1)
  • packages/extension-tools/src/config.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/core/src/middlewares/model/test_model.ts (2)
packages/core/src/commands/model.ts (1)
  • apply (5-38)
packages/core/src/llm-core/utils/count_tokens.ts (1)
  • parseRawModelName (195-205)
🪛 GitHub Check: CodeFactor
packages/core/src/middlewares/model/test_model.ts

[notice] 13-141: packages/core/src/middlewares/model/test_model.ts#L13-L141
Complex Method

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: lint
  • GitHub Check: build
🔇 Additional comments (3)
packages/core/src/middlewares/model/test_model.ts (3)

55-62: The validation logic is implemented correctly.

The earlier review feedback has been addressed: the code now validates the return value of parseRawModelName, eliminating the risk of working with undefined values.


97-111: The response-time calculation has been streamlined.

The previously flagged duplication of the responseTime computation is resolved. The current implementation performs the timing once, after the try-catch block, removing the repetition and improving maintainability.


139-151: Lifecycle hooks and type declarations are configured correctly.

The middleware's lifecycle hooks are placed appropriately, and the type declarations correctly extend the chain middleware interface. The implementation follows the framework's conventions.
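The augmentation in question plausibly looks like the following (a sketch; the two interface names mirror the chain middleware declarations referenced in this review, while the module path is an assumption):

// Sketch of the module augmentation at the bottom of test_model.ts.
declare module '../../chains/chain' {
    interface ChainMiddlewareName {
        test_model: never
    }

    interface ChainMiddlewareContextOptions {
        model?: string
    }
}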

- Bump core package version from 1.3.0-alpha.55 to 1.3.0-alpha.56
- Bump shared-adapter version from 1.0.12 to 1.0.13
- Update peer dependency references across all adapter packages
- Update peer dependency references across all extension packages
- Update peer dependency references across all service packages

This release includes model validation improvements and knowledge plugin cleanup.
@dingyi222666 dingyi222666 merged commit b8406fb into v1-dev Oct 10, 2025
2 of 3 checks passed
@dingyi222666 dingyi222666 deleted the feat/core-commands branch October 10, 2025 19:15
