[Feature] support model thinking levels and dynamic model fetching #665
Conversation
Add support for new Zhipu AI model variants:
- GLM-4.6V (128k context)
- GLM-4.6V-FlashX (128k context)
- GLM-4.6V-Flash (128k context)
- GLM-4.7 (200k context)
Add support for fetching available Claude models dynamically from the Anthropic /v1/models API endpoint with pagination support. The adapter now attempts to fetch the latest model list from the API and falls back to a built-in model list if the API call fails.

Changes:
- Add listModels() method to ClaudeRequester with pagination support
- Implement ClaudeListModelsResponse and ClaudeModelInfo types
- Update refreshModels() to fetch models from API with fallback
- Add error handling and logging for API failures
- Preserve API ordering (newer models first)
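As an illustration of the flow described above, here is a minimal sketch of the pagination loop, assuming a generic `fetchJson` HTTP helper and response types shaped like the Anthropic /v1/models payload; the adapter's actual code may differ:

```typescript
// Hypothetical sketch of paginated model listing against /v1/models.
// `fetchJson` stands in for the adapter's real HTTP helper.
interface ClaudeModelInfo {
    id: string
    display_name?: string
}

interface ClaudeListModelsResponse {
    data: ClaudeModelInfo[]
    has_more: boolean
    last_id?: string
}

async function listAllModelIds(
    fetchJson: (url: string) => Promise<ClaudeListModelsResponse>
): Promise<string[]> {
    const ids: string[] = []
    let afterId: string | undefined

    // Page through /v1/models until has_more is false, preserving
    // API ordering (newer models first).
    while (true) {
        const query = new URLSearchParams({ limit: '100' })
        if (afterId) query.set('after_id', afterId)
        const resp = await fetchJson(`/v1/models?${query.toString()}`)
        for (const item of resp.data ?? []) {
            if (item?.id) ids.push(item.id)
        }
        if (!resp.has_more || !resp.last_id) break
        afterId = resp.last_id
    }
    return ids
}
```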
…mal-thinking support

Refactor the refreshModels() method to use a cleaner, more maintainable approach for generating model variants. Add support for the new 'minimal-thinking' level for Gemini 3 models.

Changes:
- Refactor model variant generation logic with helper functions
- Add 'gemini-3-flash' to thinking model list
- Add 'minimal-thinking' suffix support for Gemini 3 models
- Differentiate thinking level suffixes between 3-pro and 3-flash
- Update thinking level detection to handle 'minimal' mode
- Improve code readability and reduce duplication
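A rough sketch of what the suffix-based variant expansion could look like; the `includesAny` helper name and the per-model suffix lists are taken from the review discussion below, and the rest is assumed:

```typescript
// Sketch of Gemini thinking-variant expansion (assumed shapes).
const includesAny = (name: string, parts: string[]): boolean =>
    parts.some((part) => name.includes(part))

function expandThinkingVariants(model: string): string[] {
    const lower = model.toLowerCase()
    if (!includesAny(lower, ['gemini-3-pro', 'gemini-3-flash'])) {
        return [model]
    }
    // Per the review: 3-pro omits '-medium-thinking', 3-flash keeps it.
    const suffixes = lower.includes('3-pro')
        ? ['-low-thinking', '-high-thinking', '-minimal-thinking']
        : [
              '-low-thinking',
              '-medium-thinking',
              '-high-thinking',
              '-minimal-thinking'
          ]
    // Suffixed variants first, base model last (matches pushExpanded).
    return [...suffixes.map((s) => `${model}${s}`), model]
}
```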
…el normalization

- Implement `parseOpenAIModelNameWithReasoningEffort` to extract reasoning levels from model names (e.g., `-thinking`, `-minimal-thinking`).
- Add `normalizeOpenAIModelName` to strip thinking suffixes for compatibility.
- Update `buildChatCompletionParams` to pass `reasoning_effort` and use normalized model names.
- Ensure reasoning models (o1, o3, etc.) correctly bypass temperature and penalty parameters.
- Update image input support and context size lookups to use normalized model names.
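A simplified sketch of how such parsing could work; the regex and mapping follow the suffix rules summarized later in this review (`-tiny-thinking` → minimal, `-non-thinking` → none, bare `-thinking` → medium), while the function names are illustrative rather than copied from the PR:

```typescript
// Sketch of reasoning-effort parsing from model-name suffixes.
type OpenAIReasoningEffort =
    | 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'

function parseReasoningEffort(
    model: string
): { model: string; effort?: OpenAIReasoningEffort } {
    const match = model.match(
        /-(none|non|minimal|tiny|low|medium|high|xhigh)?-?thinking$/
    )
    if (!match) return { model }
    const base = model.slice(0, match.index)
    const level = match[1]
    // '-non-thinking' maps to none, '-tiny-thinking' to minimal,
    // and a bare '-thinking' defaults to medium.
    if (level === 'non') return { model: base, effort: 'none' }
    if (level === 'tiny') return { model: base, effort: 'minimal' }
    if (!level) return { model: base, effort: 'medium' }
    return { model: base, effort: level as OpenAIReasoningEffort }
}

const normalizeModelName = (model: string): string =>
    parseReasoningEffort(model).model
```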
…enAI-style reasoning models

- In `getModels`, automatically expand detected reasoning models (o1, o3, o4, gpt-5) with thinking level suffixes (`-low-thinking`, `-high-thinking`, etc.).
- These virtual variants allow users to select reasoning effort levels via the model name.
- Fix minor indentation and formatting in `GeminiClient` and `prepareModelConfig`.
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR introduces dynamic model discovery and model-name variant generation: Claude discovers models through paginated /v1/models calls and falls back to a built-in list; Gemini and OpenAI-related logic generates and normalizes multiple thinking variants; Zhipu gains additional GLM-series variants; and the related types and requester interfaces are extended to support the new flows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Claude Client
    participant Requester as ClaudeRequester
    participant API as Claude /v1/models
    rect rgb(230, 248, 255)
        Note right of Client: refreshModels()
    end
    Client->>Requester: listModels(afterId?, limit=100)
    alt paginated results
        loop while has_more
            Requester->>API: GET /v1/models?after_id=last_id&limit=100
            API-->>Requester: 200 { data[], has_more, last_id }
        end
        Requester->>Client: ClaudeListModelsResponse (aggregated)
        Client->>Client: map ids -> ModelInfo (maxTokens=200000, capabilities)
    else API error or empty
        Requester--xClient: throw / return error
        Client->>Client: log warning, use built-in fallback list
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Summary of Changes

Hello @dingyi222666, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the model interaction capabilities across various adapters by integrating advanced reasoning effort configurations and dynamic model discovery. It enables users to fine-tune model behavior through virtual model names, ensures the Claude adapter stays current with the latest Anthropic models, and improves overall model handling and compatibility across the platform.
Code Review
This pull request introduces dynamic model listing for Claude by adding a new API endpoint (/v1/models) to fetch available models, with a fallback to a hardcoded list if the API call fails. It also refactors the Gemini adapter's refreshModels logic for better model expansion and adds minimal-thinking and medium-thinking suffixes for Gemini-3 models. The Zhipu adapter updates its hardcoded model list with new GLM-4.6V and GLM-4.7 models.

Furthermore, the shared adapter gains new utilities to parse and normalize OpenAI-style model names that include 'reasoning effort' suffixes (e.g., -minimal-thinking), applying this normalization to context size calculation, image input support, and chat completion parameter building, including adding a reasoning_effort parameter to requests. The getModels function in the shared adapter is enhanced to generate these virtual model names for OpenAI-style reasoning models.

A review comment suggests simplifying the error handling in the Claude requester's listModels method by using the cause option in the Error constructor for cleaner error chaining and improved type safety.
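For reference, the `cause` option mentioned in that last suggestion is standard ES2022; a minimal sketch of the pattern (the wrapper function and message text are illustrative):

```typescript
// Chain the original error instead of copying stack/cause by hand.
async function listModelsSafely(get: () => Promise<unknown>) {
    try {
        return await get()
    } catch (e) {
        // `cause` preserves the full original error for debugging.
        throw new Error('Error when listing models', { cause: e })
    }
}
```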
- Wrap long line in `overrideAdditionalArgs` call for better readability.
Actionable comments posted: 2
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/adapter-gemini/src/utils.ts (1)
413-423: Fix API compatibility issues with the thinking level parameter. Lines 413-422 have several serious problems that will cause API calls to fail:

- Case mismatch: Line 422 assigns the captured lowercase string ('low', 'medium', etc.) directly to thinkingLevel, but the Gemini 3 Pro API only accepts uppercase enum values ('LOW', 'HIGH'). Line 394 initializes with 'THINKING_LEVEL_UNSPECIFIED' (uppercase), so casing is used inconsistently.
- Invalid thinking level values: Gemini 3 Pro only supports LOW and HIGH, but the regex captures 'medium', 'tiny', and 'minimal'. 'tiny' is not a valid value in any Gemini API documentation, and 'minimal' is invalid for Gemini 3 Pro; the special-case handling on lines 418-420 effectively hides this problem.
- The special case is a workaround: lines 418-420 set thinkingLevel to undefined for the 'minimal-3-pro' combination, compensating for the API not supporting 'minimal', yet the UI in client.ts still offers users the '-minimal-thinking' suffix option.
- Missing model-specific value validation: nothing checks whether the captured level is valid for the target model.

Fixes needed:

- Line 422 should convert the captured level to the uppercase enum value, e.g. thinkingLevel = level.toUpperCase(), or maintain a model-specific map of valid values.
- Remove 'medium' and 'tiny' from the regex (invalid for 3-pro), or add model-check logic.
- Remove the 'minimal-3-pro' workaround and do proper model-specific filtering at the regex or value-conversion stage.
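A sketch of what the first suggested fix could look like, assuming the per-model valid levels stated above (LOW/HIGH for 3-pro); the map, model keys, and function name are hypothetical:

```typescript
// Sketch of one possible fix: map captured levels to uppercase
// enum values per model, rejecting unsupported ones.
const VALID_LEVELS: Record<string, string[]> = {
    'gemini-3-pro': ['LOW', 'HIGH'],
    'gemini-3-flash': ['LOW', 'MEDIUM', 'HIGH']
}

function toThinkingLevel(
    model: string,
    captured: string
): string | undefined {
    const upper = captured.toUpperCase()
    const valid = Object.entries(VALID_LEVELS).find(([key]) =>
        model.includes(key)
    )?.[1]
    // Drop levels the target model does not accept instead of
    // sending an invalid enum value to the API.
    return valid?.includes(upper) ? upper : undefined
}
```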
🧹 Nitpick comments (3)
packages/adapter-claude/src/requester.ts (1)
176-185: Error handling can be improved: the original error message is lost. When this.get(url, headers) throws a network error, raw is undefined and the original error message is not included in the new error. Consider preserving the original error information to make debugging easier.

🔎 Suggested improvement:

```diff
 } catch (e) {
+    const originalMessage = (e as Error)?.message ?? String(e)
     const error = new Error(
-        'Error when listing models, Response: ' + JSON.stringify(raw)
+        `Error when listing models: ${originalMessage}` +
+            (raw !== undefined ? `, Response: ${JSON.stringify(raw)}` : '')
     )
     // eslint-disable-next-line @typescript-eslint/no-explicit-any
     ;(error as any).stack = (e as any)?.stack
     // eslint-disable-next-line @typescript-eslint/no-explicit-any
     ;(error as any).cause = (e as any)?.cause
     throw error
 }
```

packages/adapter-claude/src/client.ts (1)
57-69: The pagination logic is correct; consider adding a safety limit. The implementation correctly uses has_more and last_id as a cursor. Although Anthropic's model list is unlikely to be unbounded, a maximum iteration count could be added as a safety measure.

🔎 Optional safety limit:

```diff
 const modelIds: string[] = []
 let afterId: string | undefined
+const MAX_PAGES = 50 // Safety limit

 // Page through /v1/models until has_more is false.
-while (true) {
+for (let page = 0; page < MAX_PAGES; page++) {
     const resp = await this._requester.listModels({ afterId, limit: 100 })
     for (const item of resp.data ?? []) {
         if (item?.id) modelIds.push(item.id)
     }
     if (!resp.has_more || !resp.last_id) break
     afterId = resp.last_id
 }
```

packages/shared-adapter/src/requester.ts (1)
470-485: Consider removing the redundant model addition. The push(`${model}`) on line 479 is redundant because line 471 already adds the original model via push(model). The Set-based deduplication makes it harmless, but removing it would improve clarity.

🔎 Suggested cleanup:

```diff
 for (const model of rawModels) {
     push(model)

     if (!isOpenAIReasoningModel(model)) continue
     if (hasThinkingTag(model)) continue

     // OpenAI-style "thinking" via model suffixes. These are virtual
     // variants that map to request params (e.g. reasoning_effort).
     push(`${model}-non-thinking`)
-    push(`${model}`)
     push(`${model}-minimal-thinking`)
     push(`${model}-low-thinking`)
     push(`${model}-medium-thinking`)
     push(`${model}-high-thinking`)
     push(`${model}-xhigh-thinking`)
 }
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- packages/adapter-claude/src/client.ts
- packages/adapter-claude/src/requester.ts
- packages/adapter-claude/src/types.ts
- packages/adapter-gemini/src/client.ts
- packages/adapter-gemini/src/utils.ts
- packages/adapter-zhipu/src/client.ts
- packages/shared-adapter/src/client.ts
- packages/shared-adapter/src/requester.ts
- packages/shared-adapter/src/utils.ts
🧰 Additional context used
🧬 Code graph analysis (6)
packages/adapter-claude/src/requester.ts (1)
- packages/adapter-claude/src/types.ts (1): ClaudeListModelsResponse (111-116)

packages/shared-adapter/src/client.ts (1)
- packages/core/src/llm-core/platform/model.ts (1): modelName (781-783)

packages/shared-adapter/src/utils.ts (1)
- packages/shared-adapter/src/client.ts (1): normalizeOpenAIModelName (46-48)

packages/adapter-claude/src/client.ts (2)
- packages/adapter-gemini/src/client.ts (1): logger (29-31)
- packages/adapter-claude/src/requester.ts (1): logger (196-198)

packages/shared-adapter/src/requester.ts (1)
- packages/shared-adapter/src/client.ts (1): parseOpenAIModelNameWithReasoningEffort (12-44)

packages/adapter-gemini/src/client.ts (3)
- packages/core/src/llm-core/platform/types.ts (1): ModelInfo (82-87)
- packages/adapter-gemini/src/types.ts (1): GeminiModelInfo (135-147)
- packages/core/src/utils/error.ts (1): ChatLunaError (13-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint
- GitHub Check: build
🔇 Additional comments (24)
packages/adapter-claude/src/types.ts (1)
104-116: LGTM! The new type definitions are well structured and match the Anthropic /v1/models API response shape. The ClaudeModelInfo and ClaudeListModelsResponse interfaces correctly mark optional fields as optional.

packages/adapter-claude/src/requester.ts (2)
17-21: LGTM! The import updates are correct, adding the ClaudeListModelsResponse type to support the model-listing feature.
140-162: LGTM! The method signature and query-parameter construction are clear and correct. The use of URLSearchParams and the handling of the optional anthropic-beta header are both idiomatic.

packages/adapter-claude/src/client.ts (4)
10-10: LGTM! Correctly imports logger to support error logging.
35-49: LGTM! The fallback model list covers the major Claude model versions and serves as a reliable backup when the API fetch fails.
71-83: LGTM! The deduplication logic uses a Set efficiently and correctly. The model mapping keeps the maxTokens and capabilities settings consistent with the fallback list.
84-100: LGTM! The error handling is graceful: it logs a warning and falls back to the built-in model list, ensuring the system keeps working even when the API is unavailable.
packages/adapter-gemini/src/client.ts (4)
22-22: LGTM! The type import is correct and gives rawModels an explicit type definition.
55-67: The helper functions are well designed. includesAny and pushExpanded are properly encapsulated. Note that pushExpanded adds the suffixed variants first and the base model last; confirm this ordering is the intended display logic.
72-86: The error handling is solid. The empty-model-list check and the exception wrapping are correct: ChatLunaError instances keep their original information, while other exceptions are uniformly wrapped as MODEL_INIT_ERROR.
91-101: Confirm that the default capability assignment is accurate. All non-embedding models are given the ImageInput and ToolCall capabilities. Please verify that every model returned by the Gemini API actually supports these, especially older or special-purpose models. If GeminiModelInfo includes capability information (such as supportedGenerationMethods), consider setting capabilities dynamically from the returned data.
packages/shared-adapter/src/requester.ts (5)
32-32: The import statement is correct, bringing in the model-name parsing function used for the subsequent reasoning-effort extraction.
51-52: The model-parsing logic is correct. Model-name parsing is introduced to extract the reasoning-effort level, and the normalized model name is used consistently when building subsequent requests.
55-60: Normalized model names are used consistently in request-parameter construction, ensuring that the base model name (with the reasoning-level suffix stripped) is passed correctly to the upstream API.
Also applies to: 70-70
81-86: The new request parameter is correct. The reasoning_effort parameter is added to support reasoning-effort control, and the other fields are set to undefined to preserve API compatibility.
94-107: The reasoning-model parameter handling is correct. OpenAI reasoning models (the o1, o3, o4, and gpt-5 series) are correctly identified, and the sampling parameters they do not support (temperature, penalties, n, top_p) are removed, avoiding API call errors.
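As a standalone sketch of that parameter-stripping behavior, with the model prefixes and removed fields taken from the comment above (the function name and shape are illustrative, not the PR's actual code):

```typescript
// Sketch of stripping sampling parameters for OpenAI-style
// reasoning models, which reject them.
function stripSamplingParams(
    model: string,
    params: Record<string, unknown>
): Record<string, unknown> {
    const isReasoning = /^(o1|o3|o4|gpt-5)/.test(model)
    if (!isReasoning) return params
    const cleaned = { ...params }
    // These parameters are not supported by reasoning models.
    for (const key of [
        'temperature',
        'presence_penalty',
        'frequency_penalty',
        'n',
        'top_p'
    ]) {
        delete cleaned[key]
    }
    return cleaned
}
```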
packages/shared-adapter/src/utils.ts (3)
28-28: The import statement is correct, bringing in the model-name normalization function used to uniformly handle model names that carry reasoning-level suffixes.
39-40: Model normalization is used correctly. Model names are normalized before feature detection, ensuring that checks for special models such as DeepSeek Think are unaffected by reasoning-level suffixes.
81-81: Normalized model names are applied consistently in feature detection. Vision-input support, model-capability detection, and related logic all use the normalized model name, ensuring that variants carrying reasoning-level suffixes match the feature rules correctly.
Also applies to: 94-98
packages/shared-adapter/src/client.ts (4)
4-10: The type definition is correct. The OpenAIReasoningEffort type covers every level from no reasoning to extra-high reasoning, consistent with the OpenAI reasoning-effort parameter specification.
12-44: The model-name parsing logic is complete. The parser correctly handles several reasoning-level suffix patterns:

- Explicit levels (-none-thinking, -minimal-thinking, etc.)
- Special handling of -tiny-thinking, mapped to minimal
- -non-thinking mapped to none
- A bare -thinking defaulting to medium

The logic is clear and matches the expected behavior.
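Assuming that mapping, a few illustrative inputs and their parsed results (using the hypothetical parser sketched earlier in this document, not the PR's actual function):

```typescript
parseReasoningEffort('gpt-5-minimal-thinking')
// -> { model: 'gpt-5', effort: 'minimal' }
parseReasoningEffort('o3-non-thinking')
// -> { model: 'o3', effort: 'none' }
parseReasoningEffort('o1-thinking')
// -> { model: 'o1', effort: 'medium' }
```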
46-48: The normalization helper is concise. It provides a convenient interface that returns the base model name with the reasoning-level suffix stripped; the API design is clean.
75-75: Model normalization is correctly applied in key functionality. When looking up a model's maximum context size and detecting image-input support, the model name is normalized first, so variants with reasoning-level suffixes resolve to the base model's configuration and capabilities.
Also applies to: 149-149
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…ecks

- Ensure `modelNameLower` is used consistently for checking existing `-thinking` suffixes.
- Fix logic error where `model.name` (un-normalized) was used instead of `modelNameLower`.
Actionable comments posted: 1
♻️ Duplicate comments (2)
packages/adapter-gemini/src/client.ts (2)
119-129: Inconsistent case handling. Line 120 uses model.name.includes('-thinking') for a case-sensitive match, while other places (such as lines 103-112) use modelNameLower for case-insensitive matching. If the API returns model names with inconsistent casing, this check may fail.

🔎 Suggested fix:

```diff
 if (isThinking) {
-    if (model.name.includes('-thinking')) {
+    if (modelNameLower.includes('-thinking')) {
         models.push(baseInfo)
     } else {
```
131-142: The same case-inconsistency issue exists here. Line 132 uses model.name.includes('3-pro'); it should be modelNameLower.includes('3-pro') for consistency.

🔎 Suggested fix:

```diff
 if (isThinkingLevel) {
-    const suffixes = model.name.includes('3-pro')
+    const suffixes = modelNameLower.includes('3-pro')
         ? ['-low-thinking', '-high-thinking', '-minimal-thinking']
         : [
             '-low-thinking',
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/adapter-gemini/src/client.ts
- packages/core/src/llm-core/chat/infinite_context.ts
✅ Files skipped from review due to trivial changes (1)
- packages/core/src/llm-core/chat/infinite_context.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-04T20:25:49.237Z
Learnt from: dingyi222666
Repo: ChatLunaLab/chatluna PR: 653
File: packages/core/src/llm-core/platform/model.ts:169-176
Timestamp: 2025-12-04T20:25:49.237Z
Learning: In packages/core/src/llm-core/platform/model.ts, the code intentionally uses unsafe nested property access (e.g., options?.variables_hide['built']['conversationId']) without full optional chaining to enable explicit TypeErrors for debugging and diagnosis when the expected structure is missing. This fail-fast approach is preferred over silently returning undefined.
Applied to files:
packages/adapter-gemini/src/client.ts
🧬 Code graph analysis (1)
packages/adapter-gemini/src/client.ts (3)
- packages/core/src/llm-core/platform/types.ts (1): ModelInfo (82-87)
- packages/adapter-gemini/src/types.ts (1): GeminiModelInfo (135-147)
- packages/core/src/utils/error.ts (1): ChatLunaError (13-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint
- GitHub Check: build
🔇 Additional comments (4)
packages/adapter-gemini/src/client.ts (4)
22-22: LGTM! Correctly introduces the GeminiModelInfo type for the subsequent model-data handling.
51-53: LGTM! The model grouping arrays are clearly organized and provide a good foundation for the variant-expansion logic that follows.
55-67: LGTM! The helper functions are implemented correctly:

- includesAny clearly checks for multiple matches
- pushExpanded correctly expands the variants while keeping the base model

As private helpers, defining them inline is reasonable.
75-86: LGTM! The error-handling improvements are sound:

- Defensive check for an empty model list
- Correctly preserves ChatLunaError and wraps other exception types

This matches the codebase's error-handling patterns.
- Use `modelNameLower` instead of `model.name` for embedding type detection.
- Remove `ImageInput` and `ToolCall` capabilities for embedding models.
This PR introduces support for thinking levels (reasoning effort) across multiple adapters and implements dynamic model fetching for the Claude adapter. It also expands the available model lists with virtual variants to allow users to easily select different reasoning configurations.
New Features
- Thinking level suffixes for reasoning models (-minimal-thinking, -low-thinking, etc.).
- -2K/-4K resolution variants for supported models.
- Dynamic fetching of Claude models from the /v1/models API.
- Support for the reasoning_effort parameter.
- New Zhipu models: GLM-4.6V, GLM-4.6V-Flash, GLM-4.7, etc.

Bug fixes
Other Changes
- Refactored model variant generation in adapter-gemini.
- New model-name normalization utilities in shared-adapter.

Close #661