[Refactor] Replace maxTokens with maxContextRatio and version updates #565
Conversation
- Pass model configuration to search and web browser tools for proper model context
- Fix typo in Chinese locale: 'mulitSourceMode' -> 'multiSourceMode'
- Add missing multiSourceMode translation to English locale
- Ensure tools receive proper model configuration for better performance

Fixes model context passing and locale consistency issues.
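A minimal sketch of the tool-side model passing, assuming the LangChain `RunnableConfig.configurable` pattern shown in the browsing sequence diagram further below; the function and parameter names here are illustrative, not ChatLuna's actual classes:

```typescript
import type { StructuredToolInterface } from '@langchain/core/tools'
import type { RunnableConfig } from '@langchain/core/runnables'

// Sketch: the chain threads its model into each tool call via
// RunnableConfig.configurable, so tools can read it back from
// config.configurable?.model. Names are placeholders.
async function browse(
    searchTool: StructuredToolInterface,
    webBrowserTool: StructuredToolInterface,
    model: unknown,
    question: string
): Promise<unknown> {
    const config: RunnableConfig = { configurable: { model } }
    const results = await searchTool.invoke(question, config)
    return await webBrowserTool.invoke(String(results), config)
}
```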
- Update default maxTokens from 4096 to 12000 for better model performance
- Affects all adapter packages: azure-openai, claude, gemini, hunyuan, ollama, openai-like, qwen, rwkv, spark, wenxin, and zhipu
- Provides more generous token limits for complex tasks and longer conversations
- Maintains existing min/max constraints for each adapter

This change improves the default user experience by allowing longer context without requiring manual configuration adjustments.
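As a hedged sketch, this is the kind of Koishi schema change the commit implies; the min/max bounds and description text are illustrative, not copied from any adapter:

```typescript
import { Schema } from 'koishi'

// Sketch of the default bump; each adapter keeps its own
// min/max constraints (the values below are examples).
export const Config = Schema.object({
    maxTokens: Schema.number()
        .min(16)
        .max(128000)
        .default(12000) // previously 4096
        .description('Token budget for the model context.')
})
```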
…ndling

- Rename PlatformService methods for better clarity:
  - getModels() → listPlatformModels()
  - getModelInfo() → findModel()
  - getAllModels() → listAllModels()
- Add PlatformModelInfo interface extending ModelInfo with platform and toModelName()
- Update all middleware and service consumers to use the new API
- Fix Azure OpenAI adapter method call (initClients → initClient)
- Improve default model selection logic to prefer smaller models (nano/flash/mini)
- Fix memory leaks by properly disposing watchers in chat service and schema utils
- Add type guards for ModelInfo and PlatformModelInfo validation

Breaking Changes:
- PlatformService API method names have changed
- getAllModels() now returns PlatformModelInfo[] instead of string[]
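A minimal sketch of the new surface, assuming plausible shapes for ModelInfo, ModelType, and the type guard; the repository's actual definitions may differ:

```typescript
// Assumed shapes for illustration only; the real ModelInfo/ModelType
// definitions live in the chatluna core packages.
type ModelType = 'llm' | 'embeddings'

interface ModelInfo {
    name: string
    type: ModelType
    maxTokens?: number
}

interface PlatformModelInfo extends ModelInfo {
    platform: string
    // Normalized identifier, e.g. "<platform>/<model>"
    toModelName(): string
}

// Hypothetical type guard mirroring the one this commit adds
function isPlatformModelInfo(info: ModelInfo): info is PlatformModelInfo {
    return (
        'platform' in info &&
        typeof (info as PlatformModelInfo).toModelName === 'function'
    )
}
```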
Fix method name inconsistency across all adapter packages to use the correct initClient() method instead of the deprecated initClients() method. This aligns with the refactored ChatLunaPlugin API and ensures proper initialization of platform clients.

Affected adapters:
- azure-openai-adapter
- claude-adapter
- deepseek-adapter
- dify-adapter
- doubao-adapter
- gemini-adapter
- hunyuan-adapter
- ollama-adapter
- openai-adapter
- openai-like-adapter
- qwen-adapter
- rwkv-adapter
- spark-adapter
- wenxin-adapter
- zhipu-adapter
… adapters

Replace absolute maxTokens configuration with maxContextRatio (0-1) to use percentage-based context window management. This provides better adaptation to different model context sizes and more intuitive configuration.

Changes:
- Replace maxTokens number field with maxContextRatio slider (0-1, default 0.35)
- Update client token limit calculation to use a ratio of the model's max tokens
- Update localization files for both English and Chinese descriptions
- Affects all adapter packages: azure-openai, claude, deepseek, dify, doubao, gemini, hunyuan, ollama, openai, openai-like, qwen, rwkv, spark, wenxin, zhipu
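A minimal sketch of the resulting token-limit calculation; the helper name and rounding are assumptions, but the formula follows the commit description:

```typescript
// Ratio-based limit: a fraction of the model's declared context size.
function computeMaxTokenLimit(
    modelMaxTokens: number,
    maxContextRatio = 0.35 // the new slider's default
): number {
    return Math.floor(modelMaxTokens * maxContextRatio)
}

// Example: a 128k-context model with the default ratio
// computeMaxTokenLimit(128000) === 44800
```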
Increment patch versions for all adapter packages and update the shared adapter dependency to v1.0.11. Also fix embeddings-service to use the proper client initialization method.

Changes:
- Bump shared adapter from 1.0.10 to 1.0.11
- Increment adapter package versions (alpha.6 -> alpha.7, alpha.7 -> alpha.8, etc.)
- Fix embeddings-service huggingface client to use parseConfig instead of initClientsWithPool
- Update search-service version from alpha.0 to alpha.1

Affected packages: azure-openai, claude, deepseek, dify, doubao, gemini, hunyuan, ollama, openai, openai-like, qwen, rwkv, spark, wenxin, zhipu, shared, search-service, embeddings-service
Walkthrough

This change unifies the platform model query API (getModel(s)/getAllModels/getModelInfo → listPlatformModels/listAllModels/findModel), introduces the PlatformModelInfo type, and updates all call sites. Several adapters now compute maxTokenLimit from the model's maxTokens via maxContextRatio, configuration migrates from maxTokens to maxContextRatio, and client initialization is unified under initClient. A few control-flow and watcher-cleanup optimizations are included.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Room as Rooms
    participant Platform as PlatformService
    participant Models as listAllModels/findModel
    User->>Room: Request to create/resolve a session
    Room->>Platform: listAllModels(ModelType.llm)
    Platform-->>Room: ComputedRef<PlatformModelInfo[]>
    Room->>Models: Pick default or fallback by name/platform (nano/flash/mini)
    Models-->>Room: PlatformModelInfo + toModelName()
    Room-->>User: Selected model name (normalized)
    note over Room,Platform: New API: listAllModels / findModel with toModelName()
```

```mermaid
sequenceDiagram
    autonumber
    participant Koishi as Koishi(ctx)
    participant Plugin as ChatLunaPlugin
    participant Platform as PlatformService
    Koishi->>Plugin: ready()
    Plugin->>Plugin: initClient()
    Plugin->>Platform: listPlatformModels(platform, type)
    Platform-->>Plugin: ComputedRef<models>
    Plugin->>Plugin: Set up watcher (stoppable)
    Koishi-->>Plugin: dispose -> stop watcher
```

```mermaid
sequenceDiagram
    autonumber
    participant Chain as BrowsingChain
    participant Search as SearchTool
    participant Browser as WebBrowserTool
    Chain->>Search: invoke(question, {configurable.model: this.model})
    Search-->>Chain: Search results
    Chain->>Browser: invoke(url, {configurable.model: this.model})
    Browser-->>Chain: Summarized/scraped results
```
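The dispose step in the second diagram corresponds to the watcher-cleanup fix mentioned above. A minimal sketch, assuming Vue reactivity inside a Koishi plugin (the ComputedRef in the diagrams suggests this, but the names below are illustrative):

```typescript
import { watch, type ComputedRef } from 'vue'
import type { Context } from 'koishi'

// Sketch: keep the stop handle that watch() returns and release it on
// plugin dispose, so the reactive subscription cannot leak.
function watchModels(ctx: Context, models: ComputedRef<string[]>) {
    const stop = watch(models, (next) => {
        // react to model list changes here
    })
    ctx.on('dispose', () => stop())
}
```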
Clean up code formatting for better readability and consistency across adapter client files. Also fix import ordering in long-memory utilities.

Changes:
- Fix multi-line formatting for maxTokenLimit calculations in claude, spark, and zhipu adapters
- Improve code indentation and line breaks for better readability
- Fix import ordering in long-memory chat-history utils (crypto imports)

Affected files: claude-adapter, spark-adapter, zhipu-adapter, long-memory
This PR introduces a significant refactor to improve context management across all adapter packages and includes necessary version bumps.
New Features
- Replaced maxTokens with maxContextRatio (0-1) for more intuitive and adaptive context window management

Bug fixes
- Fixed the embeddings-service huggingface client (initClientsWithPool to parseConfig)
- Renamed initClients to initClient for consistency

Other Changes
Affected packages: azure-openai, claude, deepseek, dify, doubao, gemini, hunyuan, ollama, openai, openai-like, qwen, rwkv, spark, wenxin, zhipu, shared, search-service, embeddings-service
Breaking change: The configuration parameter maxTokens is replaced with maxContextRatio. Users will need to reconfigure their settings.
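For illustration, a hedged sketch of what the replacement field looks like in a Koishi config schema; the description text is assumed, not copied from the adapters:

```typescript
import { Schema } from 'koishi'

// Schema.percent() renders the 0-1 slider the commits describe;
// the 0.35 default matches the PR description.
export const Config = Schema.object({
    maxContextRatio: Schema.percent()
        .default(0.35)
        .description('Fraction of the model context window to use.')
})
```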