[Refactor] Replace maxTokens with maxContextRatio and version updates #565
Conversation
- Pass model configuration to search and web browser tools for proper model context
- Fix typo in Chinese locale: 'mulitSourceMode' -> 'multiSourceMode'
- Add missing multiSourceMode translation to English locale
- Ensure tools receive proper model configuration for better performance

Fixes model context passing and locale consistency issues.
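A minimal sketch of the tool-side model passing, assuming the LangChain `RunnableConfig.configurable` pattern shown in the browsing sequence diagram further below; the function and parameter names here are illustrative, not ChatLuna's actual classes:

```typescript
import type { StructuredToolInterface } from '@langchain/core/tools'
import type { RunnableConfig } from '@langchain/core/runnables'

// Sketch: the chain threads its model into each tool call via
// RunnableConfig.configurable, so tools can read it back from
// config.configurable?.model. Names are placeholders.
async function browse(
    searchTool: StructuredToolInterface,
    webBrowserTool: StructuredToolInterface,
    model: unknown,
    question: string
): Promise<unknown> {
    const config: RunnableConfig = { configurable: { model } }
    const results = await searchTool.invoke(question, config)
    return await webBrowserTool.invoke(String(results), config)
}
```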
- Update default maxTokens from 4096 to 12000 for better model performance
- Affects all adapter packages: azure-openai, claude, gemini, hunyuan, ollama, openai-like, qwen, rwkv, spark, wenxin, and zhipu
- Provides more generous token limits for complex tasks and longer conversations
- Maintains existing min/max constraints for each adapter

This change improves the default user experience by allowing longer context without requiring manual configuration adjustments.
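As a hedged sketch, this is the kind of Koishi schema change the commit implies; the min/max bounds and description text are illustrative, not copied from any adapter:

```typescript
import { Schema } from 'koishi'

// Sketch of the default bump; each adapter keeps its own
// min/max constraints (the values below are examples).
export const Config = Schema.object({
    maxTokens: Schema.number()
        .min(16)
        .max(128000)
        .default(12000) // previously 4096
        .description('Token budget for the model context.')
})
```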
…ndling

- Rename PlatformService methods for better clarity:
  - getModels() → listPlatformModels()
  - getModelInfo() → findModel()
  - getAllModels() → listAllModels()
- Add PlatformModelInfo interface extending ModelInfo with platform and toModelName()
- Update all middleware and service consumers to use the new API
- Fix Azure OpenAI adapter method call (initClients → initClient)
- Improve default model selection logic to prefer smaller models (nano/flash/mini)
- Fix memory leaks by properly disposing watchers in chat service and schema utils
- Add type guards for ModelInfo and PlatformModelInfo validation

Breaking Changes:
- PlatformService API method names have changed
- getAllModels() now returns PlatformModelInfo[] instead of string[]
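A minimal sketch of the new surface, assuming plausible shapes for ModelInfo, ModelType, and the type guard; the repository's actual definitions may differ:

```typescript
// Assumed shapes for illustration only; the real ModelInfo/ModelType
// definitions live in the chatluna core packages.
type ModelType = 'llm' | 'embeddings'

interface ModelInfo {
    name: string
    type: ModelType
    maxTokens?: number
}

interface PlatformModelInfo extends ModelInfo {
    platform: string
    // Normalized identifier, e.g. "<platform>/<model>"
    toModelName(): string
}

// Hypothetical type guard mirroring the one this commit adds
function isPlatformModelInfo(info: ModelInfo): info is PlatformModelInfo {
    return (
        'platform' in info &&
        typeof (info as PlatformModelInfo).toModelName === 'function'
    )
}
```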
Fix method name inconsistency across all adapter packages to use the correct initClient() method instead of the deprecated initClients() method. This aligns with the refactored ChatLunaPlugin API and ensures proper initialization of platform clients.

Affected adapters:
- azure-openai-adapter
- claude-adapter
- deepseek-adapter
- dify-adapter
- doubao-adapter
- gemini-adapter
- hunyuan-adapter
- ollama-adapter
- openai-adapter
- openai-like-adapter
- qwen-adapter
- rwkv-adapter
- spark-adapter
- wenxin-adapter
- zhipu-adapter
… adapters

Replace absolute maxTokens configuration with maxContextRatio (0-1) to use percentage-based context window management. This provides better adaptation to different model context sizes and more intuitive configuration.

Changes:
- Replace maxTokens number field with maxContextRatio slider (0-1, default 0.35)
- Update client token limit calculation to use a ratio of the model's max tokens
- Update localization files for both English and Chinese descriptions
- Affects all adapter packages: azure-openai, claude, deepseek, dify, doubao, gemini, hunyuan, ollama, openai, openai-like, qwen, rwkv, spark, wenxin, zhipu
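A minimal sketch of the resulting token-limit calculation; the helper name and rounding are assumptions, but the formula follows the commit description:

```typescript
// Ratio-based limit: a fraction of the model's declared context size.
function computeMaxTokenLimit(
    modelMaxTokens: number,
    maxContextRatio = 0.35 // the new slider's default
): number {
    return Math.floor(modelMaxTokens * maxContextRatio)
}

// Example: a 128k-context model with the default ratio
// computeMaxTokenLimit(128000) === 44800
```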
Increment patch versions for all adapter packages and update the shared adapter dependency to v1.0.11. Also fix embeddings-service to use the proper client initialization method.

Changes:
- Bump shared adapter from 1.0.10 to 1.0.11
- Increment adapter package versions (alpha.6 -> alpha.7, alpha.7 -> alpha.8, etc.)
- Fix embeddings-service huggingface client to use parseConfig instead of initClientsWithPool
- Update search-service version from alpha.0 to alpha.1

Affected packages: azure-openai, claude, deepseek, dify, doubao, gemini, hunyuan, ollama, openai, openai-like, qwen, rwkv, spark, wenxin, zhipu, shared, search-service, embeddings-service
Walkthrough

This change unifies the platform model query API (getModel(s)/getAllModels/getModelInfo → listPlatformModels/listAllModels/findModel), introduces the PlatformModelInfo type, and updates all call sites. Several adapters now compute maxTokenLimit from the model's maxTokens via maxContextRatio, configuration migrates from maxTokens to maxContextRatio, and client initialization is unified under initClient. A few control-flow and watcher-cleanup optimizations are included.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Room as Rooms
    participant Platform as PlatformService
    participant Models as listAllModels/findModel
    User->>Room: Request to create/resolve a session
    Room->>Platform: listAllModels(ModelType.llm)
    Platform-->>Room: ComputedRef<PlatformModelInfo[]>
    Room->>Models: Pick default or fallback by name/platform (nano/flash/mini)
    Models-->>Room: PlatformModelInfo + toModelName()
    Room-->>User: Selected model name (normalized)
    note over Room,Platform: New API: listAllModels / findModel with toModelName()
```

```mermaid
sequenceDiagram
    autonumber
    participant Koishi as Koishi(ctx)
    participant Plugin as ChatLunaPlugin
    participant Platform as PlatformService
    Koishi->>Plugin: ready()
    Plugin->>Plugin: initClient()
    Plugin->>Platform: listPlatformModels(platform, type)
    Platform-->>Plugin: ComputedRef<models>
    Plugin->>Plugin: Set up watcher (stoppable)
    Koishi-->>Plugin: dispose -> stop watcher
```

```mermaid
sequenceDiagram
    autonumber
    participant Chain as BrowsingChain
    participant Search as SearchTool
    participant Browser as WebBrowserTool
    Chain->>Search: invoke(question, {configurable.model: this.model})
    Search-->>Chain: Search results
    Chain->>Browser: invoke(url, {configurable.model: this.model})
    Browser-->>Chain: Summarized/scraped results
```
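The dispose step in the second diagram corresponds to the watcher-cleanup fix mentioned above. A minimal sketch, assuming Vue reactivity inside a Koishi plugin (the ComputedRef in the diagrams suggests this, but the names below are illustrative):

```typescript
import { watch, type ComputedRef } from 'vue'
import type { Context } from 'koishi'

// Sketch: keep the stop handle that watch() returns and release it on
// plugin dispose, so the reactive subscription cannot leak.
function watchModels(ctx: Context, models: ComputedRef<string[]>) {
    const stop = watch(models, (next) => {
        // react to model list changes here
    })
    ctx.on('dispose', () => stop())
}
```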
Clean up code formatting for better readability and consistency across adapter client files. Also fix import ordering in long-memory utilities.

Changes:
- Fix multi-line formatting for maxTokenLimit calculations in claude, spark, and zhipu adapters
- Improve code indentation and line breaks for better readability
- Fix import ordering in long-memory chat-history utils (crypto imports)

Affected files: claude-adapter, spark-adapter, zhipu-adapter, long-memory
This PR introduces a significant refactor to improve context management across all adapter packages and includes necessary version bumps.
New Features
- Replaced maxTokens with maxContextRatio (0-1) for more intuitive and adaptive context window management

Bug fixes
- Fixed the embeddings-service huggingface client (initClientsWithPool to parseConfig)
- Renamed initClients to initClient for consistency

Other Changes
Affected packages: azure-openai, claude, deepseek, dify, doubao, gemini, hunyuan, ollama, openai, openai-like, qwen, rwkv, spark, wenxin, zhipu, shared, search-service, embeddings-service
Breaking change: The configuration parameter maxTokens is replaced with maxContextRatio. Users will need to reconfigure their settings.
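For illustration, a hedged sketch of what the replacement field looks like in a Koishi config schema; the description text is assumed, not copied from the adapters:

```typescript
import { Schema } from 'koishi'

// Schema.percent() renders the 0-1 slider the commits describe;
// the 0.35 default matches the PR description.
export const Config = Schema.object({
    maxContextRatio: Schema.percent()
        .default(0.35)
        .description('Fraction of the model context window to use.')
})
```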