[Feature] support configurable infinite context threshold and manual compression #682
Conversation
…l compression - Add `infiniteContextThreshold` configuration to control when compression triggers. - Implement manual compression command `chatluna.room.compress`. - Update `InfiniteContextManager` to honor the configurable threshold. - Add i18n support for the new configuration and command. - Small fix in `read_chat_message` to preserve file URLs.
Note: CodeRabbit detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough: Adds room context compression: a CLI command, a configuration option, a chat-interface compression method, middleware routing, and a service wrapper. The call chain is coordinated through a concurrency queue and invokes the threshold-driven compression logic of InfiniteContextManager.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant User
    participant CLI
    participant Chain as ChatChain
    participant Service as ChatLunaService
    participant Wrapper as ChatInterfaceWrapper
    participant Chat as ChatInterface
    participant ICM as InfiniteContextManager
    User->>CLI: run chatluna.room.compress
    CLI->>Chain: dispatch the compress_room command
    Chain->>Chain: resolve room / room_resolve
    alt room not found
        Chain-->>User: reply with the no_room message
    else room resolved
        Chain->>Service: compressContext(room)
        Service->>Wrapper: compressContext(room)
        Wrapper->>Wrapper: enqueue / acquire queue lock
        Wrapper->>Chat: compressContext()
        Chat->>ICM: request compression (threshold-aware)
        ICM->>ICM: run compression logic
        ICM-->>Chat: return result (true/false)
        Chat-->>Wrapper: return result
        Wrapper->>Wrapper: release queue lock
        Wrapper-->>Chain: pass result back
        Chain-->>User: reply with success or failure message
    end
```
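The wrapper's queue step in this flow can be sketched as a promise-chain lock. This is a hypothetical simplification: `SimpleWrapper` and `ChatInterfaceLike` are illustrative names, and the real `ChatInterfaceWrapper` queue implementation may differ.

```typescript
// Hypothetical sketch of the wrapper's queue coordination shown in the
// diagram; not the real ChatInterfaceWrapper API.
interface ChatInterfaceLike {
    compressContext(): Promise<boolean>
}

class SimpleWrapper {
    // A promise chain acts as a FIFO lock: each call waits for the previous one.
    private tail: Promise<void> = Promise.resolve()

    constructor(private chat: ChatInterfaceLike) {}

    compressContext(): Promise<boolean> {
        const run = this.tail.then(() => this.chat.compressContext())
        // Keep the chain alive even if a call rejects.
        this.tail = run.then(
            () => undefined,
            () => undefined
        )
        return run
    }
}

// Usage: concurrent calls are serialized through the promise chain.
const wrapper = new SimpleWrapper({
    compressContext: async () => true
})
void wrapper.compressContext()
```

The promise-chain approach guarantees ordering without an explicit mutex, at the cost of running at most one compression at a time per wrapper.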
Estimated code review effort: 🎯 3 (moderate) | ⏱️ ~25 minutes
Summary of Changes

Hello @dingyi222666, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly. This pull request significantly enhances the chat system's context management by introducing both a configurable automatic compression threshold and a new manual compression command. It gives users greater flexibility and control over how conversation history is managed, ensuring efficient use of context windows. Additionally, it refines file-message processing to better retain important metadata.
Code Review
This pull request introduces a configurable threshold for infinite context compression and adds a manual compression command, which are great enhancements for managing conversation history. The implementation is solid and follows existing patterns in the codebase. I've provided a couple of suggestions for the new compress_room middleware to improve its robustness in room resolution and error handling. Overall, this is a valuable feature addition.
Actionable comments posted: 3
🧹 Nitpick comments (1)
packages/core/src/llm-core/chat/app.ts (1)
409-422: The return value's semantics are unclear; consider stronger error handling. The `true` this method returns only means the compression logic was invoked, not that compression actually ran. Because `compressIfNeeded` decides internally, based on the threshold, whether to compress, callers cannot tell from the return value whether any compression actually happened. Also, unlike the handling of compression errors in the `processChat` method (lines 143-149), this method has no error capture. Suggestions:

- Consider returning a more explicit status (for example an enum or object) indicating "compressed", "below threshold", or "unable to compress".
- Add try-catch error handling following the pattern in `processChat`.

🔎 Suggested improvement
```diff
 async compressContext(): Promise<boolean> {
     const wrapper = await this.getChatLunaLLMChainWrapper()
     if (!wrapper) {
         return false
     }

     const manager = this._ensureInfiniteContextManager()
     if (!manager) {
         return false
     }

+    try {
         await manager.compressIfNeeded(wrapper)
-    return true
+        return true
+    } catch (error) {
+        logger.error('Error compressing context:', error)
+        return false
+    }
 }
```
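The reviewer's enum suggestion could look like the following sketch. All names here (`CompressStatus`, `ManagerLike`) are hypothetical, and `compressIfNeeded` is stubbed as returning a boolean "did compress" flag, which the review notes the current code does not yet provide.

```typescript
// Hypothetical status enum for compressContext, per the review suggestion.
// CompressStatus and the stubbed manager interface are illustrative only.
enum CompressStatus {
    Compressed = 'compressed',
    BelowThreshold = 'below_threshold',
    Unavailable = 'unavailable'
}

interface ManagerLike {
    // Assumed here to report whether compression actually ran.
    compressIfNeeded(): Promise<boolean>
}

async function compressContext(
    manager: ManagerLike | undefined
): Promise<CompressStatus> {
    if (!manager) {
        return CompressStatus.Unavailable
    }
    try {
        const didCompress = await manager.compressIfNeeded()
        return didCompress
            ? CompressStatus.Compressed
            : CompressStatus.BelowThreshold
    } catch (error) {
        console.error('Error compressing context:', error)
        return CompressStatus.Unavailable
    }
}
```

Callers can then distinguish "nothing to do" from "failed", which a bare boolean conflates.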
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)

- `packages/core/src/locales/en-US.schema.yml` is excluded by `!**/*.yml`
- `packages/core/src/locales/en-US.yml` is excluded by `!**/*.yml`
- `packages/core/src/locales/zh-CN.schema.yml` is excluded by `!**/*.yml`
- `packages/core/src/locales/zh-CN.yml` is excluded by `!**/*.yml`
📒 Files selected for processing (8)

- `packages/core/src/commands/room.ts`
- `packages/core/src/config.ts`
- `packages/core/src/llm-core/chat/app.ts`
- `packages/core/src/llm-core/chat/infinite_context.ts`
- `packages/core/src/middleware.ts`
- `packages/core/src/middlewares/chat/read_chat_message.ts`
- `packages/core/src/middlewares/room/compress_room.ts`
- `packages/core/src/services/chat.ts`
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-17T00:25:27.195Z
Learnt from: dingyi222666
Repo: ChatLunaLab/chatluna PR: 548
File: packages/core/src/llm-core/chat/app.ts:0-0
Timestamp: 2025-09-17T00:25:27.195Z
Learning: In the ChatInterface class, reactive watch calls are wrapped in ctx.effect() so that cleanup is handled automatically, avoiding memory leaks. The _chain and _embeddings fields were changed to nullable types (| undefined), and a ctx.on('dispose') handler was added as an extra cleanup safeguard. Using the Koishi effect system this way is more elegant than manually managing stop handles.
Applied to files:
packages/core/src/middlewares/room/compress_room.ts
🧬 Code graph analysis (2)
packages/core/src/middlewares/room/compress_room.ts (2)
packages/core/src/chains/chain.ts (1)
- `ChatChain` (14-366)

packages/core/src/chains/rooms.ts (1)

- `getAllJoinedConversationRoom` (390-445)
packages/core/src/services/chat.ts (1)
packages/core/src/types.ts (1)
- `ConversationRoom` (4-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint
- GitHub Check: build
🔇 Additional comments (10)
packages/core/src/middleware.ts (1)
42-42: The change looks good! The new `compress_room` middleware is correctly imported and registered in the middleware chain, placed after `clear_room`, which matches the logical ordering of the room operations. Also applies to: 110-110
packages/core/src/commands/room.ts (1)
109-117: LGTM! The new `chatluna.room.compress` command is implemented consistently with the other room commands and correctly invokes the `compress_room` action via `chain.receiveCommand`.

packages/core/src/llm-core/chat/infinite_context.ts (2)
21-21: The configurable threshold option is implemented correctly. Adding the optional `threshold` parameter to `InfiniteContextManagerOptions` lets callers configure the compression threshold externally while remaining backward compatible.
65-67: The threshold calculation is correct. `Math.floor(maxTokenLimit * (this.options.threshold ?? 0.85))` replaces the hard-coded 0.85 threshold. When no `threshold` is supplied it falls back to 0.85, preserving backward compatibility, and `Math.floor` keeps the token count an integer.

packages/core/src/config.ts (2)
32-32: The Config interface extension is correct. The `infiniteContextThreshold: number` field is added to the `Config` interface to store the infinite-context compression threshold.
114-118: The config schema definition is sound. The schema for `infiniteContextThreshold` uses `Schema.percent()` with reasonable constraints:

- minimum 50%, avoiding overly aggressive compression
- maximum 95%, ensuring compression triggers before the limit is reached
- default 85%, matching the previous hard-coded value
- 1% step, giving fine-grained control

These settings match the PR goals and hook up correctly with the implementation in `infinite_context.ts`.

packages/core/src/llm-core/chat/app.ts (1)
545-546: LGTM! Passing the configured threshold to `InfiniteContextManager` is implemented correctly and follows the same pattern as the other configuration parameters.

packages/core/src/services/chat.ts (1)
237-242: LGTM! The method follows the same delegation pattern as `clearChatHistory`; the implementation is concise and consistent.

packages/core/src/middlewares/room/compress_room.ts (2)
39-50: LGTM! The compression path has thorough error handling, sets the success and failure messages correctly, and logs appropriately.
6-56: LGTM! The middleware's overall structure follows the project's standard pattern, integrates correctly into the chain lifecycle (after `lifecycle-handle_command` and before `lifecycle-request_model`), and has complete type declarations.
Mirror the pattern used in the chat method by acquiring a model-queue token in compressContext to enforce platform concurrentMaxSize limits.
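A counting-semaphore sketch of that queue-token pattern follows. The names (`RequestQueue`, `withQueueToken`) are hypothetical; the codebase's actual model-queue API may differ, and only the `concurrentMaxSize` limit idea is taken from the comment above.

```typescript
// Hypothetical counting semaphore enforcing a concurrentMaxSize-style limit.
class RequestQueue {
    private active = 0
    private waiters: Array<() => void> = []

    constructor(private readonly concurrentMaxSize: number) {}

    async acquire(): Promise<void> {
        if (this.active < this.concurrentMaxSize) {
            this.active++
            return
        }
        // Over the limit: wait until a token is handed over by release().
        await new Promise<void>((resolve) => this.waiters.push(resolve))
    }

    release(): void {
        const next = this.waiters.shift()
        if (next) {
            // Hand the token directly to the next waiter; active is unchanged.
            next()
        } else {
            this.active--
        }
    }
}

// compressContext would bracket the real work with acquire/release,
// mirroring how the chat method uses the model queue.
async function withQueueToken<T>(
    queue: RequestQueue,
    work: () => Promise<T>
): Promise<T> {
    await queue.acquire()
    try {
        return await work()
    } finally {
        queue.release()
    }
}
```

The `finally` block guarantees the token is released even when the compression work throws, so a failed compression cannot starve later chat requests.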
This PR introduces a configurable threshold for infinite-context compression and adds a manual compression command to room management.
New Features
- `infiniteContextThreshold` in configuration (range: 50%-95%, default 85%) to control when history compression is triggered.
- `chatluna.room.compress` command to manually trigger context compression for a specified room.
- `read_chat_message` updated to preserve and include file URLs in message attributes.

Bug fixes
N/A
Other Changes
- `InfiniteContextManager` updated to honor the configurable threshold.
- `compress_room` middleware registered into the core middleware stack.
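The threshold-driven trigger described above reduces to a token-count check. A minimal sketch, assuming the `threshold ?? 0.85` fallback quoted in the review; the function names and the `>=` comparison are illustrative, not the exact implementation.

```typescript
// Sketch of the threshold-driven trigger, following the fallback logic
// quoted in the review; names here are illustrative.
function compressionTriggerPoint(
    maxTokenLimit: number,
    threshold?: number
): number {
    // Fall back to the previous hard-coded 85% when no threshold is set.
    return Math.floor(maxTokenLimit * (threshold ?? 0.85))
}

function shouldCompress(
    currentTokens: number,
    maxTokenLimit: number,
    threshold?: number
): boolean {
    return currentTokens >= compressionTriggerPoint(maxTokenLimit, threshold)
}

console.log(compressionTriggerPoint(4096)) // 3481, i.e. 85% of 4096 floored
```

With a 4096-token limit, raising `infiniteContextThreshold` from the 85% default to 95% moves the trigger point from 3481 to 3891 tokens, delaying compression.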