@mithun50 commented Dec 28, 2025

  • Add complete RAG module with vector and keyword search
  • Implement document indexing with configurable chunking strategies (paragraph, sentence, fixed, semantic)
  • Add embedding providers: local (transformers.js), OpenAI, Voyage, Cohere, custom
  • Implement hybrid search combining vector (70%) and keyword (30%)
  • Add SQLite storage with FTS5 for full-text search
  • Add 8 MCP tools: rag_index_document, rag_index_project, rag_search, rag_query_context, rag_list_documents, rag_delete_document, rag_get_stats, rag_configure
  • Add HTTP endpoints for all RAG operations
  • Add comprehensive test suite (chunking, embeddings, RAG)
  • Add RAG benchmarks for performance testing
  • Update README and API documentation
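The 70/30 hybrid weighting listed above can be illustrated with a short sketch. Only the weights come from this PR description. The `ScoredChunk` shape, the `hybridRank` and `normalize` names, and the min-max normalization step are assumptions made for illustration and may not match how the RAG module actually combines scores.

```typescript
// Hypothetical result shape; field names are illustrative, not CortexFlow's API.
interface ScoredChunk {
  id: string;
  score: number;
}

// Min-max normalize scores to [0, 1] so vector similarities and keyword
// (e.g. FTS5 rank) scores are comparable. This step is an assumption.
function normalize(results: ScoredChunk[]): Record<string, number> {
  const out: Record<string, number> = {};
  if (results.length === 0) return out;
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  const range = max - min;
  for (const r of results) {
    // If every score is equal, treat each result as a full match.
    out[r.id] = range === 0 ? 1 : (r.score - min) / range;
  }
  return out;
}

// Combine both result lists with the weights stated in the PR (0.7 / 0.3).
// A chunk missing from one list contributes 0 for that component.
function hybridRank(
  vectorResults: ScoredChunk[],
  keywordResults: ScoredChunk[],
  vectorWeight = 0.7,
  keywordWeight = 0.3
): ScoredChunk[] {
  const v = normalize(vectorResults);
  const k = normalize(keywordResults);
  const seen: Record<string, boolean> = {};
  const combined: ScoredChunk[] = [];
  for (const id of Object.keys(v).concat(Object.keys(k))) {
    if (seen[id]) continue;
    seen[id] = true;
    combined.push({
      id,
      score: vectorWeight * (v[id] ?? 0) + keywordWeight * (k[id] ?? 0),
    });
  }
  // Highest combined score first.
  return combined.sort((a, b) => b.score - a.score);
}
```

With this sketch, a chunk that ranks mid-list in the vector results but first in the keyword results can come close to the top vector hit, which is the usual motivation for blending the two signals.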

@github-actions github-actions bot left a comment

Welcome to CortexFlow! Thanks for your first pull request!

A maintainer will review your changes shortly. Here's what to expect:

Review Process:

  1. Automated checks will run (tests, linting, security)
  2. A maintainer will review your code
  3. You may receive feedback or requests for changes
  4. Once approved, your PR will be merged

While you wait:

  • Ensure all CI checks pass
  • Respond to any review comments
  • Keep your branch up to date with master

Thank you for contributing to CortexFlow!

Automatically formatted by GitHub Actions:
- Prettier formatting applied
- ESLint auto-fixes applied
@github-actions
Auto-formatted!

I've automatically fixed formatting and lint issues in this PR.

Please pull the latest changes before pushing new commits.

@github-actions
🚀 Benchmark Results

CortexFlow Benchmark Results

Last Run: 2025-12-28T17:01:18.330Z
Version: 2.0.0
Platform: linux (Node v20.19.6)

Summary

| Metric | Value |
| --- | --- |
| Total Benchmarks | 39 |
| Total Operations | 22.90K |
| Avg Ops/Second | 549.48K |
| Avg Token Savings | 42.4% |
| Avg Compression Ratio | 4.24x |
| Peak Memory | 122.63 MB |

Performance Results

Storage

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Create Project | 3.84K | 260.26μs | 319.70μs | 369.84μs |
| Read Project | 4.27K | 234.36μs | 294.11μs | 369.49μs |
| List Projects | 11.04 | 90.59ms | 104.29ms | 109.51ms |
| Update Project | 1.62K | 615.95μs | 676.08μs | 1.77ms |
| Read Large Project (200 tasks) | 1.28K | 778.78μs | 801.68μs | 3.75ms |
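
As a reading aid for the Ops/sec, Avg, P95 and P99 columns, here is a minimal sketch of how such figures could be derived from raw per-iteration timings. The `percentile` and `summarize` names and the nearest-rank method are assumptions, not the benchmark suite's actual code. The Ops/sec formula (one second divided by the average latency) is consistent with the reported rows, for example 260.26μs per operation gives roughly 3.84K ops/sec.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize per-iteration durations given in nanoseconds.
function summarize(samplesNs: number[]) {
  const total = samplesNs.reduce((sum, x) => sum + x, 0);
  const avg = total / samplesNs.length;
  return {
    opsPerSec: 1e9 / avg, // 1 second = 1e9 ns
    avg,
    p95: percentile(samplesNs, 95),
    p99: percentile(samplesNs, 99),
  };
}

// 99 fast iterations and one slow outlier: the mean is pulled above P95.
const samples: number[] = [];
for (let i = 0; i < 99; i++) samples.push(1000);
samples.push(100000);
const stats = summarize(samples);
console.log(stats.avg, stats.p95, stats.p99);
```

A single slow iteration is enough to lift the mean above P95, which would explain rows in these reports where Avg is slightly higher than P95 (for example Compress Small Project in this run, 1.52μs average against 1.46μs P95).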

Intelligent

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Critical Path Analysis | 4.77K | 209.50μs | 266.57μs | 401.95μs |
| Smart Priority Queue | 4.77K | 209.61μs | 223.29μs | 254.31μs |
| Context Compression | 164.25K | 6.09μs | 7.03μs | 24.77μs |
| Health Score Calculation | 16.29K | 61.40μs | 124.66μs | 144.18μs |
| Generate Suggestions | 4.17K | 239.73μs | 253.42μs | 301.30μs |

Token Efficiency

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Compress Small Project | 657.01K | 1.52μs | 1.46μs | 2.94μs |
| Compress Medium Project | 296.37K | 3.37μs | 3.54μs | 5.64μs |
| Compress Large Project | 108.59K | 9.21μs | 10.71μs | 11.71μs |
| Compress XLarge Project | 54.70K | 18.28μs | 19.17μs | 33.75μs |

Context Handoff

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Export Small (minimal) | 266.47K | 3.75μs | 4.76μs | 25.00μs |
| Export Small (standard) | 68.71K | 14.55μs | 16.53μs | 30.48μs |
| Export Small (detailed) | 22.87K | 43.73μs | 64.62μs | 80.78μs |
| Export Medium (minimal) | 224.88K | 4.45μs | 4.62μs | 4.93μs |
| Export Medium (standard) | 33.77K | 29.61μs | 50.93μs | 76.91μs |
| Export Medium (detailed) | 17.73K | 56.39μs | 64.27μs | 207.76μs |
| Export Large (minimal) | 131.75K | 7.59μs | 7.77μs | 9.38μs |
| Export Large (standard) | 19.15K | 52.21μs | 60.52μs | 101.62μs |
| Export Large (detailed) | 5.22K | 191.73μs | 204.23μs | 498.48μs |

RAG Chunking

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Chunk Small (Paragraph) | 1.05M | 952.76ns | 961.00ns | 1.63μs |
| Chunk Medium (Paragraph) | 217.19K | 4.60μs | 6.40μs | 7.55μs |
| Chunk Large (Paragraph) | 27.65K | 36.17μs | 46.41μs | 74.59μs |
| Chunk Small (Sentence) | 569.23K | 1.76μs | 2.74μs | 2.98μs |
| Chunk Medium (Sentence) | 318.71K | 3.14μs | 4.56μs | 5.45μs |
| Chunk Large (Sentence) | 40.53K | 24.67μs | 27.53μs | 55.77μs |
| Chunk Small (Fixed) | 1.57M | 638.08ns | 651.00ns | 1.39μs |
| Chunk Medium (Fixed) | 1.76M | 567.64ns | 531.00ns | 1.08μs |
| Chunk Large (Fixed) | 104.70K | 9.55μs | 13.34μs | 29.52μs |
| Chunk Small (Semantic) | 514.27K | 1.94μs | 2.04μs | 2.70μs |
| Chunk Medium (Semantic) | 251.41K | 3.98μs | 4.24μs | 5.29μs |
| Chunk Large (Semantic) | 61.48K | 16.27μs | 30.02μs | 36.75μs |

RAG Token Estimation

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Estimate Tokens (Small) | 5.08M | 196.91ns | 331.00ns | 391.00ns |
| Estimate Tokens (Large) | 7.33M | 136.41ns | 141.00ns | 180.00ns |

RAG Chunk Processing

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Merge Small Chunks (50) | 323.00K | 3.10μs | 6.10μs | 6.52μs |
| Split Oversized Chunk | 101.51K | 9.85μs | 21.19μs | 22.86μs |

Token Efficiency

| Benchmark | Original | Compressed | Savings | Ratio |
| --- | --- | --- | --- | --- |
| Small Project Compression | 2.12K | 446.00 | 78.9% | 4.74x |
| Medium Project Compression | 10.04K | 1.62K | 83.9% | 6.21x |
| Large Project Compression | 20.07K | 2.87K | 85.7% | 7.00x |
| XLarge Project Compression | 40.39K | 5.51K | 86.4% | 7.33x |
| Small Export (minimal) | 4.20K | 383.00 | 90.9% | 10.97x |
| Small Export (standard) | 4.20K | 2.30K | 45.3% | 1.83x |
| Small Export (detailed) | 4.20K | 4.45K | -5.9% | 0.94x |
| Medium Export (minimal) | 10.48K | 901.00 | 91.4% | 11.63x |
| Medium Export (standard) | 10.48K | 5.75K | 45.1% | 1.82x |
| Medium Export (detailed) | 10.48K | 10.91K | -4.1% | 0.96x |
| Large Export (minimal) | 20.93K | 1.76K | 91.6% | 11.87x |
| Large Export (standard) | 20.93K | 11.40K | 45.5% | 1.84x |
| Large Export (detailed) | 20.93K | 21.66K | -3.5% | 0.97x |
| Paragraph Chunking Overhead | 2.45K | 2.45K | 0.2% | 1.00x |
| Sentence Chunking Overhead | 2.45K | 2.45K | 0.2% | 1.00x |
| Fixed Chunking Overhead | 2.45K | 2.73K | -11.4% | 0.90x |
| Semantic Chunking Overhead | 2.45K | 2.44K | 0.6% | 1.01x |
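
The Savings and Ratio columns appear to follow the usual definitions: savings is the fraction of the original size that was removed, and ratio is original size divided by compressed size. The sketch below assumes those definitions. The `tokenSavings` name is hypothetical and not taken from the benchmark code.

```typescript
// Savings and compression ratio under the assumed definitions:
//   savingsPct = (1 - compressed / original) * 100
//   ratio      = original / compressed
function tokenSavings(original: number, compressed: number) {
  if (original <= 0 || compressed <= 0) {
    throw new Error("sizes must be positive");
  }
  return {
    savingsPct: (1 - compressed / original) * 100,
    ratio: original / compressed,
  };
}

// Small Export (minimal): 4.20K -> 383 (the original size is rounded in the report).
const minimal = tokenSavings(4200, 383);
console.log(minimal.savingsPct.toFixed(1), minimal.ratio.toFixed(2));

// Small Export (detailed): 4.20K -> 4.45K, so the output is larger than the input.
const detailed = tokenSavings(4200, 4450);
console.log(detailed.savingsPct.toFixed(1), detailed.ratio.toFixed(2));
```

Under these definitions a negative savings value and a ratio below 1.00x mean the output is larger than the input, as in the detailed export rows and the fixed chunking overhead row.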

Memory Usage

| Benchmark | Heap Delta | RSS |
| --- | --- | --- |
| Load 50 Task Project | 177.01 KB | 120.25 MB |
| Compress 50 Task Project | 5.33 KB | 120.25 MB |
| Health Score 50 Tasks | 17.53 KB | 120.25 MB |
| Export 50 Tasks | 149.02 KB | 120.25 MB |
| Load 100 Task Project | 309.85 KB | 120.25 MB |
| Compress 100 Task Project | 9.03 KB | 120.25 MB |
| Health Score 100 Tasks | 33.22 KB | 120.25 MB |
| Export 100 Tasks | 284.77 KB | 120.50 MB |
| Load 200 Task Project | 623.73 KB | 120.63 MB |
| Compress 200 Task Project | 17.78 KB | 120.63 MB |
| Health Score 200 Tasks | 64.34 KB | 120.63 MB |
| Export 200 Tasks | 571.87 KB | 120.88 MB |
| Load 500 Task Project | 1.51 MB | 121.63 MB |
| Compress 500 Task Project | 45.24 KB | 121.63 MB |
| Health Score 500 Tasks | 139.47 KB | 121.63 MB |
| Export 500 Tasks | 1.40 MB | 122.63 MB |

@github-actions
🚀 Benchmark Results

CortexFlow Benchmark Results

Last Run: 2025-12-28T17:01:42.863Z
Version: 2.0.0
Platform: linux (Node v20.19.6)

Summary

| Metric | Value |
| --- | --- |
| Total Benchmarks | 39 |
| Total Operations | 22.90K |
| Avg Ops/Second | 551.35K |
| Avg Token Savings | 42.4% |
| Avg Compression Ratio | 4.24x |
| Peak Memory | 112.64 MB |

Performance Results

Storage

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Create Project | 3.91K | 255.74μs | 292.41μs | 358.43μs |
| Read Project | 4.50K | 222.42μs | 286.43μs | 382.07μs |
| List Projects | 11.97 | 83.52ms | 96.45ms | 101.70ms |
| Update Project | 1.77K | 566.00μs | 657.91μs | 1.05ms |
| Read Large Project (200 tasks) | 1.34K | 748.93μs | 766.92μs | 3.62ms |

Intelligent

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Critical Path Analysis | 4.52K | 221.31μs | 230.98μs | 242.07μs |
| Smart Priority Queue | 4.12K | 242.81μs | 369.48μs | 469.15μs |
| Context Compression | 164.67K | 6.07μs | 6.64μs | 24.09μs |
| Health Score Calculation | 16.10K | 62.09μs | 126.44μs | 144.63μs |
| Generate Suggestions | 3.77K | 265.49μs | 275.36μs | 414.87μs |

Token Efficiency

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Compress Small Project | 754.70K | 1.33μs | 1.41μs | 1.68μs |
| Compress Medium Project | 312.90K | 3.20μs | 3.33μs | 3.76μs |
| Compress Large Project | 185.77K | 5.38μs | 5.52μs | 6.31μs |
| Compress XLarge Project | 102.21K | 9.78μs | 10.24μs | 12.32μs |

Context Handoff

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Export Small (minimal) | 276.24K | 3.62μs | 4.74μs | 19.40μs |
| Export Small (standard) | 57.30K | 17.45μs | 26.64μs | 43.80μs |
| Export Small (detailed) | 32.55K | 30.72μs | 51.41μs | 58.64μs |
| Export Medium (minimal) | 174.74K | 5.72μs | 4.55μs | 19.12μs |
| Export Medium (standard) | 35.94K | 27.82μs | 47.17μs | 69.59μs |
| Export Medium (detailed) | 18.64K | 53.65μs | 64.94μs | 180.05μs |
| Export Large (minimal) | 133.64K | 7.48μs | 7.81μs | 8.97μs |
| Export Large (standard) | 20.71K | 48.28μs | 54.06μs | 73.80μs |
| Export Large (detailed) | 5.05K | 197.97μs | 302.93μs | 467.67μs |

RAG Chunking

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Chunk Small (Paragraph) | 636.15K | 1.57μs | 2.40μs | 2.78μs |
| Chunk Medium (Paragraph) | 325.23K | 3.07μs | 3.08μs | 4.74μs |
| Chunk Large (Paragraph) | 22.95K | 43.58μs | 54.00μs | 67.82μs |
| Chunk Small (Sentence) | 603.81K | 1.66μs | 1.32μs | 2.55μs |
| Chunk Medium (Sentence) | 279.24K | 3.58μs | 4.68μs | 5.60μs |
| Chunk Large (Sentence) | 41.95K | 23.84μs | 27.83μs | 53.80μs |
| Chunk Small (Fixed) | 1.57M | 638.31ns | 652.00ns | 1.25μs |
| Chunk Medium (Fixed) | 1.61M | 620.22ns | 521.00ns | 822.00ns |
| Chunk Large (Fixed) | 100.27K | 9.97μs | 13.85μs | 33.23μs |
| Chunk Small (Semantic) | 738.42K | 1.35μs | 1.42μs | 1.84μs |
| Chunk Medium (Semantic) | 247.01K | 4.05μs | 4.22μs | 5.31μs |
| Chunk Large (Semantic) | 59.14K | 16.91μs | 30.50μs | 48.54μs |

RAG Token Estimation

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Estimate Tokens (Small) | 5.00M | 200.07ns | 331.00ns | 381.00ns |
| Estimate Tokens (Large) | 7.54M | 132.71ns | 141.00ns | 170.00ns |

RAG Chunk Processing

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Merge Small Chunks (50) | 320.32K | 3.12μs | 6.13μs | 6.73μs |
| Split Oversized Chunk | 100.85K | 9.92μs | 21.20μs | 22.59μs |

Token Efficiency

| Benchmark | Original | Compressed | Savings | Ratio |
| --- | --- | --- | --- | --- |
| Small Project Compression | 2.12K | 446.00 | 78.9% | 4.74x |
| Medium Project Compression | 10.04K | 1.62K | 83.9% | 6.21x |
| Large Project Compression | 20.07K | 2.87K | 85.7% | 7.00x |
| XLarge Project Compression | 40.39K | 5.51K | 86.4% | 7.33x |
| Small Export (minimal) | 4.20K | 383.00 | 90.9% | 10.97x |
| Small Export (standard) | 4.20K | 2.30K | 45.3% | 1.83x |
| Small Export (detailed) | 4.20K | 4.45K | -5.9% | 0.94x |
| Medium Export (minimal) | 10.48K | 901.00 | 91.4% | 11.63x |
| Medium Export (standard) | 10.48K | 5.75K | 45.1% | 1.82x |
| Medium Export (detailed) | 10.48K | 10.91K | -4.1% | 0.96x |
| Large Export (minimal) | 20.93K | 1.76K | 91.6% | 11.87x |
| Large Export (standard) | 20.93K | 11.40K | 45.5% | 1.84x |
| Large Export (detailed) | 20.93K | 21.66K | -3.5% | 0.97x |
| Paragraph Chunking Overhead | 2.45K | 2.45K | 0.2% | 1.00x |
| Sentence Chunking Overhead | 2.45K | 2.45K | 0.2% | 1.00x |
| Fixed Chunking Overhead | 2.45K | 2.73K | -11.4% | 0.90x |
| Semantic Chunking Overhead | 2.45K | 2.44K | 0.6% | 1.01x |

Memory Usage

| Benchmark | Heap Delta | RSS |
| --- | --- | --- |
| Load 50 Task Project | 196.41 KB | 110.51 MB |
| Compress 50 Task Project | 5.33 KB | 110.51 MB |
| Health Score 50 Tasks | 17.53 KB | 110.51 MB |
| Export 50 Tasks | 149.02 KB | 110.51 MB |
| Load 100 Task Project | 309.85 KB | 110.51 MB |
| Compress 100 Task Project | 12.05 KB | 110.51 MB |
| Health Score 100 Tasks | 33.22 KB | 110.51 MB |
| Export 100 Tasks | 284.77 KB | 110.64 MB |
| Load 200 Task Project | 623.73 KB | 110.76 MB |
| Compress 200 Task Project | 17.78 KB | 110.76 MB |
| Health Score 200 Tasks | 64.34 KB | 110.76 MB |
| Export 200 Tasks | 571.87 KB | 111.14 MB |
| Load 500 Task Project | 1.50 MB | 111.76 MB |
| Compress 500 Task Project | 45.24 KB | 111.76 MB |
| Health Score 500 Tasks | 139.47 KB | 112.01 MB |
| Export 500 Tasks | 1.40 MB | 112.64 MB |

@github-actions github-actions bot added the ci label Dec 28, 2025
@github-actions
🚀 Benchmark Results

CortexFlow Benchmark Results

Last Run: 2025-12-28T17:06:01.885Z
Version: 2.0.0
Platform: linux (Node v20.19.6)

Summary

| Metric | Value |
| --- | --- |
| Total Benchmarks | 39 |
| Total Operations | 22.90K |
| Avg Ops/Second | 545.07K |
| Avg Token Savings | 42.4% |
| Avg Compression Ratio | 4.24x |
| Peak Memory | 112.79 MB |

Performance Results

Storage

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Create Project | 4.00K | 250.05μs | 304.10μs | 350.87μs |
| Read Project | 4.27K | 234.22μs | 279.14μs | 388.43μs |
| List Projects | 11.61 | 86.12ms | 99.45ms | 104.07ms |
| Update Project | 1.80K | 556.84μs | 646.43μs | 1.30ms |
| Read Large Project (200 tasks) | 1.35K | 741.74μs | 753.80μs | 3.37ms |

Intelligent

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Critical Path Analysis | 4.74K | 210.78μs | 221.46μs | 234.07μs |
| Smart Priority Queue | 4.54K | 220.07μs | 231.27μs | 295.69μs |
| Context Compression | 150.15K | 6.66μs | 14.21μs | 27.19μs |
| Health Score Calculation | 16.16K | 61.88μs | 125.99μs | 144.82μs |
| Generate Suggestions | 3.99K | 250.36μs | 261.66μs | 307.33μs |

Token Efficiency

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Compress Small Project | 739.91K | 1.35μs | 1.42μs | 1.60μs |
| Compress Medium Project | 301.91K | 3.31μs | 3.41μs | 3.97μs |
| Compress Large Project | 183.40K | 5.45μs | 5.78μs | 8.48μs |
| Compress XLarge Project | 102.63K | 9.74μs | 10.16μs | 14.48μs |

Context Handoff

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Export Small (minimal) | 266.45K | 3.75μs | 5.15μs | 23.27μs |
| Export Small (standard) | 58.22K | 17.18μs | 27.91μs | 44.46μs |
| Export Small (detailed) | 30.43K | 32.86μs | 57.30μs | 72.14μs |
| Export Medium (minimal) | 170.55K | 5.86μs | 5.31μs | 19.96μs |
| Export Medium (standard) | 37.34K | 26.78μs | 33.79μs | 72.23μs |
| Export Medium (detailed) | 18.20K | 54.96μs | 62.34μs | 214.99μs |
| Export Large (minimal) | 133.29K | 7.50μs | 7.81μs | 8.54μs |
| Export Large (standard) | 20.56K | 48.63μs | 54.57μs | 66.75μs |
| Export Large (detailed) | 5.32K | 187.83μs | 209.45μs | 523.59μs |

RAG Chunking

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Chunk Small (Paragraph) | 579.74K | 1.72μs | 2.39μs | 2.83μs |
| Chunk Medium (Paragraph) | 217.94K | 4.59μs | 6.52μs | 7.44μs |
| Chunk Large (Paragraph) | 30.34K | 32.96μs | 45.68μs | 68.03μs |
| Chunk Small (Sentence) | 519.61K | 1.92μs | 2.70μs | 3.60μs |
| Chunk Medium (Sentence) | 250.77K | 3.99μs | 4.73μs | 5.60μs |
| Chunk Large (Sentence) | 27.71K | 36.09μs | 54.65μs | 72.87μs |
| Chunk Small (Fixed) | 1.58M | 634.26ns | 651.00ns | 942.00ns |
| Chunk Medium (Fixed) | 1.86M | 536.86ns | 521.00ns | 590.00ns |
| Chunk Large (Fixed) | 101.76K | 9.83μs | 13.66μs | 28.50μs |
| Chunk Small (Semantic) | 511.44K | 1.96μs | 2.01μs | 2.96μs |
| Chunk Medium (Semantic) | 249.11K | 4.01μs | 4.29μs | 5.43μs |
| Chunk Large (Semantic) | 55.24K | 18.10μs | 30.59μs | 44.63μs |

RAG Token Estimation

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Estimate Tokens (Small) | 5.20M | 192.26ns | 331.00ns | 381.00ns |
| Estimate Tokens (Large) | 7.45M | 134.29ns | 141.00ns | 160.00ns |

RAG Chunk Processing

| Benchmark | Ops/sec | Avg | P95 | P99 |
| --- | --- | --- | --- | --- |
| Merge Small Chunks (50) | 266.26K | 3.76μs | 12.61μs | 13.71μs |
| Split Oversized Chunk | 101.33K | 9.87μs | 21.24μs | 22.84μs |

Token Efficiency

| Benchmark | Original | Compressed | Savings | Ratio |
| --- | --- | --- | --- | --- |
| Small Project Compression | 2.12K | 446.00 | 78.9% | 4.74x |
| Medium Project Compression | 10.04K | 1.62K | 83.9% | 6.21x |
| Large Project Compression | 20.07K | 2.87K | 85.7% | 7.00x |
| XLarge Project Compression | 40.39K | 5.51K | 86.4% | 7.33x |
| Small Export (minimal) | 4.20K | 383.00 | 90.9% | 10.97x |
| Small Export (standard) | 4.20K | 2.30K | 45.3% | 1.83x |
| Small Export (detailed) | 4.20K | 4.45K | -5.9% | 0.94x |
| Medium Export (minimal) | 10.48K | 901.00 | 91.4% | 11.63x |
| Medium Export (standard) | 10.48K | 5.75K | 45.1% | 1.82x |
| Medium Export (detailed) | 10.48K | 10.91K | -4.1% | 0.96x |
| Large Export (minimal) | 20.93K | 1.76K | 91.6% | 11.87x |
| Large Export (standard) | 20.93K | 11.40K | 45.5% | 1.84x |
| Large Export (detailed) | 20.93K | 21.66K | -3.5% | 0.97x |
| Paragraph Chunking Overhead | 2.45K | 2.45K | 0.2% | 1.00x |
| Sentence Chunking Overhead | 2.45K | 2.45K | 0.2% | 1.00x |
| Fixed Chunking Overhead | 2.45K | 2.73K | -11.4% | 0.90x |
| Semantic Chunking Overhead | 2.45K | 2.44K | 0.6% | 1.01x |

Memory Usage

| Benchmark | Heap Delta | RSS |
| --- | --- | --- |
| Load 50 Task Project | 161.01 KB | 110.42 MB |
| Compress 50 Task Project | 5.33 KB | 110.42 MB |
| Health Score 50 Tasks | 17.53 KB | 110.42 MB |
| Export 50 Tasks | 193.53 KB | 110.42 MB |
| Load 100 Task Project | 303.34 KB | 110.42 MB |
| Compress 100 Task Project | 9.03 KB | 110.42 MB |
| Health Score 100 Tasks | 33.22 KB | 110.42 MB |
| Export 100 Tasks | 284.77 KB | 110.54 MB |
| Load 200 Task Project | 614.92 KB | 110.67 MB |
| Compress 200 Task Project | 17.78 KB | 110.67 MB |
| Health Score 200 Tasks | 64.34 KB | 110.67 MB |
| Export 200 Tasks | 572.91 KB | 110.92 MB |
| Load 500 Task Project | 1.50 MB | 112.17 MB |
| Compress 500 Task Project | 41.87 KB | 112.17 MB |
| Health Score 500 Tasks | 139.47 KB | 112.17 MB |
| Export 500 Tasks | 1.40 MB | 112.79 MB |
