Add new submission for SWE-bench evaluation 20251009_MAYA_claude-sonn…#345
yashjakhar-2929 wants to merge 6 commits into SWE-bench:main
Hi @john-b-yang, I wanted to follow up on the submission I made last month as I haven’t received a response yet. Please let me know if you need any additional information from my side. Thanks!
Hi guys, I took a look at the technical report (https://adya.ai/blogs/maya-multi-agentic-way-build-apps). There's no mention of SWE-bench or of how you actually ran your system on it. I see your system is closed source, which is fine, but this technical report feels a bit threadbare. Can you describe how your system actually works on SWE-bench? See the technical reports from other PRs if you need a reference.
Hi @john-b-yang, thanks for the feedback! I've updated the report (https://adya.ai/blogs/maya-multi-agentic-way-build-apps) to include details on how the system was run on SWE-bench and added the missing explanations you pointed out. If there’s anything else you’d like clarified, or if more information would be helpful, just let me know. Happy to add it.
MAYA: Multi Agent Yottaframe by Adya
MAYA is a modular, multi-agent debugging system that can plug into any framework or development pipeline. It decomposes bug resolution into four coordinated roles: Classification (identify error type), Analyzer (trace root cause), Planner (design precise edit instructions), and Solver (generate clean, Git-ready patches). This structured workflow transforms debugging from a manual, opaque process into a transparent and reproducible pipeline.
Unlike monolithic debuggers, MAYA is framework-agnostic and self-healing. It produces minimal patches that preserve existing functionality while iteratively resolving errors. Every step emits auditable artifacts — from root cause summaries to unified diffs — making MAYA not just a tool for fixing bugs, but a universal debugging fabric that improves reliability, traceability, and developer velocity.
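The four-role workflow described above could be pictured roughly as follows. This is only a minimal sketch under stated assumptions: MAYA is closed source, so every name here (`DebugContext`, `Artifact`, `run_pipeline`, and the four stub functions) is hypothetical, and the stub bodies are placeholders rather than MAYA's actual logic. It shows only the shape of a sequential pipeline in which each role reads the previous role's output and emits an auditable artifact.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    """One auditable output emitted by a role (hypothetical structure)."""
    role: str
    content: str


@dataclass
class DebugContext:
    """Shared state passed from role to role (hypothetical structure)."""
    error_log: str
    artifacts: list[Artifact] = field(default_factory=list)

    def latest(self) -> str:
        # The most recent artifact, or the raw error log if nothing ran yet.
        return self.artifacts[-1].content if self.artifacts else self.error_log


# Placeholder roles: a real system would call a model or analysis tool here.
def classify(ctx: DebugContext) -> str:
    return "ImportError" if "ImportError" in ctx.error_log else "Unknown"


def analyze(ctx: DebugContext) -> str:
    return f"root cause candidate for {ctx.latest()}"


def plan(ctx: DebugContext) -> str:
    return f"edit plan addressing: {ctx.latest()}"


def solve(ctx: DebugContext) -> str:
    # Stand-in for a unified diff; not a real patch.
    return f"--- a/placeholder\n+++ b/placeholder\n# patch for: {ctx.latest()}"


PIPELINE: list[tuple[str, Callable[[DebugContext], str]]] = [
    ("classification", classify),
    ("analyzer", analyze),
    ("planner", plan),
    ("solver", solve),
]


def run_pipeline(error_log: str) -> list[Artifact]:
    """Run the four roles in order, recording one artifact per role."""
    ctx = DebugContext(error_log)
    for name, role in PIPELINE:
        ctx.artifacts.append(Artifact(name, role(ctx)))
    return ctx.artifacts


if __name__ == "__main__":
    for artifact in run_pipeline("ImportError: no module named foo"):
        print(f"[{artifact.role}] {artifact.content}")
```

Keeping every intermediate artifact in a list, rather than passing only the final patch forward, is what would make each step inspectable after the fact.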
Performance
Submission summary for 20251009_MAYA_claude-sonnet-4-5-20250929 on SWE-bench Lite split
Resolved 155 instances (51.67%)
Resolved by Repository
Resolved by Time
Details
Report
Site
Checklist