Add new submission for SWE-bench evaluation 20251009_MAYA_claude-sonn… by yashjakhar-2929 · Pull Request #345 · SWE-bench/experiments

yashjakhar-2929 · 2025-10-09T05:58:30Z

MAYA: Multi Agent Yottaframe by Adya

MAYA is a modular, multi-agent debugging system that can plug into any framework or development pipeline. It decomposes bug resolution into four coordinated roles: Classification (identify error type), Analyzer (trace root cause), Planner (design precise edit instructions), and Solver (generate clean, Git-ready patches). This structured workflow transforms debugging from a manual, opaque process into a transparent and reproducible pipeline.

Unlike monolithic debuggers, MAYA is framework-agnostic and self-healing. It produces minimal patches that preserve existing functionality while iteratively resolving errors. Every step emits auditable artifacts — from root cause summaries to unified diffs — making MAYA not just a tool for fixing bugs, but a universal debugging fabric that improves reliability, traceability, and developer velocity.
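The four-role workflow described above can be pictured as a staged pipeline in which each role reads shared state and records an artifact. The following is a minimal sketch only, not MAYA's implementation (the system is closed source). Every name in it (`BugContext`, `classify`, `analyze`, `plan_edits`, `solve`, `run_pipeline`) is hypothetical, and the stage bodies are placeholder stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BugContext:
    """Shared state handed from stage to stage (hypothetical structure)."""
    error_log: str
    error_type: str = ""
    root_cause: str = ""
    edit_plan: List[str] = field(default_factory=list)
    patch: str = ""
    artifacts: List[str] = field(default_factory=list)  # audit trail


def classify(ctx: BugContext) -> BugContext:
    # Stub: read the exception name from the last line of the log.
    last_line = ctx.error_log.strip().splitlines()[-1]
    ctx.error_type = last_line.split(":", 1)[0]
    ctx.artifacts.append(f"classification: {ctx.error_type}")
    return ctx


def analyze(ctx: BugContext) -> BugContext:
    # Stub: a real analyzer would trace the failure through the code.
    ctx.root_cause = f"{ctx.error_type} raised; inspect the traceback"
    ctx.artifacts.append(f"root cause: {ctx.root_cause}")
    return ctx


def plan_edits(ctx: BugContext) -> BugContext:
    # Stub: a real planner would emit precise edit instructions.
    ctx.edit_plan = [f"guard against {ctx.error_type}"]
    ctx.artifacts.append(f"plan: {ctx.edit_plan}")
    return ctx


def solve(ctx: BugContext) -> BugContext:
    # Stub: stands in for a unified diff produced by the solver.
    ctx.patch = "\n".join(f"# TODO: {step}" for step in ctx.edit_plan)
    ctx.artifacts.append("patch: generated")
    return ctx


STAGES: List[Callable[[BugContext], BugContext]] = [
    classify, analyze, plan_edits, solve,
]


def run_pipeline(error_log: str) -> BugContext:
    """Run the four stages in order, accumulating artifacts."""
    ctx = BugContext(error_log=error_log)
    for stage in STAGES:
        ctx = stage(ctx)
    return ctx
```

Calling `run_pipeline` on a traceback string returns a context whose `artifacts` list holds one entry per stage, which is one simple way to get the kind of audit trail the description mentions.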

Performance

Submission summary for 20251009_MAYA_claude-sonnet-4-5-20250929 on SWE-bench lite split

Resolved 155 instances (51.67%)

Resolved by Repository

astropy/astropy: 3/6 (50.0%)
django/django: 44/114 (38.6%)
matplotlib/matplotlib: 9/23 (39.13%)
mwaskom/seaborn: 1/4 (25.0%)
pallets/flask: 0/3 (0.0%)
psf/requests: 0/6 (0.0%)
pydata/xarray: 0/5 (0.0%)
pylint-dev/pylint: 3/6 (50.0%)
pytest-dev/pytest: 17/17 (100.0%)
scikit-learn/scikit-learn: 20/23 (86.96%)
sphinx-doc/sphinx: 11/16 (68.75%)
sympy/sympy: 47/77 (61.04%)
==================================================
Resolved by Time
2012: 0/1 (0.0%)
2014: 0/3 (0.0%)
2015: 0/1 (0.0%)
2016: 2/4 (50.0%)
2017: 11/16 (68.75%)
2018: 12/21 (57.14%)
2019: 37/59 (62.71%)
2020: 35/66 (53.03%)
2021: 18/42 (42.86%)
2022: 24/57 (42.11%)
2023: 16/30 (53.33%)
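
As a quick sanity check, the per-repository and per-year tallies above can be summed to confirm that both breakdowns agree with the headline figure. The numbers below are copied from the report; the variable and function names are illustrative only.

```python
# (resolved, total) pairs copied from "Resolved by Repository".
by_repo = {
    "astropy/astropy": (3, 6),
    "django/django": (44, 114),
    "matplotlib/matplotlib": (9, 23),
    "mwaskom/seaborn": (1, 4),
    "pallets/flask": (0, 3),
    "psf/requests": (0, 6),
    "pydata/xarray": (0, 5),
    "pylint-dev/pylint": (3, 6),
    "pytest-dev/pytest": (17, 17),
    "scikit-learn/scikit-learn": (20, 23),
    "sphinx-doc/sphinx": (11, 16),
    "sympy/sympy": (47, 77),
}

# (resolved, total) pairs copied from "Resolved by Time".
by_year = {
    2012: (0, 1), 2014: (0, 3), 2015: (0, 1), 2016: (2, 4),
    2017: (11, 16), 2018: (12, 21), 2019: (37, 59), 2020: (35, 66),
    2021: (18, 42), 2022: (24, 57), 2023: (16, 30),
}


def totals(table):
    """Sum resolved and total counts across all rows of a breakdown."""
    resolved = sum(r for r, _ in table.values())
    total = sum(n for _, n in table.values())
    return resolved, total


repo_totals = totals(by_repo)
year_totals = totals(by_year)
assert repo_totals == year_totals  # both breakdowns cover the same instances

resolved, total = repo_totals
rate = round(100 * resolved / total, 2)
print(f"Resolved {resolved}/{total} ({rate:.2f}%)")
# prints: Resolved 155/300 (51.67%)
```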

Details

Report
Site

Checklist

Is a pass@1 submission (does not attempt the same task instance more than once)
Does not use SWE-bench test knowledge (PASS_TO_PASS, FAIL_TO_PASS)
Does not use the hints field in SWE-bench
Does not have web-browsing OR has taken steps to prevent lookup of SWE-bench solutions via web-browsing

…et-4-5-20250929

yashjakhar-2929 · 2025-11-10T05:24:03Z

Hi @john-b-yang, I wanted to follow up on the submission I made last month as I haven’t received a response yet. Please let me know if you need any additional information from my side. Thanks!

john-b-yang · 2025-11-18T19:07:14Z

Hi guys, I took a look at the technical report - https://adya.ai/blogs/maya-multi-agentic-way-build-apps

There's no mention of SWE-bench or how you actually ran your system on SWE-bench? I see your system is closed source, which is fine, but this technical report feels a bit threadbare. Can you describe how your system actually works on SWE-bench? See other technical reports from other PRs if you need a reference.

yashjakhar-2929 · 2025-11-20T08:46:32Z

Hi @john-b-yang, thanks for the feedback! I've updated the report (https://adya.ai/blogs/maya-multi-agentic-way-build-apps) to include details on how the system was run on SWE-bench and added the missing explanations you pointed out. If there’s anything else you’d like clarified, or if more information would be helpful, just let me know; happy to add it.

Ubuntu added 3 commits October 9, 2025 05:47

Add new submission for SWE-bench evaluation 20251009_MAYA_claude-sonn…

5f1e10d

…et-4-5-20250929

added trajs

ac3bf20

added logs

95cfa80

john-b-yang added 3 commits November 18, 2025 18:54

Remove logs and trajs (Uploaded to shared s3 bucket)

b8d9909

Update metadata with s3 paths

c8fb94d

Undo .gitignore change

28fbe02

john-b-yang added the "invalid" label (This doesn't seem right) Nov 18, 2025

yashjakhar-2929 wants to merge 6 commits into SWE-bench:main from yashjakhar-2929:20251009_MAYA_claude-sonnet-4-5-20250929


2 participants
