AweAI-Team/ScaleSWE

Immersion in the GitHub Universe: Scaling Coding Agents to Mastery

🔥 Highlights

  • Sourced from 6M+ pull requests across 23,000+ repositories.
  • Covers 5,200 repositories.
  • 100k high-quality instances.
  • 71k trajectories from DeepSeek v3.2, totaling 3.5B tokens.
  • Strong performance: 64% on SWE-bench Verified with a model trained from Qwen3-30B-A3B-Instruct.

📣 News

  • 2026-03-03 We released AweAgent, which now provides native support for Scale-SWE data.
  • 2026-02-26 🚀 We released a portion of our data on Hugging Face. This release includes 20,000 SWE task instances (currently the largest real, executable open-source SWE dataset available) alongside 71k distillation trajectories (3.5B tokens) from DeepSeek v3.2. Much more data will be released in the future.
  • 2026-02-10 📝 Our paper "Immersion in the GitHub Universe: Scaling Coding Agents to Mastery" is now available on arXiv.

FAQ

  • For evaluation of Scale-SWE-Data, you can use AweAgent and refer to this evaluation script.

📊 Data Format

| Field | Description |
| --- | --- |
| `instance_id` | A unique identifier formatted as `{user}_{repo}_pr{id}`. |
| `user` | The owner of the GitHub repository. |
| `repo` | The name of the GitHub repository. |
| `language` | The programming language of the codebase (currently Python). |
| `workdir` | The working directory path within the environment. |
| `image_url` | The URL of the pre-built Docker image for the task. |
| `patch` | The ground-truth patch (Golden Patch) from the corresponding pull request. |
| `pr_commit` | The commit hash of the pull request. |
| `parent_commit` | The commit hash of the parent commit (base state). |
| `problem_statement` | The issue description conveying the bug, provided to the model as input. |
| `f2p_patch` | The developer-written test patch containing tests that fail before the fix (if available). Apply this patch for evaluation. See this script. |
| `f2p_script` | The synthetic reproduction script generated by our unit-test creator agent. Many high-quality pull requests have no author-written F2P tests, so we synthesize them. Just before evaluation, save this script as `test_fail_to_pass.py` directly under the repository directory. See this script. |
| `FAIL_TO_PASS` | Unit tests that fail on the buggy version but pass after the fix. |
| `PASS_TO_PASS` | Unit tests that pass in both versions (regression tests). |
| `github_url` | The URL of the original GitHub repository. |
| `pre_commands` | Commands that must be executed immediately upon entering the container to check out the correct commit. |

🤖 Results

We fine-tuned Qwen3-30B-A3B-Instruct on our synthesized trajectories.

Scale-SWE-Agent

Please use AweAgent to run inference with Scale-SWE-Agent. The Scale-SWE-Agent model weights are available on Hugging Face. Key parameters are listed below:

| Parameter | Value |
| --- | --- |
| Max turns | 200 |
| Max sequence length | 256k |
| Temperature | 1 |
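
As a rough illustration of how these limits might be applied in an agent loop, the sketch below collects them in a small settings object. It is not AweAgent's configuration format: `InferenceSettings` and `should_stop` are hypothetical names, and reading "256k" as 256 × 1024 tokens is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical container for the parameters listed above."""

    max_turns: int = 200
    # Assumption: "256k" is read as 256 * 1024 = 262,144 tokens.
    max_sequence_length: int = 256 * 1024
    temperature: float = 1.0


def should_stop(settings: InferenceSettings, turns_taken: int, tokens_used: int) -> bool:
    """Stop once either the turn budget or the sequence-length budget is used up."""
    return (
        turns_taken >= settings.max_turns
        or tokens_used >= settings.max_sequence_length
    )
```

A rollout loop would call `should_stop` after each turn and end the trajectory as soon as it returns `True`.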

📖 Citation

If you find this project useful for your research, please consider citing our paper:

@misc{zhao2026immersiongithubuniversescaling,
      title={Immersion in the GitHub Universe: Scaling Coding Agents to Mastery}, 
      author={Jiale Zhao and Guoxin Chen and Fanzhe Meng and Minghao Li and Jie Chen and Hui Xu and Yongshuai Sun and Xin Zhao and Ruihua Song and Yuan Zhang and Peng Wang and Cheng Chen and Jirong Wen and Kai Jia},
      year={2026},
      eprint={2602.09892},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2602.09892}, 
}

📄 License

This project is licensed under the CC BY 4.0 License - see the LICENSE file for details.
