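Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning. A paper-to-code tool aimed at papers whose code has not been released. It is a multi-agent system that automatically breaks down and interprets machine learning papers, then generates a complete, directly runnable code repository.
  • Planning stage: PaperCoder first draws up a detailed development roadmap, sketches the system architecture diagram, works out the dependencies between modules, and generates a standard configuration file (config.yaml).
  • Analyzing stage: each file to be generated is then analyzed in depth: what functionality it must implement, what its inputs and outputs are, and how it couples with other modules.
  • Coding stage: finally, the system generates each module's code sequentially, following the earlier plan and analysis.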

Xiaotangguo1234/Paper2Code

 
 

📄 Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

[Figure: PaperCoder overview]

📄 Read the paper on arXiv

PaperCoder is a multi-agent LLM system that transforms a paper into a code repository. It follows a three-stage pipeline of planning, analysis, and code generation, with each stage handled by specialized agents.
Our method outperforms strong baselines on both Paper2Code and PaperBench and produces faithful, high-quality implementations.
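
To make the three-stage flow concrete, here is a minimal Python sketch. It is not PaperCoder's actual implementation: the function names, the `Plan` dataclass, the prompts, and the `LLM` callable type are all hypothetical, chosen only to show how planning output can feed analysis, and how both can feed sequential code generation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interface: a function that takes a prompt and returns text.
LLM = Callable[[str], str]


@dataclass
class Plan:
    """Hypothetical planning artifact: ordered file list plus a config."""
    files: List[str]
    config_yaml: str


def plan_stage(paper: str, llm: LLM) -> Plan:
    # Ask for the files to generate, one path per line, in dependency order.
    listing = llm(f"List the files needed to implement this paper:\n{paper}")
    files = [line.strip() for line in listing.splitlines() if line.strip()]
    config = llm(f"Write a config.yaml for this paper:\n{paper}")
    return Plan(files=files, config_yaml=config)


def analyze_stage(paper: str, plan: Plan, llm: LLM) -> Dict[str, str]:
    # One analysis per planned file: purpose, inputs/outputs, dependencies.
    return {
        name: llm(f"Analyze the role of {name} for this paper:\n{paper}")
        for name in plan.files
    }


def code_stage(plan: Plan, analyses: Dict[str, str], llm: LLM) -> Dict[str, str]:
    # Generate files one at a time so later prompts can list earlier files.
    repo: Dict[str, str] = {"config.yaml": plan.config_yaml}
    for name in plan.files:
        existing = ", ".join(repo)  # names of files written so far
        repo[name] = llm(
            f"Write {name}.\nAnalysis: {analyses[name]}\nExisting files: {existing}"
        )
    return repo


def run_pipeline(paper: str, llm: LLM) -> Dict[str, str]:
    plan = plan_stage(paper, llm)
    analyses = analyze_stage(paper, plan, llm)
    return code_stage(plan, analyses, llm)
```

Passing the model in as a plain callable keeps the sketch runnable without an API key: any function that maps a prompt string to a response string can stand in for the LLM during testing.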


⚡ QuickStart

  • Note: The following commands run PaperCoder on the example paper (Attention Is All You Need).
  • 💵 Estimated cost for using o3-mini: $0.50–$0.70
pip install openai

export OPENAI_API_KEY="<OPENAI_API_KEY>"

cd scripts
bash run.sh

Output Folder Structure (Only Important Files)

outputs
├── Transformer
│   ├── analyzing_artifacts
│   ├── coding_artifacts
│   └── planning_artifacts
└── Transformer_repo  # Final output repository
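
After a run, a short script can confirm that the folders shown above exist. This is a sketch only: the helper name `missing_outputs` is hypothetical, and it assumes the layout listed above, with the paper name (`Transformer` in the example) as the only variable part.

```python
from pathlib import Path
from typing import List

# Directory names taken from the folder structure listed above.
ARTIFACT_DIRS = ("planning_artifacts", "analyzing_artifacts", "coding_artifacts")


def missing_outputs(outputs_root: str, paper_name: str) -> List[str]:
    """Return the expected output directories that do not exist yet."""
    root = Path(outputs_root)
    expected = [root / paper_name / d for d in ARTIFACT_DIRS]
    expected.append(root / f"{paper_name}_repo")  # final output repository
    return [str(p.relative_to(root)) for p in expected if not p.is_dir()]


if __name__ == "__main__":
    for path in missing_outputs("outputs", "Transformer"):
        print(f"missing: {path}")
```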

📚 Detailed Setup Instructions

🛠️ Environment Setup

  • Note: If you wish to use the o3-mini version, please make sure to install the latest version of the OpenAI package.
pip install openai

📄 Convert PDF to JSON

  1. Clone the s2orc-doc2json repository to convert your PDF file into a structured JSON format.
    (For detailed configuration, please refer to the official repository.)
git clone https://github.com/allenai/s2orc-doc2json.git
  2. Run the PDF processing service.
cd ./s2orc-doc2json/grobid-0.7.3
./gradlew run
  3. Convert your PDF into JSON format.
mkdir -p ./s2orc-doc2json/output_dir/paper_coder
python ./s2orc-doc2json/doc2json/grobid2json/process_pdf.py \
    -i "${PDF_PATH}" \
    -t ./s2orc-doc2json/temp_dir/ \
    -o ./s2orc-doc2json/output_dir/paper_coder
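
Before passing the converted file to PaperCoder, it can help to check that it parses as JSON and to see which top-level fields it contains. The sketch below assumes nothing about the s2orc-doc2json schema beyond a JSON object at the top level. The helper name `summarize_json` is hypothetical, and you supply the path to the file the converter actually wrote.

```python
import json
import sys
from typing import Dict


def summarize_json(path: str) -> Dict[str, str]:
    """Map each top-level key in the JSON file to the type name of its value."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_json.py <path to converted JSON>
    for key, kind in summarize_json(sys.argv[1]).items():
        print(f"{key}: {kind}")
```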

🚀 Running PaperCoder

  • Note: The following commands run PaperCoder on the example paper (Attention Is All You Need).
    To run PaperCoder on your own paper, modify the environment variables accordingly.
export OPENAI_API_KEY="<OPENAI_API_KEY>"

cd scripts
bash run.sh
