PaperCoder is a multi-agent LLM system that transforms a paper into a code repository.
It follows a three-stage pipeline: planning, analysis, and code generation, with each stage handled by specialized agents.
Our method outperforms strong baselines on both Paper2Code and PaperBench and produces faithful, high-quality implementations.
- ⚡ Quick Start
- 📚 Detailed Setup Instructions
- 📦 Paper2Code Benchmark Datasets
- 📊 Model-based Evaluation of Repositories
- 🔀 LLM Router
- Note: The following command runs the example paper (Attention Is All You Need).
First, configure your API keys by creating a .env file:
```bash
# Copy the example file and edit with your keys
cp .env.example .env

# Edit .env file with your actual API keys:
# OPENAI_API_KEY=sk-proj-your-openai-key
# ANTHROPIC_API_KEY=sk-ant-api03-your-anthropic-key
# GEMINI_API_KEY=your-gemini-key
```

- 💵 Estimated cost for using o3-mini: $0.50–$0.70
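If you drive PaperCoder from your own Python code, one way to load these keys is python-dotenv. This is a minimal sketch; whether the repo's scripts use `python-dotenv` or read the environment directly is an assumption here:

```python
# Minimal sketch: load API keys from .env (assumes `pip install python-dotenv`).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
openai_key = os.environ["OPENAI_API_KEY"]  # raises KeyError if the key is missing
```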
```bash
pip install openai
export OPENAI_API_KEY="<OPENAI_API_KEY>"
cd scripts
bash run.sh
```
- If you encounter any issues installing vLLM, please refer to the official vLLM repository.
- The default model is `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct`.

```bash
pip install vllm
cd scripts
bash run_llm.sh
```

```
outputs
├── Transformer
│   ├── analyzing_artifacts
│   ├── coding_artifacts
│   └── planning_artifacts
└── Transformer_repo   # Final output repository
```

- 💡 To use the `o3-mini` version, make sure you have the latest `openai` package installed.
- 📦 Install only what you need:
  - For OpenAI API: `openai`
  - For open-source models: `vllm`
- If you encounter any issues installing vLLM, please refer to the official vLLM repository.
- For OpenAI API:

```bash
pip install openai
```

- For open-source models:

```bash
pip install vllm
```

- Or, if you prefer, you can install all dependencies using `pip`:

```bash
pip install -r requirements.txt
```

The following process describes how to convert a paper PDF into JSON format. If you have access to the LaTeX source and plan to use it with PaperCoder, you may skip this step and proceed to 🚀 Running PaperCoder.
Note: In our experiments, we converted all paper PDFs to JSON format. The original workflow relied on the s2orc-doc2json repository; as of 2025, more capable open-source libraries exist, so we provide multiple approaches below.
We now provide a modern PDF to JSON converter that uses vision models (Gemini 2.5 Flash) instead of the legacy GROBID approach. This method is:
- 95% cheaper than traditional approaches
- Faster (no Java services required)
- More accurate for complex layouts, formulas, and tables
```bash
# Install dependencies
pip install pdf2image pytesseract aiohttp tqdm

# With Gemini API (best quality)
export GEMINI_API_KEY="your-api-key"
python codes/pdf_to_json_modern.py -i paper.pdf -o output.json

# Or use the convenience script
cd scripts
./run_modern_pdf2json.sh ../examples/Transformer.pdf
```

For more details, see the Modern PDF to JSON Documentation.
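For intuition, the core of a vision-based converter looks roughly like the sketch below: render each page to an image and ask the model for structured text. This is a simplified sketch, not the actual `pdf_to_json_modern.py`; the model name and prompt are illustrative placeholders, and it assumes `pdf2image` (with poppler) and the `google-generativeai` package:

```python
# Sketch of vision-based PDF -> JSON conversion (illustrative, not the repo's script).
import json
import os

import google.generativeai as genai
from pdf2image import convert_from_path

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-flash-preview-04-17")  # placeholder model name

PROMPT = "Transcribe this page. Preserve section headers, formulas, and tables as text."

def pdf_to_json(pdf_path: str) -> list[dict]:
    pages = convert_from_path(pdf_path, dpi=200)  # one PIL image per page
    return [
        {"page": i + 1, "content": model.generate_content([PROMPT, page]).text}
        for i, page in enumerate(pages)
    ]

if __name__ == "__main__":
    print(json.dumps(pdf_to_json("paper.pdf"), indent=2)[:500])
```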
If you prefer the traditional method, you can still use the s2orc-doc2json repository:

- Clone `s2orc-doc2json` and run its processing service:

```bash
git clone https://github.com/allenai/s2orc-doc2json.git
cd ./s2orc-doc2json/grobid-0.7.3
./gradlew run
```

- Convert the PDF into JSON format using the bundled script:
```bash
mkdir -p ./s2orc-doc2json/output_dir/paper_coder
python ./s2orc-doc2json/doc2json/grobid2json/process_pdf.py \
    -i ${PDF_PATH} \
    -t ./s2orc-doc2json/temp_dir/ \
    -o ./s2orc-doc2json/output_dir/paper_coder
```

- Install modern PDF processing libraries:

```bash
pip install PyMuPDF pdfplumber layoutparser
```

- Ensure the latest `grobid` server (v0.8 or later) is running.
- Use the script `codes/pdf_to_json_hybrid.py` to combine page-level text extraction with metadata from `grobid` and produce a single JSON file:
```bash
python codes/pdf_to_json_hybrid.py \
    --pdf_path ${PDF_PATH} \
    --output_json ./paper_coder_output/paper.json \
    --grobid_url http://localhost:8070
```

This hybrid pipeline leverages modern layout analysis tools for accurate page content while still using `grobid` for reliable metadata extraction.
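Conceptually, the hybrid approach boils down to two calls: PyMuPDF for page text and the grobid REST API for header metadata. A minimal sketch follows (not the actual `pdf_to_json_hybrid.py`; the output schema here is invented for illustration):

```python
# Sketch of hybrid extraction: PyMuPDF page text + grobid header metadata (TEI XML).
import fitz  # PyMuPDF
import requests

def hybrid_extract(pdf_path: str, grobid_url: str = "http://localhost:8070") -> dict:
    doc = fitz.open(pdf_path)
    pages = [{"page": i + 1, "text": page.get_text("text")} for i, page in enumerate(doc)]
    with open(pdf_path, "rb") as f:
        # grobid returns TEI XML containing title, authors, abstract, etc.
        tei = requests.post(f"{grobid_url}/api/processHeaderDocument", files={"input": f}).text
    return {"metadata_tei": tei, "pages": pages}
```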
- Install lightweight dependencies:

```bash
pip install PyMuPDF pdf2image pytesseract camelot-py
```

- Run the script `codes/pdf_to_json_simple.py`:

```bash
python codes/pdf_to_json_simple.py \
    --pdf_path ${PDF_PATH} \
    --output_json ./paper_coder_output/paper.json
```

This method relies solely on PyMuPDF and OCR, optionally using camelot to extract tables.
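The idea in miniature: read the embedded text layer with PyMuPDF and fall back to OCR only for pages that have none. A sketch under those assumptions, not the actual `pdf_to_json_simple.py`:

```python
# Sketch: text-layer extraction with an OCR fallback for scanned pages.
# Assumes PyMuPDF, pdf2image (poppler), and pytesseract (tesseract installed).
import fitz  # PyMuPDF
import pytesseract
from pdf2image import convert_from_path

def extract_pages(pdf_path: str) -> list[dict]:
    doc = fitz.open(pdf_path)
    out = []
    for i, page in enumerate(doc):
        text = page.get_text("text").strip()
        if not text:  # no text layer: rasterize this page and OCR it
            image = convert_from_path(pdf_path, first_page=i + 1, last_page=i + 1)[0]
            text = pytesseract.image_to_string(image)
        out.append({"page": i + 1, "text": text})
    return out
```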
- Note: The following command runs the example paper (Attention Is All You Need). If you want to run PaperCoder on your own paper, please modify the environment variables accordingly.
- 💵 Estimated cost for using o3-mini: $0.50–$0.70
```bash
# Using the PDF-based JSON format of the paper
export OPENAI_API_KEY="<OPENAI_API_KEY>"
cd scripts
bash run.sh
```

```bash
# Using the LaTeX source of the paper
export OPENAI_API_KEY="<OPENAI_API_KEY>"
cd scripts
bash run_latex.sh
```

- The default model is `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct`.
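For reference, serving this model with vLLM's offline API takes only a few lines. This is a sketch; `run_llm.sh` may configure vLLM differently:

```python
# Sketch: offline inference with vLLM and the default model.
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct", trust_remote_code=True)
params = SamplingParams(temperature=0.2, max_tokens=512)
outputs = llm.generate(["Implement scaled dot-product attention in PyTorch."], params)
print(outputs[0].outputs[0].text)
```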
```bash
# Using the PDF-based JSON format of the paper
cd scripts
bash run_llm.sh
```

```bash
# Using the LaTeX source of the paper
cd scripts
bash run_latex_llm.sh
```

- Hugging Face dataset: paper2code
- You can find the description of the Paper2Code benchmark dataset in data/paper2code.
- For more details, refer to Section 4.1 "Paper2Code Benchmark" in the paper.
We've extended the original Paper2Code pipeline with advanced image analysis capabilities, using o4-mini-2025-04-16 for image processing and o3-2025-04-16 for code generation.
- Copy and Setup PDF

```bash
# Copy your paper to the working directory
cp /path/to/your/paper.pdf ./custom_paper/paper.pdf
```

- Start GROBID in a separate terminal

```bash
cd $HOME/grobid-0.7.3 && ./gradlew run
```

GROBID is required for extracting structured text from scientific PDFs.
- Convert PDF to JSON using GROBID

```bash
python s2orc-doc2json/doc2json/grobid2json/process_pdf.py -i "custom_paper/paper.pdf" -t custom_paper/temp_dir/ -o custom_paper/
```

This transforms the PDF into structured JSON with sections, paragraphs, and references.
- Preprocess JSON

```bash
python codes/0_pdf_process.py --input_json_path custom_paper/paper.json --output_json_path custom_paper/paper_cleaned.json
```

Cleans and enhances the JSON for better analysis.
- Extract and Analyze Images with o4-mini-2025-04-16

```bash
python codes/extract_figures.py --pdf_path custom_paper/paper.pdf --json_path custom_paper/paper_cleaned.json --output_dir custom_paper --gpt_version o4-mini-2025-04-16
```

This step:
- Extracts all images from the PDF
- Uses o4-mini-2025-04-16 to create detailed descriptions of each image (see the sketch below)
- Adds these descriptions to the JSON, creating enhanced_paper.json
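The figure-description step, in miniature: pull an image out of the PDF with PyMuPDF and send it to the vision model as a data URL. This is a sketch, not `extract_figures.py` itself; the prompt is illustrative:

```python
# Sketch: describe one extracted figure with a vision model (illustrative).
import base64

import fitz  # PyMuPDF
from openai import OpenAI

client = OpenAI()

def describe_first_figure(pdf_path: str, gpt_version: str = "o4-mini-2025-04-16") -> str:
    doc = fitz.open(pdf_path)
    xref = doc.get_page_images(0)[0][0]   # xref of the first image on page 1
    info = doc.extract_image(xref)        # {"image": bytes, "ext": "png"/"jpeg", ...}
    b64 = base64.b64encode(info["image"]).decode()
    resp = client.chat.completions.create(
        model=gpt_version,
        messages=[{"role": "user", "content": [
            {"type": "text", "text": "Describe this figure in enough detail to re-implement it."},
            {"type": "image_url", "image_url": {"url": f"data:image/{info['ext']};base64,{b64}"}},
        ]}],
    )
    return resp.choices[0].message.content
```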
- Planning with o3-2025-04-16

```bash
python codes/1_planning.py --paper_name YourPaperName --gpt_version o3-2025-04-16 --pdf_json_path custom_paper/enhanced_paper.json --output_dir outputs/YourPaperName_enhanced
```

Creates a detailed implementation plan using the enriched JSON with image descriptions.
- Configuration Extraction

```bash
python codes/1.1_extract_config.py --paper_name YourPaperName --output_dir outputs/YourPaperName_enhanced
```

Extracts configuration parameters from the plan for use in subsequent steps.
- Analysis with o3-2025-04-16

```bash
python codes/2_analyzing.py --paper_name YourPaperName --gpt_version o3-2025-04-16 --pdf_json_path custom_paper/enhanced_paper.json --output_dir outputs/YourPaperName_enhanced
```

Performs detailed analysis of system components, creating logical schemas for each module.
- Code Generation with o3-2025-04-16

```bash
python codes/3_coding.py --paper_name YourPaperName --gpt_version o3-2025-04-16 --pdf_json_path custom_paper/enhanced_paper.json --output_dir outputs/YourPaperName_enhanced --output_repo_dir outputs/YourPaperName_repo_enhanced
```

Generates the actual code implementing all system components based on the planning and analysis results.
For convenience, you can use the enhanced script:

```bash
./scripts/run_custom_enhanced.sh
```

This script runs the entire pipeline with the appropriate configuration.
The enhanced pipeline uses:
- o4-mini-2025-04-16 for image analysis
- o3-2025-04-16 for planning, analysis, and code generation

Prompt caching:
- Static content (text + image descriptions) is placed at the beginning of each prompt, so consecutive calls share a cacheable prefix (see the sketch below)
- Token caching between consecutive API calls
- Cost reduction of approximately 50% for cached content
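Prompt caching on the OpenAI side is prefix-based: requests that repeat a sufficiently long identical prefix are billed at the cached-input rate. The sketch below shows only the ordering idea; the message contents and helper are illustrative, not the repo's actual prompts:

```python
# Sketch: keep the large static context first so consecutive stage calls
# share an identical, cacheable prompt prefix (illustrative).
from openai import OpenAI

client = OpenAI()
STATIC_CONTEXT = open("custom_paper/enhanced_paper.json").read()  # paper text + image descriptions

def run_stage(stage_instruction: str) -> str:
    resp = client.chat.completions.create(
        model="o3-2025-04-16",
        messages=[
            # Identical across planning/analysis/coding calls -> cacheable prefix
            {"role": "system", "content": "You implement research papers as code."},
            {"role": "user", "content": STATIC_CONTEXT},
            # Only this suffix changes between stages
            {"role": "user", "content": stage_instruction},
        ],
    )
    return resp.choices[0].message.content
```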
Image analysis:
- Automatic extraction of all figures from the PDF
- Image analysis using o4-mini-2025-04-16
- Integration of descriptions into the JSON for use by o3-2025-04-16

Staged pipeline:
- Logical division into stages: planning, analysis, coding
- Saving of intermediate results
- Ability to restart individual stages

Outputs:
- Structured implementation of the entire system
- Complete reproduction of the paper methodology
- Ready-to-use code in `output_repo_dir`
- We evaluate repository quality using a model-based approach, supporting both reference-based and reference-free settings. The model critiques key implementation components, assigns severity levels, and generates a 1–5 correctness score averaged over 8 samples using o3-mini-high.
- For more details, please refer to Section 4.3.1 (Paper2Code Benchmark) of the paper.
- Note: The following examples evaluate the sample repository (Transformer_repo). Please modify the relevant paths and arguments if you wish to evaluate a different repository.
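The scoring idea in miniature: sample several independent critiques, parse a score from each, and average. This is a simplified sketch; `eval.py`'s rubric, prompts, and parsing are more involved:

```python
# Sketch: average a 1-5 correctness score over n sampled model critiques.
import re

from openai import OpenAI

client = OpenAI()

def judge(paper_text: str, repo_summary: str, n: int = 8) -> float:
    prompt = (f"Paper:\n{paper_text}\n\nRepository:\n{repo_summary}\n\n"
              "Critique the key implementation components, note the severity of issues, "
              "and end with a line 'Score: <1-5>'.")
    scores = []
    for _ in range(n):  # n independent samples, as with --generated_n 8
        resp = client.chat.completions.create(
            model="o3-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        match = re.search(r"Score:\s*([1-5])", resp.choices[0].message.content)
        if match:  # only well-formed critiques count toward "Valid: k/n"
            scores.append(int(match.group(1)))
    return sum(scores) / len(scores)
```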
```bash
pip install tiktoken
export OPENAI_API_KEY="<OPENAI_API_KEY>"
```

- `target_repo_dir` is the generated repository.

```bash
cd codes/
python eval.py \
    --paper_name Transformer \
    --pdf_json_path ../examples/Transformer_cleaned.json \
    --data_dir ../data \
    --output_dir ../outputs/Transformer \
    --target_repo_dir ../outputs/Transformer_repo \
    --eval_result_dir ../results \
    --eval_type ref_free \
    --generated_n 8 \
    --papercoder
```

- `target_repo_dir` is the generated repository.
- `gold_repo_dir` should point to the official repository (e.g., author-released code).
```bash
cd codes/
python eval.py \
    --paper_name Transformer \
    --pdf_json_path ../examples/Transformer_cleaned.json \
    --data_dir ../data \
    --output_dir ../outputs/Transformer \
    --target_repo_dir ../outputs/Transformer_repo \
    --gold_repo_dir ../examples/Transformer_gold_repo \
    --eval_result_dir ../results \
    --eval_type ref_based \
    --generated_n 8 \
    --papercoder
```

Example output:

```
========================================
🌟 Evaluation Summary 🌟
📄 Paper name: Transformer
🧪 Evaluation type: ref_based
📁 Target repo directory: ../outputs/Transformer_repo
📊 Evaluation result:
    📈 Score: 4.5000
    ✅ Valid: 8/8
========================================
🌟 Usage Summary 🌟
[Evaluation] Transformer - ref_based
🛠️ Model: o3-mini
📥 Input tokens: 44318 (Cost: $0.04874980)
📦 Cached input tokens: 0 (Cost: $0.00000000)
📤 Output tokens: 26310 (Cost: $0.11576400)
💵 Current total cost: $0.16451380
🪙 Accumulated total cost so far: $0.16451380
============================================
```

The router configuration lives in `llm_router/config.yaml`.
| Task Pattern | Primary Model | Fallback |
|---|---|---|
| `chat\|faq\|rag` | `gemini_flash_25` | `claude_sonnet_35` |
| `code\|unit_tests` | `claude_sonnet_37` | `o4mini` |
| `long_doc>300k` | `gpt41` | `claude_sonnet_35` |
| `tool_reasoning` | `o4mini` | `gemini_flash_25` |
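A minimal sketch of how this kind of pattern-to-model routing with fallback can work (the YAML key names and the `complete` helper below are assumptions for illustration, not the actual `llm_router` internals):

```python
# Sketch: route a task to a primary model, falling back on failure (illustrative).
import os
import re

import yaml

CFG_PATH = os.environ.get("LLM_CFG", "llm_router/config.yaml")

def complete(model_alias: str, prompt: str) -> str:
    # Hypothetical stub: dispatch to the provider backing this alias.
    raise NotImplementedError(model_alias)

def route(task: str, prompt: str) -> str:
    routes = yaml.safe_load(open(CFG_PATH))["routes"]  # key name is an assumption
    for pattern, models in routes.items():
        if re.fullmatch(pattern, task):  # e.g. "chat|faq|rag" matches "faq"
            for model in (models["primary"], models["fallback"]):
                try:
                    return complete(model, prompt)
                except Exception:
                    continue  # primary failed; try the fallback
    raise ValueError(f"no route succeeded for task {task!r}")
```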
Override the config by setting `LLM_CFG`:

```bash
export LLM_CFG=/path/to/custom.yaml
```

The following prices were collected from official documentation in May 2025. All values are shown per million tokens.
- o4-mini-2025-04-16: Input $1.10, Output $4.40. Fast, cost-efficient reasoning with multimodal support.
- gpt-4.1-2025-04-14: Input $2.00, Output $8.00. Improved coding and instruction following with a 1M token context window.
- o3-2025-04-16: Input $10.00 (cached input $2.50), Output $40.00. OpenAI's most powerful reasoning model with a 200K token context window.
- Gemini 2.5 Flash (preview):
  - Input: Text/Image/Video $0.15, Audio $1.00
  - Output: Non-thinking mode $0.60, Thinking mode $3.50
  - First Flash model with thinking capabilities (preview).
- Gemini 2.5 Pro (preview):
  - Input: ≤ 200k tokens $1.25, > 200k tokens $2.50
  - Output: ≤ 200k tokens $10.00, > 200k tokens $15.00
  - Most advanced Gemini reasoning model with a 1M token context window.
Prices may change as these models move from preview to general availability. Consult the respective provider pages for the latest information.
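With per-million-token prices, cost estimates are one multiplication away. As a check, the sketch below reproduces the o3-mini evaluation cost from the usage summary shown earlier (assuming o3-mini at $1.10 input / $4.40 output per million tokens):

```python
# Sketch: estimate API cost from per-million-token prices (o3-mini rates assumed).
PRICES = {"o3-mini": {"input": 1.10, "output": 4.40}}  # USD per 1M tokens

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Reproduces the evaluation usage summary above:
print(f"${cost('o3-mini', 44318, 26310):.8f}")  # -> $0.16451380
```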