Adam Zweiger, Jyothish Pari, Han Guo, Ekin Akyürek, Yoon Kim, Pulkit Agrawal
MIT CSAIL
SEAL (Self-Adapting LLMs) is a framework for training language models via RL to generate self-edits (finetuning data and other update directives for themselves) in response to new inputs.
We explore SEAL in two domains:
- `knowledge-incorporation/`: Incorporating new factual knowledge
- `few-shot/`: Adapting to new tasks from few-shot examples
Both folders include code, data, and documentation.
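At a high level, each RL round has the model propose self-edits for a context, applies each candidate as a quick finetune of a copy of the model, scores the adapted copy on the downstream task, and then reinforces the self-edits that helped (the paper uses a ReST-EM-style filtered-behavior-cloning update). The sketch below is purely illustrative, with toy stubs in place of the real components:

```python
# Illustrative sketch of one SEAL outer-loop round (ReST-EM-style filtering).
# Every function body is a toy stub; the real pipeline lives in the
# knowledge-incorporation/ and few-shot/ folders.
import random

def sample_self_edits(context: str, n: int) -> list[str]:
    # Stub: the real model generates finetuning data / update directives.
    return [f"self-edit {i} for: {context}" for i in range(n)]

def score_after_adaptation(self_edit: str) -> float:
    # Stub: finetune a copy of the model on the self-edit (e.g., via LoRA),
    # then evaluate the adapted copy on the downstream task.
    return random.random()

def seal_round(contexts: list[str], n_samples: int = 5, baseline: float = 0.5):
    kept = []  # (context, self-edit) pairs whose adapted model improved
    for ctx in contexts:
        edits = sample_self_edits(ctx, n_samples)
        scored = [(score_after_adaptation(e), e) for e in edits]
        best_score, best_edit = max(scored)
        if best_score > baseline:  # reward: did adaptation beat the baseline?
            kept.append((ctx, best_edit))
    # Finetuning the base model on `kept` reinforces the kinds of
    # self-edits that led to successful adaptation.
    return kept

if __name__ == "__main__":
    print(seal_round(["a news article to memorize", "a few-shot ARC task"]))
```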
## Setup

```bash
git clone https://github.com/Continual-Intelligence/SEAL.git
cd SEAL
```

Using conda:

```bash
conda create -n seal_env python=3.12
conda activate seal_env
```

Using venv:

```bash
python3.12 -m venv seal_env
source seal_env/bin/activate
```

Then install dependencies:

```bash
pip install -r requirements.txt
```

Create a `.env` file in the project root and add your OpenAI API key:

```
OPENAI_API_KEY=your_openai_api_key_here
```

Before running any shell scripts, make sure to update the SLURM directives at the top of each `.sh` file to match your system configuration. All experiments can be run with 2 A100/H100 GPUs. Other setups may require refactoring and/or changing model sizes.
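Scripts that call the OpenAI API read this key at runtime. As a quick sanity check that the key is picked up, something like the following works, assuming `python-dotenv` is installed (whether the repo's scripts load the key exactly this way is an assumption):

```python
# Sanity check: load OPENAI_API_KEY from .env and construct a client.
# Assumes python-dotenv; the repo's own loading mechanism may differ.
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # reads .env in the current directory into os.environ
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
print("Key loaded:", bool(os.environ.get("OPENAI_API_KEY")))
```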
## Running Locally on Apple Silicon

If you are using a MacBook Pro with Apple Silicon (M1/M2), you likely do not have access to A100/H100 GPUs or a SLURM-managed cluster. To run experiments locally:
- **Edit Shell Scripts:**
  - Remove or comment out any SLURM directives (lines starting with `#SBATCH`) in the `.sh` files.
  - Run scripts directly in your terminal instead of submitting them to SLURM.
- **Adjust Model and Batch Size:**
  - Use smaller models and reduce batch sizes to fit within your available memory and CPU/GPU resources.
- **Apple Silicon Compatibility:**
  - Ensure your Python environment and ML libraries (e.g., PyTorch, TensorFlow) are installed with Apple Silicon (arm64) support for best performance; a quick device check is sketched after this list.
  - Some models or dependencies may not be fully supported on M1/M2; check library documentation for compatibility.
- **Performance:**
  - Expect slower training and inference compared to datacenter GPUs. For large-scale experiments, consider using cloud resources or a cluster with suitable GPUs.
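To confirm that your PyTorch build supports the Apple-GPU (MPS) backend, a check like the following helps; the fallback pattern is just a suggestion, not something the repo's scripts necessarily do:

```python
# Check whether this PyTorch build can use Apple's MPS (Metal) backend.
import torch

if torch.backends.mps.is_available():
    device = torch.device("mps")  # run tensors on the Apple GPU
else:
    device = torch.device("cpu")  # fall back to CPU
print(f"Using device: {device}")
```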
Below are examples of how to run key scripts locally after removing/commenting out SLURM directives and adjusting parameters for your MacBook Pro:
**`few-shot/launch.sh`**

```bash
#!/bin/bash
# Run locally: remove/comment SLURM lines
source ~/.bashrc
cd ~/few-shot
conda activate seal_env

python eval-self-edits-baseline.py \
    --experiment_folder="~/tti/eval_base_model" \
    --pretrained_checkpoint=meta-llama/Llama-3.2-1B-Instruct \
    --lora_checkpoints_folder="~/few-shot/loras/self-edit/eval_RL_iteration_1_8_epoch" \
    --temperature=0 \
    --n_sample=1 \
    --data_file="~/few-shot/data/arc-agi_evaluation_challenges_filtered_1B_eval_set.json" \
    --solution_file="~/few-shot/data/arc-agi_evaluation_solutions_filtered_1B_eval_set.json" \
    --max_lora_rank=32 \
    --include_n=1 \
    --new_format \
    --num_examples=2
```
**`knowledge-incorporation/scripts/train_SFT.sh`**

```bash
#!/bin/bash
# Run locally: remove/comment SLURM lines
source ~/.bashrc
conda activate seal_env
cd ~/SEAL

MODEL_NAME="Qwen/Qwen2.5-1.5B"  # Use a smaller model if needed
TRAIN_FILE="knowledge-incorporation/data/synthetic_data/EM_SFT/sft_best1of5_iter0.jsonl"
OUTPUT_DIR="models/iter1"
mkdir -p "${OUTPUT_DIR}"

PER_DEVICE_BATCH_SIZE=1
GRAD_ACC=2
EPOCHS=1
LR=3e-4
LORA_RANK=16
LORA_ALPHA=32
LORA_DROPOUT=0.0
LORA_TARGET_MODULES="q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj"
LOG_STEPS=1

# Example training command (replace with your actual script/command):
python train_SFT.py \
    --model_name $MODEL_NAME \
    --train_file $TRAIN_FILE \
    --output_dir $OUTPUT_DIR \
    --per_device_batch_size $PER_DEVICE_BATCH_SIZE \
    --epochs $EPOCHS
```
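The LoRA hyperparameters above map directly onto a Hugging Face PEFT config; the sketch below shows that mapping. Whether `train_SFT.py` constructs its adapter exactly this way is an assumption:

```python
# The LoRA hyperparameters from the script, expressed as a PEFT config.
# (Assumption: train_SFT.py uses Hugging Face PEFT; it may differ.)
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,              # LORA_RANK
    lora_alpha=32,     # LORA_ALPHA
    lora_dropout=0.0,  # LORA_DROPOUT
    target_modules=[   # LORA_TARGET_MODULES
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```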
**`knowledge-incorporation/scripts/CPT.sh`**

```bash
#!/bin/bash
# Run locally: remove/comment SLURM lines
source ~/.bashrc
conda activate seal_env
cd ~/SEAL

# Adjust N_ARTICLES and other parameters for your system
N_ARTICLES=10
OUTPUT_DIR="knowledge-incorporation/results/cpt"
mkdir -p "${OUTPUT_DIR}"

# Example CPT command (replace with your actual script/command):
python CPT.py --n_articles $N_ARTICLES --output_dir $OUTPUT_DIR
```
**`knowledge-incorporation/scripts/TTT_server.sh`**

```bash
#!/bin/bash
# Run locally: remove/comment SLURM lines
source ~/.bashrc
conda activate seal_env
cd ~/SEAL

MODEL_NAME="Qwen/Qwen2.5-1.5B"
PORT=8001
ZMQ_PORT=5555
MAX_SEQ_LENGTH=1024

# Example server command (replace with your actual script/command):
python TTT_server.py --model_name $MODEL_NAME --port $PORT --zmq_port $ZMQ_PORT --max_seq_length $MAX_SEQ_LENGTH
```
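Once the server is running, you can smoke-test the ZMQ endpoint. In the sketch below the `pyzmq` calls are real, but the REQ/REP JSON payload is a hypothetical placeholder; the actual message protocol is defined by `TTT_server.py` and its client code:

```python
# Hypothetical smoke test for the TTT server's ZMQ endpoint.
# The pyzmq API calls are real; the message shape is an assumption.
import zmq

context = zmq.Context()
socket = context.socket(zmq.REQ)        # request/reply pattern (assumed)
socket.connect("tcp://localhost:5555")  # ZMQ_PORT from the script above

socket.send_json({"ping": True})        # placeholder payload
print(socket.recv_json())               # expect a reply from the server
```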
If you encounter issues running scripts locally, please open an issue or reach out for support.
## 📄 Citation
If you found this work useful, please cite:
```bibtex
@misc{zweiger2025selfadaptinglanguagemodels,
title={Self-Adapting Language Models},
author={Adam Zweiger and Jyothish Pari and Han Guo and Ekin Akyürek and Yoon Kim and Pulkit Agrawal},
year={2025},
eprint={2506.10943},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2506.10943},
}
