SafeCoder enables large language models (LLMs) to learn to generate secure code during instruction tuning. This is the official repository for our ICML 2024 paper.
First, install the Python dependencies:

```bash
pip install -r requirements.txt
pip install -e .
```

Then, install GitHub CodeQL, which is used for evaluating the security of LLM-generated code:

```bash
./setup_codeql.sh
```

Finally, set up the different programming languages studied in this work (sudo rights required):

```bash
./setup_langs.sh
```

Run the following command to fine-tune a pretrained LLM with SafeCoder:
```bash
python train.py --pretrain_name starcoderbase-1b --output_name starcoderbase-1b-safecoder --datasets evol sec-desc sec-new-desc
```

Here, `--pretrain_name` specifies the base pretrained LLM, `--output_name` is the user-provided name of the fine-tuned model, and `--datasets` is the list of datasets used for training (see the datasets section below for more details). We also provide fine-tuned versions of Mistral-7B (link) and CodeLlama-7B (link), so users do not need to perform fine-tuning themselves.
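After fine-tuning (or after downloading one of the released checkpoints), the resulting model can be loaded like any other Hugging Face causal LM. The snippet below is a minimal sketch; the checkpoint path and the prompt are placeholders and not part of the repository's documented interface.

```python
# Minimal sketch: load a SafeCoder fine-tuned checkpoint and sample a completion.
# The path below is hypothetical; point it at your --output_name directory
# or a downloaded checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/starcoderbase-1b-safecoder"  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

prompt = "# Write a function that reads a file path from the user and returns its contents\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```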
Our evaluation covers various benchmarks concerning security and utility. To evaluate the security of generated code, run the following commands:

```bash
python sec_eval.py --output_name starcoderbase-1b-safecoder --model_name starcoderbase-1b-safecoder --eval_type trained
python sec_eval.py --output_name starcoderbase-1b-safecoder --model_name starcoderbase-1b-safecoder --eval_type trained-new
python print_results.py --eval_name starcoderbase-1b-safecoder --eval_type trained-joint --detail
```
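For intuition, the security evaluation checks each generated program with CodeQL and reports, per test scenario, how many completions come back without a detected vulnerability. The sketch below only illustrates that ratio; it is not the actual logic of `sec_eval.py` or `print_results.py`.

```python
# Illustrative only: per-scenario security rate, i.e. the share of generated
# programs with no CodeQL finding. Not the repository's actual implementation.
def security_rate(num_secure: int, num_generated: int) -> float:
    if num_generated == 0:
        return 0.0
    return num_secure / num_generated

# e.g. 92 of 100 completions pass the CodeQL check -> 0.92
print(security_rate(92, 100))
```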
For utility, we consider the following benchmarks:

```bash
# HumanEval, with temperature 0.2
./func_eval.sh human_eval starcoderbase-1b-safecoder-0.2 starcoderbase-1b-safecoder 0.2
python print_results.py --eval_name starcoderbase-1b-safecoder-0.2 --eval_type human_eval
```
```bash
# MBPP, with temperature 0.2
./func_eval.sh mbpp starcoderbase-1b-safecoder-0.2 starcoderbase-1b-safecoder 0.2
python print_results.py --eval_name starcoderbase-1b-safecoder-0.2 --eval_type mbpp
```
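HumanEval and MBPP measure functional correctness via pass@k over sampled completions. For reference, the sketch below implements the standard unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021); whether `print_results.py` uses exactly this formula is an assumption.

```python
# Unbiased pass@k estimator (Chen et al., 2021): given n samples per problem,
# of which c pass the tests, estimate the probability that at least one of
# k randomly chosen samples passes.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# e.g. 10 samples per problem, 3 of them correct -> pass@1 estimate of 0.3
print(pass_at_k(n=10, c=3, k=1))
```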
```bash
# MMLU
python mmlu_eval.py --output_name starcoderbase-1b-safecoder --model_name starcoderbase-1b-safecoder
python print_results.py --eval_name starcoderbase-1b-safecoder --eval_type mmlu
```
```bash
# TruthfulQA
python truthfulqa_eval.py --output_name starcoderbase-1b-safecoder --model_name starcoderbase-1b-safecoder
python print_results.py --eval_name starcoderbase-1b-safecoder --eval_type tqa
```

The repository contains two utility datasets, `evol` and `lmsys`. In the paper, `evol` is used with code-specific LLMs and `lmsys` with general-purpose LLMs. We also provide two security datasets, `sec-desc` and `sec-new-desc`: `sec-desc` is adapted from our previous work SVEN, while `sec-new-desc` is newly constructed in this work (see Section 5 of our paper for more details). The evaluation types `trained` and `trained-new` correspond to the evaluation sets for `sec-desc` and `sec-new-desc`, respectively.
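To give a sense of what a security-tuning example conceptually contains, the sketch below pairs an instruction with a secure implementation and a vulnerable counterpart labeled with a CWE. The field names and contents are hypothetical illustrations, not the repository's actual data schema.

```python
# Purely illustrative: a conceptual security-tuning sample pairing secure and
# vulnerable code for the same task. Field names are hypothetical.
sample = {
    "instruction": "Write a Python function that runs a shell command provided by the user.",
    "secure_code": "import subprocess\n\ndef run(cmd_args):\n    return subprocess.run(cmd_args, check=True)\n",
    "vulnerable_code": "import os\n\ndef run(cmd):\n    os.system(cmd)\n",
    "cwe": "CWE-078",  # OS command injection
}
```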
If you find our work useful, please cite:

```bibtex
@inproceedings{safecoder,
    author = {Jingxuan He and Mark Vero and Gabriela Krasnopolska and Martin Vechev},
    title = {Instruction Tuning for Secure Code Generation},
    booktitle = {ICML},
    year = {2024},
    url = {https://arxiv.org/abs/2402.09497},
}
```