This project fine-tunes the GPT-J 6B model on the Alpaca dataset using a Databricks notebook. Please note that while GPT-J 6B is Apache 2.0 licensed, the Alpaca dataset is licensed under Creative Commons NonCommercial (CC BY-NC 4.0).
Install the development requirements:

```shell
pip install -r requirements_dev.txt
```
Download the model checkpoint from Tsinghua Cloud: https://cloud.tsinghua.edu.cn/d/0185c787cdc243d1a3b7/

Alternatively, you can download the checkpoint from the Hugging Face Hub:

```shell
pip install transformers
transformers-cli download EleutherAI/gpt-j-6B --cache-dir ./model/
export TRANSFORMERS_CACHE=`pwd`/model
```
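If you prefer to fetch the weights from Python rather than the CLI, a minimal sketch using `transformers` (the repo id comes from the command above; pointing `cache_dir` at `./model/` is an assumption to match that layout):

```python
# Minimal sketch: pull GPT-J 6B from the Hugging Face Hub via transformers.
# cache_dir mirrors the ./model/ layout used above (an assumption).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B", cache_dir="./model/")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", cache_dir="./model/")
```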
- Start a single-node cluster with a node type that has 8 A100 (40GB memory) GPUs (e.g. `Standard_ND96asr_v4` or `p4d.24xlarge`), then set the training environment variables in a shell:
```shell
export timestamp=`date +%Y-%m-%d_%H-%M-%S`
export model_name='dolly'
export checkpoint_dir_name="${model_name}__${timestamp}"
export deepspeed_config=`pwd`/config/ds_z3_bf16_config.json
export local_training_root='./'
export local_output_dir="${local_training_root}/${checkpoint_dir_name}"
export dbfs_output_dir=''
export tensorboard_display_dir="${local_output_dir}/runs"
export DATASET_FILE_PATH=`pwd`/parquet-train.arrow
export MODEL_PATH=`pwd`/model/
```
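The `ds_z3_bf16_config.json` referenced above configures DeepSpeed ZeRO stage 3 with bf16. The checked-in file under `config/` is authoritative; the sketch below only illustrates the kind of settings such a config typically contains:

```python
# Illustrative sketch (an assumption, not the checked-in file): typical contents
# of a ZeRO stage 3 + bf16 DeepSpeed config like config/ds_z3_bf16_config.json.
import json

ds_config = {
    "bf16": {"enabled": True},                # train in bfloat16
    "zero_optimization": {
        "stage": 3,                           # ZeRO stage 3: shard params, grads, optimizer state
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": "auto", # filled in from the trainer arguments
    "gradient_accumulation_steps": "auto",
    "train_batch_size": "auto",
}

print(json.dumps(ds_config, indent=2))
```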
Launch training across all 8 GPUs with DeepSpeed. With a per-device train batch size of 8 on 8 GPUs, the global batch size is 64 per step (assuming no gradient accumulation):

```shell
deepspeed --num_gpus=8 \
    --module training.trainer \
    --deepspeed $deepspeed_config \
    --epochs 1 \
    --local-output-dir $local_output_dir \
    --dbfs-output-dir "" \
    --per-device-train-batch-size 8 \
    --per-device-eval-batch-size 8 \
    --lr 1e-5
```
After training completes, load the checkpoint and generate responses (using `ipython` is recommended so you can generate sentences interactively without reloading the model from disk again and again):

```python
from training.generate import load_model_tokenizer_for_generate, generate_response

# Path to the fine-tuned checkpoint written by the training run above
model_path = '/path/to/checkpoint'
model, tokenizer = load_model_tokenizer_for_generate(model_path)

instruction = 'Write a tweet to introduce Dolly, a model to mimic ChatGPT.'
response = generate_response(instruction, model, tokenizer)
print(response)
```
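Building on the same helpers, a small loop such as this hypothetical sketch (not part of the repo) keeps the model in memory between prompts during an `ipython` session:

```python
# Illustrative interactive loop: reuses the model/tokenizer loaded above so the
# 6B-parameter checkpoint is read from disk only once per session.
while True:
    instruction = input('Instruction (blank line to quit): ')
    if not instruction:
        break
    print(generate_response(instruction, model, tokenizer))
```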