Time-Annealed Perturbation Sampling (TAPS) is an inference-time method for improving diversity in diffusion language models without sacrificing generation quality.
This repository contains the official implementation of TAPS and the code used to reproduce experiments reported in the paper.
*A conceptual comparison of the inference process between the base Diffusion-LM and our proposed method, TAPS, illustrating their different context-conditioning behaviors.*
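At a high level, TAPS perturbs the conditioning (prompt) representation with noise whose magnitude is annealed across the reverse-diffusion timestep. The sketch below shows one plausible reading of the `--cond_noise_*` flags used later in this README; the helper functions are hypothetical illustrations, not this repository's API.

```python
import math
import torch

def cond_noise_scale(t: float, start: float = 0.05, until: float = 0.95,
                     anneal: str = "cosine") -> float:
    """Hypothetical time-annealed noise scale for t in [0, 1].

    Assumption: noise is active while start <= t <= until and decays with a
    cosine profile over that window (inferred from the --cond_noise_start,
    --cond_noise_until, and --cond_noise_anneal flags).
    """
    if t < start or t > until:
        return 0.0
    frac = (t - start) / (until - start)  # progress through the active window
    if anneal == "cosine":
        return 0.5 * (1.0 + math.cos(math.pi * frac))  # 1 -> 0 over the window
    return 1.0 - frac  # linear fallback

def perturb_cond_embeddings(embeds: torch.Tensor, t: float,
                            noise_std: float = 0.05,
                            psi: float = 1.0) -> torch.Tensor:
    """Add annealed Gaussian noise to conditioning embeddings (sketch only).

    noise_std and psi mirror --cond_embed_noise_std and --cond_embed_psi.
    """
    scale = psi * noise_std * cond_noise_scale(t)
    return embeds + scale * torch.randn_like(embeds)
```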
This repository supports two diffusion language model backbones:
| Backbone | Hugging Face | Loader |
|---|---|---|
| LLaDA-8B-Instruct | `GSAI-ML/LLaDA-8B-Instruct` | `transformers.AutoModel` |
| TraDo-8B-Instruct | `Gen-Verse/TraDo-8B-Instruct` | `transformers.AutoModelForCausalLM` |
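For reference, loading the two backbones with the loaders listed above might look like the following. The `trust_remote_code=True` and dtype settings are illustrative assumptions (LLaDA ships custom modeling code, so it typically requires remote code), not requirements verified against this repository.

```python
import torch
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

# LLaDA-8B-Instruct uses custom modeling code, hence AutoModel + trust_remote_code.
llada = AutoModel.from_pretrained(
    "GSAI-ML/LLaDA-8B-Instruct", trust_remote_code=True, torch_dtype=torch.bfloat16
)
llada_tok = AutoTokenizer.from_pretrained(
    "GSAI-ML/LLaDA-8B-Instruct", trust_remote_code=True
)

# TraDo-8B-Instruct loads through the standard causal-LM entry point.
trado = AutoModelForCausalLM.from_pretrained(
    "Gen-Verse/TraDo-8B-Instruct", trust_remote_code=True, torch_dtype=torch.bfloat16
)
trado_tok = AutoTokenizer.from_pretrained(
    "Gen-Verse/TraDo-8B-Instruct", trust_remote_code=True
)
```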
The evaluation suite covers four benchmarks:

- GSM8K
- WritingPrompts
- NoveltyBench
- Arena-Hard-Auto
Example: diversity evaluation on WritingPrompts with the LLaDA backbone:

```bash
accelerate launch benchmarks/writingprompts/run_diversity.py \
    --backbone llada \
    --model_path /path/to/llada \
    --mode embedding \
    --temperature 0.8 \
    --steps 128 --gen_length 256 --block_length 32 \
    --cond_noise_start 0.05 --cond_noise_until 0.95 \
    --cond_noise_anneal cosine \
    --cond_embed_noise_std 0.05 --cond_embed_psi 1.0 \
    --num_prompts 50 --num_samples 8 \
    --out_dir outputs/wp
```
The same evaluation with the TraDo backbone (note the smaller block length, larger embedding-noise std, and the additional sampling controls):

```bash
accelerate launch benchmarks/writingprompts/run_diversity.py \
    --backbone trado \
    --model_path /path/to/trado \
    --mode embedding \
    --temperature 0.8 \
    --steps 128 --gen_length 256 --block_length 4 \
    --cond_noise_start 0.05 --cond_noise_until 0.95 \
    --cond_noise_anneal cosine \
    --cond_embed_noise_std 0.20 --cond_embed_psi 1.0 \
    --top_k 50 --top_p 0.9 --min_p 0.0 \
    --num_prompts 50 --num_samples 8 \
    --out_dir outputs/wp
```
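The TraDo command also passes standard logit-filtering controls. As a reference for what `--top_k`, `--top_p`, and `--min_p` conventionally mean, here is a generic PyTorch sketch of how the three filters are commonly combined; it is not this repository's sampler.

```python
import torch

def filter_logits(logits: torch.Tensor, top_k: int = 50,
                  top_p: float = 0.9, min_p: float = 0.0) -> torch.Tensor:
    """Apply conventional top-k, nucleus (top-p), and min-p filtering.

    logits has shape (..., vocab). Filtered entries are set to -inf so a
    subsequent softmax + multinomial sample ignores them.
    """
    probs = torch.softmax(logits, dim=-1)
    mask = torch.zeros_like(logits, dtype=torch.bool)

    if top_k > 0:  # drop everything outside the k most likely tokens
        kth = torch.topk(logits, top_k, dim=-1).values[..., -1, None]
        mask |= logits < kth

    if top_p < 1.0:  # drop the tail once cumulative probability exceeds top_p
        sorted_probs, sorted_idx = torch.sort(probs, descending=True, dim=-1)
        cum = torch.cumsum(sorted_probs, dim=-1)
        sorted_drop = cum - sorted_probs > top_p  # always keeps the top token
        mask |= torch.zeros_like(mask).scatter(-1, sorted_idx, sorted_drop)

    if min_p > 0.0:  # drop tokens below min_p times the max probability
        mask |= probs < min_p * probs.amax(dim=-1, keepdim=True)

    return logits.masked_fill(mask, float("-inf"))
```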
This project is released under the MIT License. See the LICENSE file for the full text.
SPDX-License-Identifier: MIT
