This is the official implementation for the paper "On Powerful Ways to Generate: Autoregression, Diffusion, and Beyond".
Materials: Paper
- Install other required packages via

  ```
  pip install -r requirements.txt
  ```

- Install flash-attention

  ```
  pip install flash-attn --no-build-isolation
  ```

To generate the Sudoku dataset, run

```
cd dataset/sudoku
python sudoku_generator.py sudoku-100.npy
```

This generates `sudoku-100.pkl.gz`, containing APMDM training samples, and `vocab_cache.pkl`, containing the token vocabulary. You can process multiple files at once: `python sudoku_generator.py sudoku-100.npy sudoku-test.npy`. To visualize the generation process, run `python serve.py` and open http://localhost:8001/apmdm in your browser.
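The generated files are standard gzip-compressed pickles, so they can be inspected without any repo code. A minimal loading sketch (the exact payload layout is defined by `sudoku_generator.py`, so nothing is assumed about it here):

```python
import gzip
import pickle

def load_pkl_gz(path):
    """Load a gzip-compressed pickle file (e.g. sudoku-100.pkl.gz)."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)

def load_vocab(path):
    """Load the plain-pickle token vocabulary (e.g. vocab_cache.pkl)."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Usage (after running the generator):
# samples = load_pkl_gz("dataset/sudoku/sudoku-100.pkl.gz")
# vocab = load_vocab("dataset/sudoku/vocab_cache.pkl")
```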
To generate the parity dataset, run

```
cd dataset/parity
python parity_generator.py
```

This generates `parity_train.pkl.gz`, containing 7 APMDM training samples (expanded to 1000 by repetition), and `parity_vocab_cache.pkl`, containing the token vocabulary (5 tokens: BOS, EOS, MASK, 0, 1).
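For reference, the underlying task is computing the parity of a bit sequence. A sketch of the label function and the 5-token vocabulary described above (token names are illustrative; the actual ids live in `parity_vocab_cache.pkl`):

```python
def parity(bits):
    """Parity of a bit sequence: 1 if the number of 1s is odd, else 0."""
    return sum(bits) % 2

# Illustrative 5-token vocabulary matching the description above;
# the real token ids are defined by the generator, not here.
VOCAB = ["BOS", "EOS", "MASK", "0", "1"]

def encode(bits):
    """Wrap a bit sequence with BOS/EOS tokens (illustrative encoding)."""
    return ["BOS"] + [str(b) for b in bits] + ["EOS"]
```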
To generate the max-flow dataset, run

```
cd dataset/max_flow
python maxflow_solver.py \
    --num_instances 10000 \
    --min_nodes 10 --max_nodes 10 \
    --min_edges 50 --max_edges 50 \
    --output graph.pkl.gz
```

This generates `graph.pkl.gz`, containing APMDM training samples for max-flow problems, and `vocab_cache.pkl`, containing the token vocabulary. Graph parameters can be customized via `--num_instances` (number of graphs), `--min_nodes`/`--max_nodes` (node count range), `--min_edges`/`--max_edges` (edge count range), and `--min_flow`/`--max_flow` (flow guarantee range).
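As an independent sanity check on what a max-flow instance's answer should be, here is a self-contained Edmonds-Karp sketch. This is not `maxflow_solver.py` itself, and the `(u, v, capacity)` edge format is an assumption made for illustration:

```python
from collections import defaultdict, deque

def max_flow(edges, s, t):
    """Edmonds-Karp max flow on a directed graph given as
    (u, v, capacity) triples; returns the s->t max-flow value."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v, c in edges:
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # residual (reverse) arc
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        # Recover the path, push the bottleneck amount along it
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= aug
            cap[(v, u)] += aug
        flow += aug
```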
For training, see the scripts in `train/scripts`.
Note: In our implementation, we use the term *contraction* for deletion and *expansion* for insertion. `R`, `E`, and `C` denote remasking, expansion/insertion, and contraction/deletion signals, respectively.
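The effect of the three signals can be pictured as edits on a token sequence. The following is only an illustration of what each signal does to a sequence, not the repo's implementation:

```python
MASK = "MASK"

def remask(tokens, i):
    """R: replace the token at position i with MASK for re-prediction."""
    return tokens[:i] + [MASK] + tokens[i + 1:]

def expand(tokens, i):
    """E (expansion/insertion): insert a fresh MASK before position i."""
    return tokens[:i] + [MASK] + tokens[i:]

def contract(tokens, i):
    """C (contraction/deletion): delete the token at position i."""
    return tokens[:i] + tokens[i + 1:]
```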
If you find our code useful, please consider citing our work:
```bibtex
@article{yang2025powerful,
  title={On Powerful Ways to Generate: Autoregression, Diffusion, and Beyond},
  author={Yang, Chenxiao and Zhou, Cai and Wipf, David and Li, Zhiyuan},
  journal={arXiv preprint arXiv:2510.06190},
  year={2025}
}
```

The training pipeline is adapted from MDLM.
