Rethinking Garment Conditioning in Diffusion-based Virtual Try-On (Re-CatVTON)

Official PyTorch implementation of "Rethinking Garment Conditioning in Diffusion-based Virtual Try-On"

📢 News

[2025.12.22] 🎉 Inference code and pre-trained models released!
[2025.11.24] The paper is available on arXiv.

🔍 Overview

Re-CatVTON is an efficient, single-UNet, diffusion-based virtual try-on (VTON) framework that revisits how garment information should condition the denoising process.

🛠️ Installation

conda create -n recatvton python=3.12
conda activate recatvton
git clone https://github.com/Levinna/Re-CatVTON.git
cd Re-CatVTON
pip install -r requirements.txt

We trained and tested Re-CatVTON with Python 3.12, PyTorch 2.8.0, and CUDA 12.9.

🚀 Inference

Data Preparation

Inference requires the VITON-HD or DressCode dataset.

Preprocess Mask

The thirdparty folder contains preprocess_agnostic_mask.py, which generates agnostic masks for the DressCode dataset (credit to CatVTON):

cd thirdparty
CUDA_VISIBLE_DEVICES=0 python preprocess_agnostic_mask.py \
    --data_root_path /path/to/DressCode

Run Inference

Option 1: Load from HuggingFace Hub

python inference_recatvton.py \
    --hf_repo levinna/Re-CatVTON \
    --hf_subfolder VITON-HD/checkpoint-16000/unet \
    --dataset_name vitonhd \
    --data_root_path /path/to/VITON-HD \
    --output_dir ./output \
    --batch_size 16 \
    --mixed_precision bf16

Option 2: Load from local path

# First, download the model
hf download levinna/Re-CatVTON --local-dir ./checkpoints # or huggingface-cli download

# Then run inference
python inference_recatvton.py \
    --base_model_path ./checkpoints/VITON-HD/checkpoint-16000 \
    --dataset_name vitonhd \
    --data_root_path /path/to/VITON-HD \
    --output_dir ./output \
    --batch_size 16 \
    --mixed_precision bf16

If your GPU does not support bf16, you can try fp16 or fp32.
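The precision fallback described above can be automated. The following is a minimal sketch, not part of this repository: `pick_mixed_precision` and `detect_mixed_precision` are hypothetical helper names, and the sketch assumes that a CPU-only machine should use `fp32`. The hardware probe uses PyTorch's `torch.cuda.is_available()` and `torch.cuda.is_bf16_supported()`.

```python
def pick_mixed_precision(cuda_available: bool, bf16_supported: bool) -> str:
    """Return a value for --mixed_precision given the hardware capabilities."""
    if cuda_available and bf16_supported:
        return "bf16"
    if cuda_available:
        return "fp16"  # half precision on GPUs without bf16 support
    return "fp32"      # assumption: without a GPU, use full precision


def detect_mixed_precision() -> str:
    """Probe the local machine with PyTorch, if it is importable."""
    try:
        import torch
    except ImportError:
        return "fp32"  # no PyTorch available, nothing to probe
    cuda = torch.cuda.is_available()
    bf16 = cuda and torch.cuda.is_bf16_supported()
    return pick_mixed_precision(cuda, bf16)
```

The string returned by `detect_mixed_precision()` can then be passed to `--mixed_precision` in either of the commands above.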

Available Checkpoints

Dataset	HF Subfolder	Resolution
VITON-HD	`VITON-HD/checkpoint-16000/unet`	512×384
DressCode	`DressCode/checkpoint-32000/unet`	512×384
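The table relates to the two loading options as follows: Option 1 takes the subfolder as listed, while Option 2 points at the checkpoint directory that contains `unet`. The sketch below is illustrative only. `CHECKPOINTS` and `local_base_model_path` are hypothetical names, and the local layout for DressCode is assumed to mirror the VITON-HD example shown in Option 2.

```python
from pathlib import PurePosixPath

# Subfolders copied from the checkpoint table above.
CHECKPOINTS = {
    "VITON-HD": "VITON-HD/checkpoint-16000/unet",
    "DressCode": "DressCode/checkpoint-32000/unet",
}


def local_base_model_path(dataset: str, local_dir: str = "./checkpoints") -> str:
    """Path to pass as --base_model_path after downloading into local_dir."""
    subfolder = PurePosixPath(CHECKPOINTS[dataset])
    # Option 2 points at the checkpoint directory, i.e. the parent of `unet`.
    return f"{local_dir.rstrip('/')}/{subfolder.parent}"
```

For example, `local_base_model_path("VITON-HD")` yields the same `./checkpoints/VITON-HD/checkpoint-16000` path used in Option 2.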

Inference Options

Argument	Default	Description
`--sampler`	`ddim`	Sampler type: `ddim`, `ddpm`, `unipc`, `dpmpp`
`--num_inference_steps`	`50`	Number of diffusion steps
`--guidance_scale`	`2.5`	CFG guidance scale
`--repaint`	`True`	Blend result with original background
`--eval_pair`	`True`	Evaluate on paired split
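To make `--guidance_scale` and `--repaint` more concrete, here is a rough sketch of the two operations they conventionally refer to: classifier-free guidance and mask-based compositing. It is written from the general definitions of these techniques and is not taken from this repository's pipeline. The function names are hypothetical, and flat lists of floats stand in for image or latent tensors.

```python
def cfg_combine(uncond: list[float], cond: list[float],
                guidance_scale: float = 2.5) -> list[float]:
    """Standard classifier-free guidance: uncond + s * (cond - uncond)."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


def repaint_blend(generated: list[float], original: list[float],
                  mask: list[float]) -> list[float]:
    """Keep generated values where mask is 1 and original values where it is 0."""
    return [m * g + (1.0 - m) * o for g, o, m in zip(generated, original, mask)]
```

With a guidance scale of 1.0, `cfg_combine` reduces to the conditional prediction. With an all-zero mask, `repaint_blend` returns the original image unchanged.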

Recommended steps per sampler:

ddim: 50 steps (main results)
unipc: 30 steps
dpmpp: 25 steps
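The recommendations above can be captured in a small lookup when scripting several runs. This is a sketch under stated assumptions: `sampler_args` is a hypothetical helper, and because no step count is recommended for `ddpm`, the sketch falls back to the documented default of 50.

```python
# Step counts from the list above; `ddpm` has no recommendation there, so the
# sketch falls back to the documented default of 50 (an assumption).
RECOMMENDED_STEPS = {"ddim": 50, "unipc": 30, "dpmpp": 25}
DEFAULT_STEPS = 50
SAMPLERS = ("ddim", "ddpm", "unipc", "dpmpp")


def sampler_args(sampler: str) -> list[str]:
    """Build the sampler-related arguments for inference_recatvton.py."""
    if sampler not in SAMPLERS:
        raise ValueError(f"unknown sampler: {sampler!r}")
    steps = RECOMMENDED_STEPS.get(sampler, DEFAULT_STEPS)
    return ["--sampler", sampler, "--num_inference_steps", str(steps)]
```

The returned list can be appended to the argument list of an inference command, for example when launching it through `subprocess.run`.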

📊 Results

Model	FID ↓	KID ↓	LPIPS ↓	Params (M)
CatVTON	5.888	0.513	0.061	859.5
Leffa	4.540	0.050	0.048	1802.7
Re-CatVTON (Ours)	4.438	0.010	0.047	859.5

Comparison on the VITON-HD paired setting.

📂 Project Structure

Re-CatVTON/
├── thirdparty/
│   ├── SCHP/
│   ├── DensePose/
│   ├── cloth_masker.py
│   ├── preprocess_agnostic_mask.py
│   └── preprocess_agnostic_mask.sh
├── model/
│   ├── attn_processor.py
│   ├── pipeline.py
│   └── utils.py
├── assets/
├── inference_recatvton.py
├── inference_recatvton.sh
├── vton_datasets.py
├── evaluation.py
├── evaluation.sh
├── requirements.txt
├── LICENSE
└── README.md

📝 TODO

Release inference code (done, 2025.12.22)
Release pre-trained models (done, 2025.12.22)
HuggingFace Demo
ComfyUI Support

📄 License

Code: CC-BY-NC-SA 4.0
Model Weights: CC-BY-NC 4.0

Note: The model weights are licensed under CC-BY-NC 4.0 due to the non-commercial usage constraints of the VITON-HD and DressCode datasets.

🙏 Acknowledgement

This project is built upon Diffusers and uses Stable Diffusion v1.5 Inpainting as the base model.

For a fair comparison, our inference data pipeline and evaluation protocol follow those of CatVTON and Leffa.

We thank all the contributors of these projects for their excellent work.

Citation

If you find our work helpful, please consider citing:

@article{na2025rethinking,
  title={Rethinking Garment Conditioning in Diffusion-based Virtual Try-On},
  author={Na, Kihyun and Choi, Jinyoung and Kim, Injung},
  journal={arXiv preprint arXiv:2511.18775},
  year={2025}
}
