Rethinking Garment Conditioning in Diffusion-based Virtual Try-On (Re-CatVTON)

Official PyTorch implementation of "Rethinking Garment Conditioning in Diffusion-based Virtual Try-On"

📢 News

  • [2025.12.22] 🎉 Inference code and pre-trained models released!
  • [2025.11.24] The paper is available on arXiv.

🔍 Overview

[Figure: Method overview]

Re-CatVTON is an efficient, single-UNet, diffusion-based virtual try-on (VTON) framework that revisits how garment information should be used to condition the denoising process.

🛠️ Installation

conda create -n recatvton python=3.12
conda activate recatvton
git clone https://github.com/Levinna/Re-CatVTON.git
cd Re-CatVTON
pip install -r requirements.txt

We trained and tested Re-CatVTON with Python 3.12 and PyTorch 2.8.0 (CUDA 12.9).
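
As a quick sanity check before running inference, the sketch below reports the local Python, PyTorch, and CUDA versions so they can be compared with the tested configuration. It is not part of the repository; the function name `describe_environment` is hypothetical, and PyTorch is treated as optional so the script still runs in an environment where it is not yet installed.

```python
import sys


def describe_environment():
    """Summarise the interpreter and, if installed, the PyTorch build."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch  # optional: only present after `pip install -r requirements.txt`
    except ImportError:
        return info
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None on CPU-only builds
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Other versions may work as well; the README only states which combination was used for training and testing.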

🚀 Inference

Data Preparation

The VITON-HD or DressCode dataset is required for inference.

Preprocess Mask

The thirdparty folder provides preprocess_agnostic_mask.py, which generates agnostic masks for the DressCode dataset (credit to CatVTON).

cd thirdparty
CUDA_VISIBLE_DEVICES=0 python preprocess_agnostic_mask.py \
    --data_root_path /path/to/DressCode

Run Inference

Option 1: Load from HuggingFace Hub

python inference_recatvton.py \
    --hf_repo levinna/Re-CatVTON \
    --hf_subfolder VITON-HD/checkpoint-16000/unet \
    --dataset_name vitonhd \
    --data_root_path /path/to/VITON-HD \
    --output_dir ./output \
    --batch_size 16 \
    --mixed_precision bf16

Option 2: Load from local path

# First, download the model
hf download levinna/Re-CatVTON --local-dir ./checkpoints # or huggingface-cli download

# Then run inference
python inference_recatvton.py \
    --base_model_path ./checkpoints/VITON-HD/checkpoint-16000 \
    --dataset_name vitonhd \
    --data_root_path /path/to/VITON-HD \
    --output_dir ./output \
    --batch_size 16 \
    --mixed_precision bf16

If your GPU does not support bf16, you can try fp16 or fp32.
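
One way to choose the `--mixed_precision` value automatically is sketched below. It relies on `torch.cuda.is_available()` and `torch.cuda.is_bf16_supported()` from PyTorch; the helper names are hypothetical, and falling back to fp32 when no CUDA device is found is an assumption of this sketch, not a rule stated by the repository.

```python
def choose_mixed_precision(has_cuda, bf16_supported):
    """Map hardware capabilities to a --mixed_precision value."""
    if not has_cuda:
        return "fp32"  # assumption: use full precision without a CUDA device
    return "bf16" if bf16_supported else "fp16"


def detect_mixed_precision():
    """Query PyTorch, if installed, and pick a precision string."""
    try:
        import torch
    except ImportError:
        return "fp32"  # assumption: nothing to query without PyTorch
    has_cuda = torch.cuda.is_available()
    bf16 = has_cuda and torch.cuda.is_bf16_supported()
    return choose_mixed_precision(has_cuda, bf16)


print("--mixed_precision", detect_mixed_precision())
```

If fp16 produces unstable outputs on a given GPU, fp32 remains the other fallback named above.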

Available Checkpoints

| Dataset | HF Subfolder | Resolution |
|---|---|---|
| VITON-HD | VITON-HD/checkpoint-16000/unet | 512×384 |
| DressCode | DressCode/checkpoint-32000/unet | 512×384 |
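
The checkpoint table can be combined with the Option 1 flags to assemble the inference command programmatically. The following is a minimal sketch: `CHECKPOINTS` and `build_inference_command` are hypothetical names, and only flags that appear in this README are used. The `--dataset_name` value is passed in explicitly because the README shows only `vitonhd`; the value expected for DressCode should be checked in inference_recatvton.py.

```python
import shlex

HF_REPO = "levinna/Re-CatVTON"

# Dataset -> HF subfolder, copied from the table above.
CHECKPOINTS = {
    "VITON-HD": "VITON-HD/checkpoint-16000/unet",
    "DressCode": "DressCode/checkpoint-32000/unet",
}


def build_inference_command(dataset, dataset_name, data_root_path,
                            output_dir="./output", batch_size=16,
                            mixed_precision="bf16"):
    """Return the argument list for loading a checkpoint from the Hub."""
    if dataset not in CHECKPOINTS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {sorted(CHECKPOINTS)}")
    return [
        "python", "inference_recatvton.py",
        "--hf_repo", HF_REPO,
        "--hf_subfolder", CHECKPOINTS[dataset],
        "--dataset_name", dataset_name,
        "--data_root_path", data_root_path,
        "--output_dir", output_dir,
        "--batch_size", str(batch_size),
        "--mixed_precision", mixed_precision,
    ]


cmd = build_inference_command("VITON-HD", "vitonhd", "/path/to/VITON-HD")
print(shlex.join(cmd))
# To launch it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list, not a single string, avoids shell-quoting problems when dataset paths contain spaces.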

Inference Options

| Argument | Default | Description |
|---|---|---|
| --sampler | ddim | Sampler type: ddim, ddpm, unipc, dpmpp |
| --num_inference_steps | 50 | Number of diffusion steps |
| --guidance_scale | 2.5 | CFG guidance scale |
| --repaint | True | Blend result with original background |
| --eval_pair | True | Evaluate on paired split |
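
For readers unfamiliar with `--guidance_scale`, the sketch below shows the standard classifier-free guidance (CFG) combination of an unconditional and a conditional noise prediction. It assumes Re-CatVTON follows the usual formulation found in Diffusers pipelines; the exact code in model/pipeline.py may differ. Plain Python lists stand in for tensors, and `apply_cfg` is a hypothetical name.

```python
def apply_cfg(noise_uncond, noise_cond, guidance_scale=2.5):
    """Standard CFG: move from the unconditional prediction toward the
    conditional one, scaled by guidance_scale."""
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(noise_uncond, noise_cond)
    ]


# Toy two-element "noise predictions" for illustration only.
guided = apply_cfg([0.0, 1.0], [1.0, 1.0], guidance_scale=2.5)
print(guided)
```

With a scale of 1.0 the result equals the conditional prediction; larger values push the output further toward the condition, typically trading diversity for fidelity to the garment.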

Recommended steps per sampler:

  • ddim: 50 steps (main results)
  • unipc: 30 steps
  • dpmpp: 25 steps
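
The recommendations above can be captured in a small lookup so that the step count always matches the chosen sampler. This is a sketch with hypothetical names (`RECOMMENDED_STEPS`, `sampler_args`). No recommendation is listed for ddpm, so the sketch falls back to the `--num_inference_steps` default of 50, which is an assumption.

```python
SUPPORTED_SAMPLERS = ("ddim", "ddpm", "unipc", "dpmpp")
RECOMMENDED_STEPS = {"ddim": 50, "unipc": 30, "dpmpp": 25}
DEFAULT_STEPS = 50  # default of --num_inference_steps


def sampler_args(sampler="ddim"):
    """Return the --sampler / --num_inference_steps flags for a sampler."""
    if sampler not in SUPPORTED_SAMPLERS:
        raise ValueError(f"unsupported sampler {sampler!r}")
    steps = RECOMMENDED_STEPS.get(sampler, DEFAULT_STEPS)
    return ["--sampler", sampler, "--num_inference_steps", str(steps)]


print(" ".join(sampler_args("unipc")))
```

The returned list can be appended to the argument list of an inference_recatvton.py invocation.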

📊 Results

| Model | FID ↓ | KID ↓ | LPIPS ↓ | Params (M) |
|---|---|---|---|---|
| CatVTON | 5.888 | 0.513 | 0.061 | 859.5 |
| Leffa | 4.540 | 0.050 | 0.048 | 1802.7 |
| Re-CatVTON (Ours) | 4.438 | 0.010 | 0.047 | 859.5 |

Comparison on the VITON-HD paired setting.
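
To put the table in relative terms, the sketch below computes the fractional reduction of a metric between two rows. The numbers are copied from the table; `RESULTS` and `relative_reduction` are hypothetical names introduced only for this illustration.

```python
# Values copied from the results table (VITON-HD, paired setting).
RESULTS = {
    "CatVTON": {"FID": 5.888, "KID": 0.513, "LPIPS": 0.061, "Params (M)": 859.5},
    "Leffa": {"FID": 4.540, "KID": 0.050, "LPIPS": 0.048, "Params (M)": 1802.7},
    "Re-CatVTON (Ours)": {"FID": 4.438, "KID": 0.010, "LPIPS": 0.047, "Params (M)": 859.5},
}


def relative_reduction(baseline, ours, metric):
    """Fraction by which `ours` lowers `metric` relative to `baseline`."""
    b = RESULTS[baseline][metric]
    o = RESULTS[ours][metric]
    return (b - o) / b


for baseline, metric in [("CatVTON", "FID"), ("Leffa", "FID"), ("Leffa", "Params (M)")]:
    pct = 100 * relative_reduction(baseline, "Re-CatVTON (Ours)", metric)
    print(f"{metric} vs {baseline}: {pct:.1f}% lower")
```

All metrics in the table are lower-is-better, so a positive reduction indicates an improvement over the baseline.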

📂 Project Structure

Re-CatVTON/
├── thirdparty/
│   ├── SCHP/
│   ├── DensePose/
│   ├── cloth_masker.py
│   ├── preprocess_agnostic_mask.py
│   └── preprocess_agnostic_mask.sh
├── model/
│   ├── attn_processor.py
│   ├── pipeline.py
│   └── utils.py
├── assets/
├── inference_recatvton.py
├── inference_recatvton.sh
├── vton_datasets.py
├── evaluation.py
├── evaluation.sh
├── requirements.txt
├── LICENSE
└── README.md

📝 TODO

  • Release inference code (done, 2025.12.22)
  • Release pre-trained models (done, 2025.12.22)
  • HuggingFace Demo
  • ComfyUI Support

📄 License

  • Code: CC BY-NC-SA 4.0
  • Model Weights: CC BY-NC 4.0

Note: The model weights are licensed under CC BY-NC 4.0 due to the non-commercial usage constraints of the VITON-HD and DressCode datasets.

🙏 Acknowledgement

This project is built upon Diffusers and uses Stable Diffusion v1.5 Inpainting as the base model.

For a fair comparison, our inference data pipeline and evaluation protocol follow those of CatVTON and Leffa.

We thank all the contributors of these projects for their excellent work.

Citation

If you find our work helpful, please consider citing:

@article{na2025rethinking,
  title={Rethinking Garment Conditioning in Diffusion-based Virtual Try-On},
  author={Na, Kihyun and Choi, Jinyoung and Kim, Injung},
  journal={arXiv preprint arXiv:2511.18775},
  year={2025}
}
