[CVPR 2025] Official repository for "Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes".

Dou-Yiming/hearing_hands

Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes

Installation

1. Clone this repo

git clone --branch main --single-branch https://github.com/Dou-Yiming/hearing_hands.git

2. Create Conda environment

cd hearing_hands
conda env create -f environment.yml
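The environment has to be activated before running the scripts below, and its name is whatever `environment.yml` declares. A minimal sketch for looking it up, assuming the file has a top-level `name:` field (the usual Conda convention); the helper `conda_env_name` is hypothetical and not part of this repo:

```shell
# Hypothetical helper: print the value of the top-level `name:` field
# of a Conda environment file (defaults to ./environment.yml).
conda_env_name() {
    sed -n 's/^name:[[:space:]]*//p' "${1:-environment.yml}" | head -n 1
}

# Usage (from the repo root, after `conda env create`):
#   conda activate "$(conda_env_name)"
```

If the file has no `name:` field, the helper prints nothing and the environment name has to be taken from `conda env list` instead.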

Prepare data and pretrained models

1. Download dataset

Download the dataset from this link, then extract it from the repo root:

tar -xvf dataset.tar.gz video2audio/data/dataset
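Arguments that follow the archive name in `tar -x` select which members to extract, so the command above only produces files if the archive stores them under `video2audio/data/dataset`. A quick pre-check, as a sketch (the helper name `archive_has_prefix` is hypothetical):

```shell
# Hypothetical helper: succeed if at least one member of the tar
# archive $1 has a name that starts with the path prefix $2.
archive_has_prefix() {
    tar -tf "$1" |
        awk -v p="$2" 'index($0, p) == 1 { found = 1 } END { exit !found }'
}

# Usage:
#   archive_has_prefix dataset.tar.gz video2audio/data/dataset \
#       && tar -xvf dataset.tar.gz video2audio/data/dataset
```

If the check fails, `tar -tf dataset.tar.gz | head` shows the paths the archive actually uses.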

2. Download pretrained models

Download the pretrained checkpoints from this link, then extract them:

tar -xvf checkpoints.tar.gz ./
mkdir -p video2audio/logs && mv checkpoints/sarf_full video2audio/logs
mv checkpoints/adm_checkpoints video2audio/ldm/adm/checkpoints
mv checkpoints/vocoder_checkpoints/* video2audio/ldm/vocoder/bigvgan/checkpoints
rm checkpoints.tar.gz
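After both downloads, the paths named in the commands above should exist under the repo root. The following sketch checks for them; it assumes each path is a directory, and the helper `check_layout` is hypothetical rather than something the repo ships:

```shell
# Hypothetical helper: verify that the directories named in the data
# and checkpoint steps exist under the repo root $1 (default: ".").
# Prints each missing path to stderr and returns 1 if any is absent.
check_layout() {
    root="${1:-.}"
    missing=0
    for d in \
        video2audio/data/dataset \
        video2audio/logs/sarf_full \
        video2audio/ldm/adm/checkpoints \
        video2audio/ldm/vocoder/bigvgan/checkpoints
    do
        if [ ! -d "$root/$d" ]; then
            echo "missing: $d" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (from the repo root):
#   check_layout && echo "data and checkpoints are in place"
```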

Run Inference and Training

1. Run inference with the pretrained model

sh scripts/inference.sh

2. Train video-to-audio model

sh scripts/train.sh
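For longer runs it can help to keep the full output on disk. A minimal sketch of a wrapper that does this; `run_logged` and the default log file name are assumptions for illustration, not part of the repo:

```shell
# Hypothetical wrapper: run a script with sh, save stdout and stderr
# to a log file, and return the script's own exit status.
# $1: script path; $2: log path (default: <script name>.log in the cwd).
run_logged() {
    script="$1"
    log="${2:-$(basename "$script" .sh).log}"
    sh "$script" >"$log" 2>&1
    status=$?
    echo "exit status $status, log written to $log"
    return "$status"
}

# Usage (from the repo root):
#   run_logged scripts/train.sh
```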

BibTeX

If you find our work useful, please consider citing:

@inproceedings{dou2025hearing,
  title={Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes},
  author={Dou, Yiming and Oh, Wonseok and Luo, Yuqing and Loquercio, Antonio and Owens, Andrew},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={1795--1804},
  year={2025}
}
