
ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image

The implementation of our paper accepted at ICCV 2019 (IEEE International Conference on Computer Vision) by Yida Wang, David Tan, Nassir Navab and Federico Tombari. If you find this work useful in your research, please cite:

@inproceedings{wang2019forknet,
  title={ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image},
  author={Wang, Yida and Tan, David Joseph and Navab, Nassir and Tombari, Federico},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={8608--8617},
  year={2019}
}

ForkNet

Architecture

The overall architecture consists of one encoder, which takes a TSDF volume as input, and three decoders.
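
For orientation, below is a minimal shape-level sketch of the one-encoder / three-decoder layout written in PyTorch. The channel widths, layer counts and branch names are illustrative assumptions and not the network definition used in main.py; the [80, 48, 80] resolution and the 12 semantic channels follow the data description further below.

# Illustrative sketch only: a shared 3D encoder forking into three decoders.
# Channel widths and branch roles are assumptions, not the repo's main.py.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    # Halves every spatial dimension (kernel 4, stride 2, padding 1).
    return nn.Sequential(nn.Conv3d(cin, cout, 4, stride=2, padding=1), nn.ReLU())

def deconv_block(cin, cout):
    # Doubles every spatial dimension.
    return nn.Sequential(nn.ConvTranspose3d(cin, cout, 4, stride=2, padding=1), nn.ReLU())

class ForkNetSketch(nn.Module):
    def __init__(self, n_classes=12):
        super().__init__()
        # Shared encoder: TSDF volume (B, 1, 80, 48, 80) -> latent (B, 64, 10, 6, 10).
        self.encoder = nn.Sequential(conv_block(1, 16), conv_block(16, 32), conv_block(32, 64))
        # Three decoders forking from the same latent volume.
        self.dec_tsdf = nn.Sequential(deconv_block(64, 32), deconv_block(32, 16),
                                      nn.ConvTranspose3d(16, 1, 4, 2, 1))
        self.dec_geometry = nn.Sequential(deconv_block(64, 32), deconv_block(32, 16),
                                          nn.ConvTranspose3d(16, 1, 4, 2, 1))
        self.dec_semantic = nn.Sequential(deconv_block(64, 32), deconv_block(32, 16),
                                          nn.ConvTranspose3d(16, n_classes, 4, 2, 1))

    def forward(self, tsdf):
        z = self.encoder(tsdf)
        return self.dec_tsdf(z), self.dec_geometry(z), self.dec_semantic(z)

x = torch.randn(1, 1, 80, 48, 80)                     # one input TSDF volume
tsdf_out, geom_out, sem_out = ForkNetSketch()(x)
print(tsdf_out.shape, geom_out.shape, sem_out.shape)  # (1,1,80,48,80), (1,1,80,48,80), (1,12,80,48,80)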

Generated synthetic samples

Given a single latent sample, two of the decoders can be used to generate a paired TSDF volume and semantic scene.
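
Below is a hedged sketch of this sampling step, reusing the ForkNetSketch class from the architecture sketch above; the latent shape (64, 10, 6, 10) and the choice of which two decoders to use are assumptions.

# Draw a random latent volume and decode it with two branches to obtain a
# paired TSDF volume and semantic scene (uses ForkNetSketch defined above).
import torch

model = ForkNetSketch(n_classes=12).eval()
with torch.no_grad():
    z = torch.randn(1, 64, 10, 6, 10)         # one latent sample (assumed shape)
    fake_tsdf = model.dec_geometry(z)         # generated TSDF volume
    fake_semantic = model.dec_semantic(z)     # paired semantic scene (12 channels)
    labels = fake_semantic.argmax(dim=1)      # per-voxel class indices
print(fake_tsdf.shape, labels.shape)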

More examples

Data preprocessing

Depth image to TSDF volumes

First, go to the depth-tsdf folder to compile our depth converter. cmake and make are the suggested tools for building the code.

cmake . # configure
make # compiles demo executable

Once the back-project executable is compiled, depth images from the NYU or SUNCG datasets can be converted into TSDF volumes in parallel.

CUDA_VISIBLE_DEVICES=0 python2 data/depth_backproject.py -s /media/wangyida/SSD2T/database/SUNCG_Yida/train/depth_real_png -tv /media/wangyida/HDD/database/SUNCG_Yida/train/depth_tsdf_camera_npy -tp /media/wangyida/HDD/database/SUNCG_Yida/train/depth_tsdf_pcd
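
The command above writes TSDF volumes as .npy files (the -tv target) and point clouds (the -tp target). As a quick sanity check, a converted volume can be inspected with numpy; the file name below is hypothetical, and the exact shape and value convention should be verified against data/depth_backproject.py.

# Hypothetical sanity check on one converted TSDF volume.
import numpy as np

tsdf = np.load("depth_tsdf_camera_npy/0001.npy")  # hypothetical file name
print("shape:", tsdf.shape)                       # expected to match the training resolution, e.g. (80, 48, 80)
print("value range:", tsdf.min(), tsdf.max())     # truncated signed distances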

Semantic volumes used for training

We further convert the binary files from the SUNCG and NYU datasets into numpy arrays of dimension [80, 48, 80] with 12 semantic channels. These voxel volumes are used as the training ground truth. Note that our data is stored as numpy arrays converted from the original binary files; a small loading sketch follows the conversion commands below.

python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_1001_2000  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_501_1000  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_1_1000  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_1001_3000  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_3001_5000  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_1_500  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/SUNCGtrain_5001_7000  -tv /media/wangyida/HDD/database/SUNCG_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/depthbin_NYU_SUNCG/SUNCGtest_49700_49884 -tv /media/wangyida/HDD/database/SUNCG_Yida/test/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/depthbin_NYU_SUNCG/NYUtrain -tv /media/wangyida/HDD/database/NYU_Yida/train/voxel_semantic_npy &
python2 data/depthbin2npy.py -s /media/wangyida/HDD/database/depthbin_NYU_SUNCG/NYUtest -tv /media/wangyida/HDD/database/NYU_Yida/test/voxel_semantic_npy &
wait
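
As a minimal sketch of how one of these ground-truth volumes could be loaded, assuming each .npy file stores per-voxel integer class labels in an [80, 48, 80] array that expands to 12 one-hot semantic channels (the actual storage format should be checked against data/depthbin2npy.py):

# Hypothetical loading sketch; file name and label encoding are assumptions.
import numpy as np

labels = np.load("voxel_semantic_npy/scene_0001.npy")       # assumed shape (80, 48, 80), integer classes 0..11
one_hot = np.eye(12, dtype=np.float32)[labels.astype(int)]  # shape (80, 48, 80, 12)
print(one_hot.shape)
print("voxels per class:", one_hot.reshape(-1, 12).sum(axis=0))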

Train and Test

Training

CUDA_VISIBLE_DEVICES=0 python3 main.py --mode train --discriminative True

Testing

We provide a compact version of ForkNet, only 25 MB, in the pretrained_model folder. If the model is not discriminative:

CUDA_VISIBLE_DEVICES=1 python main.py --mode evaluate_recons --conf_epoch 0

Otherwise:

CUDA_VISIBLE_DEVICES=1 python main.py --mode evaluate_recons --conf_epoch 30 --discriminative True

where --conf_epoch indicates the epoch index of the pretrained model to load.
