PhysicalConceptReasoner-Release

PyTorch implementation for the paper Compositional Physical Reasoning of Objects and Events from Videos. More details and visualization results can be found on the project page.

Framework

Compositional Physical Reasoning of Objects and Events from Videos

Zhenfang Chen, Shilong Dong, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

Installation

  • Prerequisites

    • Python 3
    • PyTorch 1.0 or higher, with NVIDIA CUDA support
    • Other required Python packages, as specified in requirements.txt.
  • Install Jacinle: Clone the package, and add the bin path to your global PATH environment variable:

    git clone https://github.com/vacancy/Jacinle --recursive
    export PATH=<path_to_jacinle>/bin:$PATH
    
  • Clone this repository:

    git clone https://github.com/George-Dong/PhysicalConceptReasoner.git --recursive
    
  • Create a conda environment for NS-CL and install the requirements. This includes the required Python packages from both Jacinle and NS-CL; most of them are already included in the standard Anaconda distribution (see the sketch below).
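
A minimal sketch of this step, assuming an illustrative environment name (pcr) and Python version; the actual dependencies come from this repository's requirements.txt:

    # create and activate a fresh conda environment (name and Python version are placeholders)
    conda create -n pcr python=3.8
    conda activate pcr
    # install the Python dependencies listed in requirements.txt
    pip install -r requirements.txt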

Dataset preparation

  • Download the videos, video annotations, questions and answers, and object proposals from the official website.
  • Transform the videos into ".png" frames with ffmpeg (see the example command after this list).
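
A minimal sketch of the frame extraction, assuming illustrative input/output paths and frame-naming pattern (adapt them to whatever layout the data-preparation scripts expect):

    # decode one video into sequentially numbered .png frames (paths are placeholders)
    mkdir -p frames/video_00000
    ffmpeg -i videos/video_00000.mp4 frames/video_00000/frame_%05d.png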

Step-by-step Training on ComPhy Dataset

  • Step 1: download the proposals from the region proposal network and extract object trajectories for the train and val sets:
   bash scripts/script_gen_tubes.sh
  • Step 2: train a concept learner with descriptive and explanatory questions for static concepts (i.e., color, shape, and material); an example invocation with the placeholders filled in appears after this list:
   bash scripts/comphy_train_pcr_stage1 <GPU_ID> <DATA_DIR>
  • Step 3: extract static attributes and refine object trajectories. Extract static attributes:
   bash scripts/script_extract_attribute.sh
    Refine object trajectories:
   bash scripts/script_gen_tubes_refine.sh
  • Step 4: train the PCR model for stage-2 learning:
    bash script/script_comphy_train_pcr_stage2.sh <GPU_ID> <DATA_DIR> <STAGE1_MODEL_DIR>
  • Step 5: train the PCR model for stage-3 learning:
    bash script/script_comphy_train_pcr_stage3.sh <GPU_ID> <DATA_DIR> <STAGE2_MODEL_DIR>
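
For example, assuming GPU 0 and ComPhy data stored under /path/to/comphy (both values are placeholders), Step 2 could be launched as:

    bash scripts/comphy_train_pcr_stage1 0 /path/to/comphy

The later stages follow the same pattern, with the extra argument pointing at the checkpoint directory produced by the previous stage.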

Generalization to Real-World Scenario Dataset

Fine-tune the stage-2 model on the real-world dataset by running:

    bash script/script_real_world_dataset_finetune.sh <GPU_ID> <DATA_DIR> <STAGE2_MODEL_DIR>

Citation

If you find this repo useful in your research, please consider citing:

@misc{chen2024compositionalphysicalreasoningobjects,
      title={Compositional Physical Reasoning of Objects and Events from Videos}, 
      author={Zhenfang Chen and Shilong Dong and Kexin Yi and Yunzhu Li and Mingyu Ding and Antonio Torralba and Joshua B. Tenenbaum and Chuang Gan},
      year={2024},
      eprint={2408.02687},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.02687}, 
}
