Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model

This repository is the official implementation of "Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model".

Paper: 📖 Arxiv
Model: 🤗 Affordance-R1

News

[Dec 21st, 2025] 🔥 ReasonAff is coming! We have released the original dataset. As stated in the appendix of the paper, we will filter the test data and provide a cleaner dataset soon. Stay tuned!

[Aug 11th, 2025] 🔥 Affordance-R1 is coming! We have released the code!

Performance of Affordance-R1:

Affordance-R1 demonstrates strong affordance reasoning and generalization ability.

Model

Affordance-R1 framework overview. The model processes queries through policy-based reasoning with <think> and <rethink> stages to generate affordance predictions. The policy optimization uses a reward system comprising (a) format rewards for reasoning structure, (b) perception rewards for spatial accuracy (Box-Num, IoU, L1), and (c) recognition rewards for semantic similarity, enabling effective GRPO-based training for affordance reasoning.
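
To make the reward description above more concrete, here is a minimal sketch of what a format reward, two of the perception terms (Box-Num and IoU), and a GRPO-style group normalization could look like. It is not the code from this repo: the function names, the `<answer>` tag, the `(x1, y1, x2, y2)` box layout, and the 0/1 reward values are all assumptions made for illustration. The L1 and recognition rewards are not covered.

```python
import re
from statistics import mean, pstdev

# Assumed output layout (hypothetical; only <think> and <rethink> are named
# in the overview): <think>...</think><rethink>...</rethink><answer>...</answer>
FORMAT_RE = re.compile(
    r"^\s*<think>.+?</think>\s*<rethink>.+?</rethink>\s*<answer>.+?</answer>\s*$",
    re.DOTALL,
)


def format_reward(text: str) -> float:
    """Return 1.0 if the response follows the assumed tag layout, else 0.0."""
    return 1.0 if FORMAT_RE.match(text) else 0.0


def box_iou(a, b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_num_reward(pred_boxes, gt_boxes) -> float:
    """Return 1.0 if the number of predicted boxes matches the ground truth."""
    return 1.0 if len(pred_boxes) == len(gt_boxes) else 0.0


def group_relative_advantages(rewards, eps: float = 1e-6):
    """GRPO-style advantages: normalize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, each sampled response for a query would receive a scalar reward (for example, a sum of the individual terms), and `group_relative_advantages` would turn the rewards of one group of samples into the relative scores that GRPO optimizes. How the actual implementation weights and combines the terms is defined in the training code, not here.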

Visualization on Web Image

Affordance-R1 can understand complex scenarios and shows good generalization.

Installation

git clone https://github.com/hq-King/Affordance-R1.git
cd Affordance-R1
conda create -n Affordance-R1 python=3.12
conda activate Affordance-R1
pip install torch==2.6.0 torchvision==0.21.0
pip install -e .
pip install gensim

Inference

Download the pretrained model: 🤗 Affordance-R1. Then modify the model path in inference_scripts/infer.py and run the following command:

python inference_scripts/infer.py 

And you will get results like this:

Dataset

Download our ReasonAff dataset here. As mentioned in the paper, some of the ground truth in the original dataset is coarse. We are filtering the test split and will release the cleaned version soon. Stay tuned!

Training

Download the pretrained models: Qwen2.5-VL-7B and SAM2. Modify the paths in training_scripts/aff_r1.sh and training_scripts/aff_r1.yaml, then run the following command to start training:

bash training_scripts/run_aff_r1.sh

After training, run the following command to merge the model:

python3 training_scripts/model_merger.py --local_dir [path_to_your_actor_checkpoint]

Evaluation

Download the dataset, modify the dataset path in evaluation_scripts/eval_aff_r1.sh, and then run:

bash evaluation_scripts/eval_aff_r1.sh

Acknowledgement

We would like to thank the following repos for their great work:

Citation

@article{wang2025affordance,
  title={Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model},
  author={Wang, Hanqing and Wang, Shaoyang and Zhong, Yiming and Yang, Zemin and Wang, Jiamin and Cui, Zhiqing and Yuan, Jiahao and Han, Yifan and Liu, Mingyu and Ma, Yuexin},
  journal={arXiv preprint arXiv:2508.06206},
  year={2025}
}
