We introduce the APA Benchmark, a collection of images of Actors, Politicians, and Athletes paired with a series of text prompts. The benchmark is a tool for practitioners and researchers who are considering vision-language models (VLMs) for people-centric tasks. It tests how well VLMs associate text with pictures of public figures in three domains: Athletics, Politics, and Acting. We issue text prompts against photos of famous people in each domain and score each VLM according to its matching capabilities. Images are mostly sourced from Wikimedia Commons and Wikipedia, which means they are either in the public domain or carry a license that permits redistribution. Other metadata is manually curated from Wikipedia or from official sources.
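To make the idea of matching prompts against a photo concrete, here is a minimal sketch of CLIP-style scoring. It is an illustration under stated assumptions, not the benchmark's actual scoring code: the embeddings are made-up toy vectors, the prompt strings are hypothetical, and the `temperature` of 100 is an assumed logit scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_prompts(image_embedding, prompt_embeddings, temperature=100.0):
    """Return softmax probabilities over the prompts for one image.

    `temperature` is an assumed logit scale, not a value taken from the benchmark.
    """
    logits = [
        temperature * cosine_similarity(image_embedding, p)
        for p in prompt_embeddings
    ]
    # Subtract the maximum logit for numerical stability before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings (hypothetical); a real run would take these from a VLM encoder.
image = [0.9, 0.1, 0.0]
prompts = {
    "a photo of an athlete": [1.0, 0.0, 0.0],
    "a photo of a politician": [0.0, 1.0, 0.0],
    "a photo of an actor": [0.0, 0.0, 1.0],
}

probs = match_prompts(image, list(prompts.values()))
best = max(zip(prompts, probs), key=lambda pair: pair[1])[0]
print(best)
```

The prompt with the highest probability is taken as the model's match for the image; aggregating such matches over many images is one plausible way to arrive at a per-model score.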
If needed, create a new conda environment with Python 3.10:
conda create -n apa_benchmark python=3.10
conda activate apa_benchmark

Install the following Python packages:
- pandas
- pytorch==1.7.1
- torchvision
- cudatoolkit==11.0
- ftfy
- regex
- tqdm
- transformers
- accelerate
Note: Gemma 3 is supported starting from transformers==4.50.0.
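If you are unsure whether your environment meets this requirement, a small check like the following can help. It is a sketch only: the helper names `parse_release`, `meets_minimum`, and `transformers_supports_gemma3` are hypothetical, and the comparison ignores pre-release suffixes such as `.dev0`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn '4.50.0' into (4, 50, 0), stopping at the first non-numeric piece."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="4.50.0"):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def transformers_supports_gemma3():
    """Return True if the installed transformers is at least 4.50.0."""
    try:
        return meets_minimum(version("transformers"))
    except PackageNotFoundError:
        # transformers is not installed in this environment.
        return False


print("Gemma 3 supported:", transformers_supports_gemma3())
```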
Make sure you have the latest version of CLIP installed:
pip install git+https://github.com/openai/CLIP.git

Run the CLIP benchmark scripts:
python clip/base-benchmark.py
python clip/pks_benchmark.py
python clip/pks_benchmark_rev.py

Run the Gemma 3 benchmark scripts:
python gemma3/base-benchmark.py
python gemma3/pks_benchmark.py
python gemma3/pks_benchmark_rev.py

These scripts summarize benchmark results from different analytical perspectives.
python eval/exp1/cal_three_mean_actor.py
python eval/exp1/cal_three_mean_politician.py
python eval/exp1/cal_three_mean_athletes.py

python eval/exp2/cal_bias_actor_athlete.py
python eval/exp2/cal_bias_politician.py

python eval/exp3/cal_image_score_mean.py
python eval/exp3/cal_text_score_mean.py

If you use this work, please cite us using the following BibTeX entry:
@misc{apabench,
author = {Yuri Ishitoya and Veronica Flores and Ziyan Yang and Paola Cascante-Bonilla and Vicente Ordonez},
title = {The APA Benchmark: A People-centric Benchmark for Testing Vision-Language Models},
year = {2025},
howpublished = {\url{https://github.com/uvavision/apa-bench}}
}

This project is licensed under the MIT License.