[CVPR 2026] The Blind Spot of Adaptation: Quantifying and Mitigating Forgetting in Fine-tuned Driving Models
Runhao Mao* Hanshi Wang* Yixiang Yang Qianli Ma Jingmeng Zhou Zhipeng Zhang✉
AutoLab, School of Artificial Intelligence, Shanghai Jiao Tong University
* Equal contribution
✉ Corresponding author
The first systematic benchmark and mitigation framework for catastrophic forgetting in VLM-centric autonomous driving.
- [2026.04.07] 🎉 🎉 Code, paper and dataset are released.
- [2026.02.21] 🎉 🎉 Our paper has been accepted by CVPR 2026!
Vision-Language Models bring strong world knowledge and long-tail generalization to autonomous driving, but standard fine-tuning can silently destroy these capabilities. FidelityAD studies this blind spot systematically by introducing a dedicated forgetting benchmark and a mitigation framework tailored for driving VLMs.
We build Fidelity Driving Bench, a large-scale benchmark for quantifying forgetting in autonomous driving, and propose Drive Expert Adapter (DEA), which shifts adaptation from destructive weight updates to prompt-level and expert-level routing. Extensive experiments show that DEA improves downstream driving performance while better preserving pretrained knowledge.
Our main contributions are summarized as follows:
- We provide the first systematic investigation of catastrophic forgetting in VLM-centric autonomous driving.
- We introduce Fidelity Driving Bench, a large-scale benchmark built from 180K scenes and 900K QA pairs across 15 data sources.
- We propose DEA, a new framework with a Prompt Adapter and a Task-Adaptive Expert Module for scene-aware knowledge routing.
- We demonstrate that DEA mitigates forgetting while maintaining strong performance on driving-specific tasks.
DEA learns prompt-level task priors and retrieves the most relevant prompt tokens according to the input question, helping the model adapt without overwriting core parameters.
DEA further introduces a scene-aware expert routing mechanism that dynamically selects suitable driving experts according to prompt semantics and scene-specific cues.
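The two mechanisms above can be sketched in a few lines. This is an illustrative toy only: the function names, the cosine-similarity key matching, and the softmax gate are assumptions for exposition, not the released DEA implementation.

```python
import numpy as np

def topk_prompt_retrieval(query, prompt_keys, prompt_tokens, k=2):
    """Select the k prompt-token groups whose learned keys best match the
    question embedding (cosine similarity), mimicking prompt-level
    task-prior retrieval without touching backbone weights."""
    q = query / np.linalg.norm(query)
    keys = prompt_keys / np.linalg.norm(prompt_keys, axis=1, keepdims=True)
    sims = keys @ q
    idx = np.argsort(sims)[::-1][:k]          # indices of best-matching priors
    return prompt_tokens[idx], idx

def route_experts(prompt_feat, gate_weights, temperature=1.0):
    """Scene-aware routing: a softmax gate over driving experts,
    conditioned on the retrieved prompt semantics."""
    logits = gate_weights @ prompt_feat / temperature
    e = np.exp(logits - logits.max())         # numerically stable softmax
    return e / e.sum()
```

In this sketch, adaptation lives in the prompt keys/tokens and the gate weights, so the backbone's pretrained parameters are never overwritten.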
Fidelity Driving Bench shows that many existing driving VLMs suffer substantial forgetting after adaptation. On the Qwen2.5VL-3B backbone, DEA achieves stronger task performance with better knowledge retention than full fine-tuning.
| Method | KRR | SD | T-QA | NoPR |
|---|---|---|---|---|
| Base (Qwen2.5VL-3B) | - | 56.6 | 28.7 | 36.8 |
| ImpromptuVLA-3B | 68.4% | 59.1 | 33.0 | 25.2 |
| DEA (Base + TAEM + PA) | 79.0% | 58.8 | 41.0 | 29.0 |

KRR: Knowledge Retention Rate; SD: Scene Description; T-QA: Traffic-QA; NoPR: Noteworthy Objects' Perception Recall.
Benchmark statistics:

- 180K training scenes
- 900K language QA pairs
- 15 source datasets
- 1,000 manually verified long-tail test images
Task types:

- Scene Description
- Traffic-QA
- Noteworthy Objects' Perception
Evaluation metrics:

- LLM-as-Judge (GPT Score)
- Noteworthy Objects' Perception Recall (NoPR)
- Knowledge Retention Rate (KRR)
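As a rough illustration of the retention metrics, assuming KRR is a post- vs. pre-adaptation score ratio on general (non-driving) benchmarks and NoPR is the fraction of annotated noteworthy objects the model recovers; these are hypothetical definitions for intuition and the paper's exact formulas may differ:

```python
def knowledge_retention_rate(score_after, score_before):
    """Toy KRR (%): adapted model's score on general benchmarks divided by
    the pretrained model's score; 100 means no forgetting.
    Illustrative definition only."""
    return 100.0 * score_after / score_before

def noteworthy_object_recall(n_mentioned, n_annotated):
    """Toy NoPR (%): fraction of annotated noteworthy objects that the
    model's answer mentions. Illustrative definition only."""
    return 100.0 * n_mentioned / n_annotated
```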
The training pipeline can be launched with the provided shell script. A typical workflow is:
- Clone the repository.
- Create and activate a conda environment.
- Install the training dependencies.
- Update the paths in `train/train_DEA.sh`.
- Run the training script.
```shell
git clone https://github.com/AutoLab-SAI-SJTU/FidelityDrivingBench.git
cd FidelityDrivingBench
conda create -n fidelityad python=3.10
conda activate fidelityad
pip install -r requirements.txt
cd train
# Please update the dataset path, checkpoint path, and output path in train_DEA.sh first.
sh train_DEA.sh
```

The evaluation service is designed as a local API server. You can start it with the following steps:
- Install the evaluation dependencies.
- Enter the `eval` directory.
- Update the paths in `eval/app.py`.
- Launch the FastAPI service with `uvicorn`.
- Submit a `.jsonl` file for scoring.
```shell
pip install -r requirements_eval.txt
cd eval
uvicorn app:app --host 0.0.0.0 --port 10086 --reload
```

The service exposes three endpoints:

- `gpt_score`: returns the GPT-based score.
- `gpt_eval`: returns the NoPR score.
- `gpt_acc`: returns the Traffic-QA accuracy.
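The same upload can also be scripted from Python using only the standard library. This is a sketch: the `file` and `output_name` form fields mirror the curl example in this README, while the assumption that the service responds with JSON is mine.

```python
import json
import urllib.request
import uuid

SERVER = "http://localhost:10086"  # match the uvicorn host/port above

def encode_multipart(file_name, file_bytes, fields):
    """Build a multipart/form-data body for the upload (stdlib only)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\nContent-Disposition: form-data; "
             f'name="{name}"\r\n\r\n{value}\r\n').encode()
        )
    header = (
        f"--{boundary}\r\nContent-Disposition: form-data; "
        f'name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), boundary

def submit(jsonl_path, output_name="result.jsonl", endpoint="gpt_score"):
    """POST a .jsonl prediction file to the scoring service."""
    with open(jsonl_path, "rb") as f:
        body, boundary = encode_multipart(jsonl_path, f.read(),
                                          {"output_name": output_name})
    req = urllib.request.Request(
        f"{SERVER}/{endpoint}", data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # assumes the service returns JSON
```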
Replace the input file path and server address with your own environment before running; note that the port must match the one passed to `uvicorn` (`10086` in the command above):

```shell
curl -F "file=@/path/to/test_input.jsonl" \
     -F "output_name=result.jsonl" \
     http://<server-ip>:10086/gpt_score
```

- Release paper
- Release dataset
- Release code
- Release trained models
If you find this work useful, please consider citing:
@inproceedings{mao2026blindspot,
title={The Blind Spot of Adaptation: Quantifying and Mitigating Forgetting in Fine-tuned Driving Models},
author={Mao, Runhao and Wang, Hanshi and Yang, Yixiang and Ma, Qianli and Zhou, Jingmeng and Zhang, Zhipeng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}
