NJUNLP/LongCoT-Internal-Bias

The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models

Overview

This repository contains the evaluation code for our paper, The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models.

Reasoning models often exhibit overthinking, characterized by redundant reasoning steps. We identify internal bias elicited by the input question as a key trigger of such behavior. Upon encountering a problem, the model immediately forms a preliminary guess about the answer, which we term an internal bias since it may not be explicitly generated, and it arises without systematic reasoning. When this guess conflicts with its subsequent reasoning, the model tends to engage in excessive reflection, resulting in wasted computation. We validate the association between internal bias and overthinking across multiple models and diverse reasoning tasks. To demonstrate the causal relationship more rigorously, we conduct two counterfactual interventions, showing that removing the input question after the model reduces the redundant reasoning across various complex reasoning tasks, and manually injecting bias affects overthinking accordingly. Further interpretability experiments suggest that excessive attention to the input question serves as a key mechanism through which internal bias influences subsequent reasoning trajectories. Finally, we evaluate several methods aimed at mitigating overthinking, yet the influence of internal bias persists under all conditions.
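
To make the association described above concrete, the following is a minimal sketch, not code from this repository. It assumes each evaluated question yields a direct guess (a stand-in for the internal bias), a final reasoned answer, and a reasoning length in tokens. The `Sample` class, its field names, and `mean_length_by_agreement` are hypothetical, and the numbers are synthetic placeholders rather than results from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One evaluated question. All field names here are hypothetical."""
    direct_guess: str      # answer elicited without reasoning (proxy for internal bias)
    reasoned_answer: str   # final answer after the full reasoning trace
    reasoning_tokens: int  # length of the reasoning trace in tokens


def mean_length_by_agreement(samples):
    """Mean reasoning length, split by whether the guess matches the final answer."""
    groups = {"consistent": [], "conflicting": []}
    for s in samples:
        same = s.direct_guess.strip() == s.reasoned_answer.strip()
        groups["consistent" if same else "conflicting"].append(s.reasoning_tokens)
    # An empty group has no mean, so report None for it.
    return {name: (mean(vals) if vals else None) for name, vals in groups.items()}


# Synthetic placeholder values for illustration only.
toy = [
    Sample("12", "12", 100),
    Sample("7", "9", 300),
    Sample("5", "5", 200),
    Sample("3", "4", 500),
]
print(mean_length_by_agreement(toy))
```

A longer average trace in the conflicting group would be in line with the association the paper reports, although a comparison like this is only correlational and does not replace the counterfactual interventions.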

Code

allFilesDesc.md contains a detailed description of each .py file in this repository.

Citation

@misc{dang2025impressionprobleminternalbias,
      title={The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models}, 
      author={Renfei Dang and Zhening Li and Shujian Huang and Jiajun Chen},
      year={2025},
      eprint={2505.16448},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2505.16448}, 
}
