Official code of the paper "MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models"

YuChuang1205/MIND

MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models

Chuang Yu1,2,5,  Jinmiao Zhao1,2,  Mingxuan Zhao3,  Yunpeng Liu1*,  Xiujun Shu4,
Yuanhao Feng4,  Bo Wang4,  Xiangyu Yue5*

1 Shenyang Institute of Automation, Chinese Academy of Sciences
2 University of Chinese Academy of Sciences   
3 HKUST(GZ)    4 Tencent    5 MMLab, The Chinese University of Hong Kong

💥 Abstract

Recently, multimodal large language models (MLLMs) have been widely applied to reasoning tasks. However, they suffer from limited multi-rationale semantic modeling and insufficient logical robustness, and they are susceptible to misleading interpretations in complex scenarios. Therefore, we propose a Multi-rationale INtegrated Discriminative (MIND) reasoning framework, which is designed to endow MLLMs with the human-like cognitive abilities of “Understand → Rethink → Correct” and achieves a paradigm evolution from passive imitation-based reasoning to active discriminative reasoning. Specifically, we introduce a Rationale Augmentation and Discrimination (RAD) paradigm, which automatically and efficiently expands existing datasets by generating diverse rationales, providing a unified and extensible data foundation. Meanwhile, we design a Progressive Two-stage Correction Learning (P2CL) strategy: the first stage enhances multi-rationale positive learning, while the second stage enables active logic discrimination and correction. In addition, to mitigate representation entanglement in the multi-rationale semantic space, we propose a Multi-rationale Contrastive Alignment (MCA) optimization strategy, which achieves semantic aggregation of correct reasoning and boundary separation of incorrect reasoning. Extensive experiments demonstrate that the proposed MIND reasoning framework achieves state-of-the-art (SOTA) performance on multiple public datasets covering scientific, commonsense, and mathematical scenarios. It provides a new perspective for advancing MLLMs towards higher levels of cognitive intelligence.
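The idea behind contrastive alignment (pulling correct rationales together and pushing incorrect ones away) can be illustrated with a generic InfoNCE-style loss. The sketch below is not the MCA objective from the paper, whose exact form is not given here; the function names, the use of cosine similarity, the toy 2-D embeddings, and the `temperature` value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_positive_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss with several positives (hypothetical helper).

    anchor:    embedding of the reference (assumed: e.g. a question or answer).
    positives: embeddings of correct rationales (to be pulled toward the anchor).
    negatives: embeddings of incorrect rationales (to be pushed away).
    """
    # Each positive competes against the same pool of negatives.
    neg_sum = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    losses = []
    for p in positives:
        pos_term = math.exp(cosine(anchor, p) / temperature)
        losses.append(-math.log(pos_term / (pos_term + neg_sum)))
    # Average over positives so the loss does not grow with their count.
    return sum(losses) / len(losses)


# Toy example: correct rationales lie near the anchor, incorrect ones far away.
anchor = [1.0, 0.0]
correct = [[0.9, 0.1], [1.0, 0.2]]
incorrect = [[-1.0, 0.0], [0.0, 1.0]]

aligned = multi_positive_contrastive_loss(anchor, correct, incorrect)
entangled = multi_positive_contrastive_loss(anchor, incorrect, correct)
```

In this toy setup `aligned` is much smaller than `entangled`, which is the behavior a contrastive objective rewards: the loss drops as correct rationales cluster around the anchor and incorrect ones move away from it.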

🚀 Overview


[Figure: MIND Overview]

✅ TODO List

We are finalizing the release of the paper, datasets, and code, and aim to complete it as soon as possible. Please stay tuned! ⚡⚡⚡

  • Release paper. [Paper/arXiv]
  • Release training and inference code.
  • Release ScienceQA-RAD, V-OKVQA-RAD, and M3CoT-RAD datasets.
  • Release model weights.
