Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei1, Chenyu Lin2, Yu Qiu3, Yaonan Wang1, Hui Zhang1, Ziyang Wang4, Dong Dai4

1 Hunan University, 2 Nankai University, 3 Hunan Normal University, 4 Tianjin Medical University Cancer Institute and Hospital

arXiv License: MIT

Introduction

Lung cancer is a leading cause of cancer-related deaths globally. PET-CT is crucial for imaging lung tumors, providing essential metabolic and anatomical information, but it faces challenges such as poor image quality, motion artifacts, and complex tumor morphology. Deep learning-based models are expected to address these problems; however, existing small-scale and private datasets limit significant performance improvements for these methods. Hence, we introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. Furthermore, we propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images. Specifically, we design a channel-wise rectification module (CRM) that applies a channel state space block across multi-modal features to learn correlated representations and filter out modality-specific noise. A dynamic cross-modality interaction module (DCIM) is designed to effectively integrate position and context information: it employs PET images to learn regional position information, which serves as a bridge to assist in modeling the relationships between local features of CT images. Extensive experiments on a comprehensive benchmark demonstrate the effectiveness of CIPA compared to current state-of-the-art segmentation methods. We hope our research provides more exploration opportunities for medical image segmentation.

CIPA

Environment

  1. Create environment.

    conda create -n MIPA python=3.10
    conda activate MIPA
  2. Install all dependencies. Install PyTorch (with CUDA and cuDNN), then install the remaining dependencies via:

    pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
    pip install -r requirements.txt
  3. Install selective_scan_cuda_core.

    cd models/encoders/selective_scan
    pip install .
    cd ../../..
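After step 3, a quick sanity check can confirm that the key packages are discoverable. This snippet is a convenience sketch and not part of the official code; the module name selective_scan_cuda_core follows the install step above.

```python
# Minimal environment sanity check (illustrative, not part of the repo).
import importlib.util

def module_available(name: str) -> bool:
    """Return True if `name` can be located on the current Python path."""
    return importlib.util.find_spec(name) is not None

if __name__ == "__main__":
    # Packages installed in steps 2-3; selective_scan_cuda_core comes
    # from the `pip install .` run in the selective_scan directory.
    for mod in ("torch", "torchvision", "selective_scan_cuda_core"):
        print(f"{mod}: {'ok' if module_available(mod) else 'MISSING'}")
```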

PCLT20K

Please contact Jie Mei (jiemei AT hnu.edu.cn) for the dataset; we will get back to you shortly. Your email should contain the following information. Note: for better academic communication, a real-name system is encouraged, and your email suffix must match your affiliation (e.g., hello@hnu.edu.cn); if it does not, please explain why.

Name: (Tell us who you are.)
Affiliation: (The name/url of your institution or university, etc.)
Job Title: (E.g., Professor, Associate Professor, PhD, etc.)
Email: (Dataset will be sent to this email.)
How to use: (Only for non-commercial use.)

Data Preparation

  1. For our dataset PCLT20K, we organize the dataset folder in the following structure:

    <PCLT20K>
        |-- <0001>
            |-- <name1_CT.png>
            |-- <name1_PET.png>
            |-- <name1_mask.png>
            ...
        |-- <0002>
            |-- <name2_CT.png>
            |-- <name2_PET.png>
            |-- <name2_mask.png>
            ...
        ...
        |-- train.txt
        |-- test.txt

    train.txt/test.txt contains the names of items in training/testing set, e.g.:

    <name1>
    <name2>
    ...
  2. Please place the dataset in the data directory.
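As an illustration of the layout above, a minimal loader might resolve the split files and per-item file triples as follows. The function names and search strategy are hypothetical; only the file-naming pattern (name_CT.png, name_PET.png, name_mask.png) and the train.txt/test.txt split files come from the structure shown.

```python
# Sketch of reading the PCLT20K layout described above (illustrative only).
from pathlib import Path

def read_split(root: str, split: str) -> list[str]:
    """Read item names from train.txt or test.txt at the dataset root."""
    with open(Path(root) / f"{split}.txt") as f:
        return [line.strip() for line in f if line.strip()]

def triple_paths(root: str, name: str):
    """Find the (CT, PET, mask) PNG triple for one item across patient folders."""
    for patient in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        ct = patient / f"{name}_CT.png"
        if ct.exists():
            return ct, patient / f"{name}_PET.png", patient / f"{name}_mask.png"
    raise FileNotFoundError(f"{name} not found under {root}")
```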

Usage

Training

  1. Please download the pretrained VMamba weights and put them under pretrained/vmamba/. We use VMamba_Tiny by default.

  2. Config setting.

    Edit the config in train.py. Set C.backbone to sigma_tiny / sigma_small / sigma_base to select one of the three VMamba variants.

  3. Run multi-GPU distributed training:

    torchrun --nproc_per_node <num_gpus> train.py
  4. You can also use single-GPU training:

    python train.py
  5. Results will be saved in the save_model folder.

Testing

The pretrained model of CIPA (CIPA.pth) can be downloaded. To test, run:

python pred.py

Citation

If you are using the code/model provided here in a publication, please consider citing:

@inproceedings{mei2025cross,
  title={Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images},
  author={Mei, Jie and Lin, Chenyu and Qiu, Yu and Wang, Yaonan and Zhang, Hui and Wang, Ziyang and Dai, Dong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}

Contact

For any questions, please contact me via e-mail: jiemei AT hnu.edu.cn.

Acknowledgment

This project is based on VMamba and Sigma; thanks for their excellent work.

About

The official code of "Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images", CVPR 2025
