CPMP: Cyclic Peptide Membrane Permeability Prediction Using Deep Learning Model Based on Molecular Attention Transformer

Abstract

The CPMP model is a deep learning approach for predicting the membrane permeability of cyclic peptides. Built on the Molecular Attention Transformer (MAT) neural network, CPMP achieves high determination coefficients (R²) of 0.67 for PAMPA, 0.75 for Caco-2, 0.62 for RRCK, and 0.73 for MDCK permeability predictions.

Requirements

Python (3.10)
Pandas (2.2.3)
PyTorch (2.5.1)
RDKit (2024.3.2)
Scikit-learn (1.6.1)

Usage

1. Use the trained model to predict your own data.

python predict.py --help
usage: predict.py [-h] [--input_file INPUT_FILE] [--result_file RESULT_FILE]
Predict PAMPA, Caco-2, RRCK and MDCK Membrane Permeability

options:
  -h, --help            show this help message and exit
  --input_file          INPUT_FILE (.csv file)
  --result_file         RESULT_FILE (out file)

The input data input.csv you need to prepare is as follows:

smiles
C/C=C/C[C@@H](C)[C@@H](O)[C@H]1C(=O)N[C@@H](CC)C(=O)N(C)CC(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@H](C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](C(C)C)C(=O)N1C
CC(C)C[C@@H]1NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)[C@@H]2CCCN2C(=O)[C@@H](CC(C)C)NC1=O

You want to save the results to: predict.csv
Run:

python predict.py --input_file input.csv --result_file predict.csv

The obtained results are:

smiles,pampa,caco2,rrck,mdck
C/C=C/C[C@@H](C)[C@@H](O)[C@H]1C(=O)N[C@@H](CC)C(=O)N(C)CC(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@H](C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](C(C)C)C(=O)N1C,-5.85759162902832,-5.69475793838501,-5.773202419281006,-6.25748348236084
CC(C)C[C@@H]1NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)[C@@H]2CCCN2C(=O)[C@@H](CC(C)C)NC1=O,-6.402388572692871,-6.128831386566162,-5.797394752502441,-5.736731052398682

2. Data in the paper

Before running CPMP on the same data used in the paper, the original data needs to be processed.
For example, if you want to preprocess the PAMPA data, run:

cd data/pampa_uff_ig_true
python process_data.py

This takes around five hour on a regular computer. Alternatively, pre-processed data (~6.2GB) can be found here. You can directly download and replace the data directory.

3. Train the CPMP model

For example, if you want to train CPMP model with PAMPA data, run:

python train_pampa.py

4. Use the trained model to predict the test set

For example, if you want to predict PAMPA test data, run:

python predict_pampa.py

5. Baseline

The implementation scripts for the baseline methods can be found at this path: baselines/

MAT

The CPMP model is built based on the MAT framework. You can find more information about MAT here.

License

CPMP is released under an Apache v2.0 license.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
baselines		baselines
data		data
featurization		featurization
model		model
plot		plot
saved_model		saved_model
train_caco2		train_caco2
train_mdck_with_pretrained_caco2		train_mdck_with_pretrained_caco2
train_pampa		train_pampa
train_rrck_with_pretrained_caco2		train_rrck_with_pretrained_caco2
utils		utils
LICENSE		LICENSE
README.md		README.md
input.csv		input.csv
predict.csv		predict.csv
predict.py		predict.py
predict_caco2.py		predict_caco2.py
predict_mdck.py		predict_mdck.py
predict_pampa.py		predict_pampa.py
predict_rrck.py		predict_rrck.py
train_caco2.py		train_caco2.py
train_mdck_with_pretrained_caco2.py		train_mdck_with_pretrained_caco2.py
train_pampa.py		train_pampa.py
train_rrck_with_pretrained_caco2.py		train_rrck_with_pretrained_caco2.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CPMP: Cyclic Peptide Membrane Permeability Prediction Using Deep Learning Model Based on Molecular Attention Transformer

Abstract

Requirements

Usage

1. Use the trained model to predict your own data.

2. Data in the paper

3. Train the CPMP model

4. Use the trained model to predict the test set

5. Baseline

MAT

License

About

Uh oh!

Releases

Packages

Languages

License

iamgroot42/CPMP

Folders and files

Latest commit

History

Repository files navigation

CPMP: Cyclic Peptide Membrane Permeability Prediction Using Deep Learning Model Based on Molecular Attention Transformer

Abstract

Requirements

Usage

1. Use the trained model to predict your own data.

2. Data in the paper

3. Train the CPMP model

4. Use the trained model to predict the test set

5. Baseline

MAT

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages