Source code and benchmark for the paper Dynamics-Inspired Generative Discovery of Allosteric Ligands Reveals HCAR1 as a Therapeutic Target in Inflammation
The DynAlloBind model was developed by fine-tuning the original DynamicBind framework (Lu et al., 2024) on an augmented training set. We began with the complete DynamicBind dataset and extended it by applying the same data collection pipeline to all relevant Protein Data Bank (PDB) depositions through the end of 2023. This extended set was then supplemented with a small, curated collection of additional GPCR–ligand complexes to broaden GPCR coverage.
Create a new environment for inference. From the project directory, run:
conda env create -f environment.yml
Or set it up step by step:
conda create -n dynamicbind python=3.10
Activate the environment
conda activate dynamicbind
Install
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install -c conda-forge rdkit
conda install pyg pyyaml biopython -c pyg
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.0.0+cu117.html
pip install e3nn fair-esm spyrmsd
Create a new environment for structural relaxation:
conda create --name relax python=3.8
Activate the environment
conda activate relax
Install
conda install -c conda-forge openmm pdbfixer libstdcxx-ng openmmforcefields openff-toolkit ambertools=22 compilers biopython
Download and unzip the workdir.zip containing the model checkpoint from https://drive.google.com/file/d/1-GTtaEFavlYkPpsC9S60HzmZ6TjO3kj4/view?usp=drive_link
By default, 40 poses are predicted and ranked (rank1 is the best-scoring pose, rank40 the lowest), and the relaxation step is included.
- Protein (PDB File): protein.pdb - Automatically cleaned to remove non-standard amino acids, water molecules, or small molecules.
- Ligand (CSV File): ligand.csv - Must contain a column named 'ligand' listing SMILES strings.
- Number of Animations: --savings_per_complex (default is 40)
  - outputs intermediate pkl data, not the final animation PDB.
- Frames in Animation/inference_steps: --inference_steps
  - default is 20.
- --header: Name of the result folder.
- --device: GPU device ID.
- --python: Python environment for inference.
- --relax_python: Python environment for relaxation.
- --num_workers: Number of processes for final output relaxation.
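The ligand input described above can be prepared programmatically. The following is a minimal sketch that assumes only what this README states, namely that the CSV needs a column named 'ligand' holding SMILES strings. The helper names `write_ligand_csv` and `read_ligand_csv` are hypothetical and are not part of this repository.

```python
import csv
from pathlib import Path


def write_ligand_csv(smiles_list, csv_path):
    """Write SMILES strings to a CSV file with a single 'ligand' column.

    Hypothetical helper: only the column name 'ligand' comes from the README.
    """
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ligand"])  # header name required by the README
        for smiles in smiles_list:
            writer.writerow([smiles])
    return csv_path


def read_ligand_csv(csv_path):
    """Read the 'ligand' column back, failing loudly if it is missing."""
    with Path(csv_path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or "ligand" not in reader.fieldnames:
            raise ValueError("CSV must contain a column named 'ligand'")
        return [row["ligand"] for row in reader]
```

For example, `write_ligand_csv(["CCO", "c1ccccc1"], "ligand.csv")` writes a two-ligand file (ethanol and benzene, used here purely as placeholders), and `read_ligand_csv("ligand.csv")` can serve as a quick sanity check before launching inference.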
python run_single_protein_inference.py ./data/8y6y.pdb ./data/8y6y.csv --savings_per_complex 40 --inference_steps 20 --header test --device $1 --python /path/to/dynamicbind/python --relax_python /path/to/relax/python
The results of the docking step, typically found in the results/test folder, include:
- Affinity Score for Each Complex: affinity_prediction.csv
- Pose Score and Conformation of Each Animation: Example files like rank1_ligand_lddt0.63_affinity5.67_relaxed.sdf (where 0.63 is the pose score) and corresponding protein .pdb files.
- Data for Animation Generation: Such as rank1_reverseprocess_data_list.pkl and rank2_reverseprocess_data_list.pkl.
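Because the rank, pose score, and predicted affinity are encoded in the pose file names, they can be recovered without opening the files. The sketch below assumes that every relaxed pose file follows the pattern of the single example given above (`rank<N>_ligand_lddt<score>_affinity<value>_relaxed.sdf`). `PoseInfo` and `parse_pose_filename` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the one example file name in this README; other
# naming variants, if they exist, are not covered by this sketch.
_POSE_PATTERN = re.compile(
    r"^rank(?P<rank>\d+)_ligand_lddt(?P<lddt>\d+(?:\.\d+)?)"
    r"_affinity(?P<affinity>-?\d+(?:\.\d+)?)_relaxed\.sdf$"
)


class PoseInfo(NamedTuple):
    rank: int        # 1 is the best-scoring pose
    lddt: float      # pose score embedded in the file name
    affinity: float  # predicted affinity embedded in the file name


def parse_pose_filename(name: str) -> Optional[PoseInfo]:
    """Return the fields encoded in a pose file name, or None if it does not match."""
    match = _POSE_PATTERN.match(name)
    if match is None:
        return None
    return PoseInfo(
        rank=int(match["rank"]),
        lddt=float(match["lddt"]),
        affinity=float(match["affinity"]),
    )
```

Applied to the example name above, `parse_pose_filename` yields rank 1, pose score 0.63, and affinity 5.67. Names that do not fit the pattern return `None` instead of raising.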
Inputs:
- Data from Docking Output: Indicated by paths like results/test/index0_idx_0/. The notation "1+2" implies that movies for rank1 and rank2 poses are needed.
- Number of Animations: Specified by the user (default is "1").
python movie_generation.py results/test/index0_idx_0/ 1+2 --device $1 --python /path/to/dynamicbind/python --relax_python /path/to/relax/python
Outputs:
- Final Animation PDB Files: Located in results/test_8y6y/index0_idx_0/, with files like rank1_receptor_reverseprocess_relaxed.pdb and rank1_ligand_reverseprocess_relaxed.pdb.
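The "1+2" rank notation and the resulting file names can be generated from a list of ranks. This is a small sketch that assumes the output names for other ranks follow the same pattern as the rank1 examples listed above. `rank_spec` and `expected_movie_files` are hypothetical helpers, not functions provided by movie_generation.py.

```python
def rank_spec(ranks):
    """Join pose ranks into the '+'-separated form, e.g. [1, 2] -> '1+2'."""
    return "+".join(str(rank) for rank in ranks)


def expected_movie_files(ranks):
    """List the animation PDB names expected for each requested rank.

    Assumption: names for every rank mirror the rank1 examples in this README.
    """
    files = []
    for rank in ranks:
        files.append(f"rank{rank}_receptor_reverseprocess_relaxed.pdb")
        files.append(f"rank{rank}_ligand_reverseprocess_relaxed.pdb")
    return files
```

A list such as `expected_movie_files([1, 2])` can then be checked against the contents of the output folder to confirm that movie generation completed for every requested pose.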
Example command for HTS:
python run_single_protein_inference.py protein.pdb ligand.csv --hts --savings_per_complex 3 --inference_steps 20 --header test --device $1 --python /path/to/dynamicbind/python --relax_python /path/to/relax/python
HTS Output files:
- complete_affinity_prediction.csv
- affinity_prediction.csv
- Use AlphaFold2/3 to predict the protein structure.
- Use the predicted protein structure to run DynAlloBind for prediction as described above.
- Compare the results with the ground truth to calculate the RMSD.
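The final comparison step can be illustrated with a plain coordinate RMSD. The sketch below is not necessarily the evaluation protocol used in the paper. It assumes that the predicted and ground-truth ligand coordinates have already been extracted, list the same atoms in the same order, and sit in the same reference frame (for example, after aligning the predicted protein onto the experimental one). It performs no superposition and no symmetry correction; for symmetric ligands, a symmetry-aware tool such as the spyrmsd package installed above may be more appropriate.

```python
import math


def rmsd(coords_pred, coords_ref):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes identical atom ordering and a shared reference frame; no
    alignment or symmetry correction is applied.
    """
    if len(coords_pred) != len(coords_ref):
        raise ValueError("coordinate lists must have the same length")
    if not coords_pred:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_pred, coords_ref):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_pred))
```

With two atoms where one is displaced by 2 Å along z, `rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)])` evaluates to the square root of 2, since the mean squared deviation is 4 / 2.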
@article{lu2024dynamicbind,
title={DynamicBind: predicting ligand-specific protein-ligand complex structure with a deep equivariant generative model},
author={Lu, Wei and Zhang, Jixian and Huang, Weifeng and Zhang, Ziqiao and Jia, Xiangyu and Wang, Zhenyu and Shi, Leilei and Li, Chengtao and Wolynes, Peter G and Zheng, Shuangjia},
journal={Nature Communications},
volume={15},
number={1},
pages={1071},
year={2024},
publisher={Nature Publishing Group UK London}
}