microsoft/Mu-Protein

Introduction

μProtein is a general framework designed to accelerate protein engineering by integrating μFormer, a deep learning model for accurate mutational effect prediction, with μSearch, a reinforcement learning algorithm tailored for efficient navigation of the protein fitness landscape.

For more details, please refer to our paper in Nature Machine Intelligence.

This repository contains the following components:

  • pmlm/ – Protein language model pretraining
  • mu-former/ – Fitness landscape modeling using the pretrained protein language model
  • mu-search/ – Navigating the constructed fitness landscape oracle
  • pretrained/ – Pretrained PMLM model checkpoint (stored using Git LFS)
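
Because the checkpoint in pretrained/ is stored with Git LFS, a clone made without LFS support leaves small text pointer files in place of the real weights. The following is a minimal sketch, not part of this repository, for detecting that situation before loading a model. The helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical; the only assumption about Git LFS is that pointer files begin with the standard `version https://git-lfs.github.com/spec/v1` line.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts with the LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(directory):
    """Return sorted paths of files under `directory` that are still LFS pointers."""
    return sorted(
        str(p)
        for p in Path(directory).rglob("*")
        if p.is_file() and is_lfs_pointer(p)
    )
```

Calling `find_lfs_pointers("pretrained")` from the repository root returns an empty list when the real checkpoint files are present. A non-empty result suggests the LFS objects have not been downloaded yet, in which case running `git lfs pull` inside the clone normally fetches them.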

Each component has its own README file with further details.

Citation

If you are using our code or model, please cite the following paper:

@article{sun2025accelerating,
  title={Accelerating protein engineering with fitness landscape modelling and reinforcement learning},
  author={Sun, Haoran and He, Liang and Deng, Pan and Liu, Guoqing and Zhao, Zhiyu and Jiang, Yuliang and Cao, Chuan and Ju, Fusong and Wu, Lijun and Liu, Haiguang and others},
  journal={Nature Machine Intelligence},
  pages={1--15},
  year={2025},
  publisher={Nature Publishing Group UK London}
}

License

This repository is licensed under the MIT License.


Contact

For questions or collaborations, please contact the authors via email or open an issue in this repository.


About

Official implementation of μProtein: accelerating protein engineering with fitness landscape modeling and reinforcement learning.
