Roughly 30% of patients visiting primary care are estimated to have musculoskeletal (MSK) complaints. Many of these patients are referred to the rheumatology outpatient clinic by their first healthcare provider, the general practitioner (GP). Timely and appropriate care is crucial for the prognosis of these patients, but the type of care needed depends on the diagnosis. We aimed to optimize the triage procedure by automatically screening and prioritizing patients with AI methods. Using only the contents of the referral letters, we developed machine learning (ML) models to identify rheumatoid arthritis (RA), osteoarthritis (OA), fibromyalgia syndrome (FMS), and patients needing chronic care (>3 months).
For more information see our article published in npj Digital Medicine: https://www.nature.com/articles/s41746-025-01495-4
Prerequisite: Install Anaconda with Python version 3.6+. This also installs the Anaconda Prompt, which you can find via the Windows search bar. Use the Anaconda Prompt to run the commands below.
Prerequisite: conda environment (with Jupyter Notebook). Use the terminal to run the commands below.
Install Jupyter Notebook:
$ conda install -c anaconda notebook
Before running, please install the dependencies.
Prerequisite: conda3
$ bash build_kernel.sh
Prerequisite: pip
$ pip install -r requirements.txt
For Dutch rheumatologists we have made a separate GitHub repository to facilitate deployment (which also features a Dutch user manual): https://github.com/levrex/implement_triage_agent/tree/main
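After installing the dependencies, it can help to confirm that the pinned packages are actually present in the active environment. The sketch below is not part of this repository: `parse_requirements` and `check_installed` are hypothetical helper names, it assumes `requirements.txt` contains simple `name==version` pins, and it relies on `importlib.metadata`, which needs Python 3.8 or newer.

```python
from importlib import metadata


def parse_requirements(text):
    """Return (name, pinned_version) pairs from requirements-style text.

    Assumes simple 'name==version' lines (an assumption about the file);
    names without a pin get version None. Comments and blank lines are skipped.
    """
    pins = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        name, sep, version = line.partition("==")
        pins.append((name.strip(), version.strip() if sep else None))
    return pins


def check_installed(pins):
    """Map each package name to 'ok', 'missing', or a version-mismatch note."""
    report = {}
    for name, wanted in pins:
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if wanted in (None, found):
            report[name] = "ok"
        else:
            report[name] = f"installed {found}, pinned {wanted}"
    return report
```

To use it, read `requirements.txt` into a string, pass it to `parse_requirements`, and print the dictionary returned by `check_installed`. Any entry other than `ok` suggests rerunning the install step in the environment that Jupyter uses.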
General study flow showing the different steps of our approach.
If you use this pipeline, please cite:
Maarseveen, T.D., Glas, H.K., Veris-van Dieren, J. et al. Improving musculoskeletal care with AI enhanced triage through data driven screening of referral letters. npj Digit. Med. 8, 98 (2025). https://doi.org/10.1038/s41746-025-01495-4
README.md: This file
requirements.txt: Prerequisite python modules (with version numbers) to run code
figures/md: Figures used for the readme
figures/tuning: Interactive HTML figures for hyperparameter tuning (generated by optuna)
models/*: Contains the final models for extracting diagnoses from referral letter. Disclaimer: these are only usable on Dutch data.
models/tfidf/: Contains the different vectorizers for preprocessing the Dutch referral letters
models/xgb/: Contains the different XGB-classification models for extracting the different diagnoses (gold standard: ICD-codes)
models/llm_update_250915/: Contains the different classification models retrained (gold standard: Physician conclusions extracted by LLM)
src/*: Code base for the project
src/1_visualize_the_data.ipynb: Read and clean the referral letter data
src/2_baseline_analysis.ipynb: Get insight into the baseline characteristics & available data
src/3_process_referral_letters.ipynb: Notebook to process & vectorize the referral letters (applies BERTje to redact names)
src/4_predict_chronic_disease.ipynb: Create the classification model for prediction of chronic disease (>6 months follow-up)
src/4_predict_Fibromyalgia_syndrome.ipynb: Create the classification model for prediction of fibromyalgia
src/4_predict_Osteoarthritis_disease.ipynb: Create the classification model for prediction of osteoarthritis
src/4_predict_RA_disease.ipynb: Create the classification model for prediction of rheumatoid arthritis
src/5_evaluation.ipynb: Evaluate performance across different centers & get insight into prioritisation qualities for triaging
src/EmployNER.py: Employ the Dutch transformer (BERTje)
src/Functions.py: Extra functions for preprocessing & visualization
src/Hyperparameter_tuning.py: Script for tuning the classification models (Bayesian optimization with Optuna)
src/StudyRx.py: Script for processing medication information (unused)
suppl/*: Supplementary files for the project
build_conda_env.sh: Example batch script to set up the required conda environment for this project
build_kernel.sh: Example batch script to set up the required Python kernel (for Jupyter Notebook) for this project
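The evaluation notebook listed above (`src/5_evaluation.ipynb`) looks at how well the models prioritise referrals for triage. As a rough illustration of that idea only, and not the code used in the notebooks, the sketch below ranks referrals by a model's predicted probability and reports which fraction of the true cases lands in the top k. The function names, referral IDs, scores, and labels are all made up for the example.

```python
def rank_referrals(scores):
    """Sort referral IDs from highest to lowest predicted probability."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [referral_id for referral_id, _ in ordered]


def recall_at_k(ranking, positives, k):
    """Fraction of true positives found among the first k ranked referrals."""
    if not positives:
        raise ValueError("positives must not be empty")
    top_k = set(ranking[:k])
    return len(top_k & set(positives)) / len(positives)


# Toy data (hypothetical): predicted probability of RA per referral letter.
scores = {"ref_a": 0.91, "ref_b": 0.12, "ref_c": 0.67, "ref_d": 0.05, "ref_e": 0.48}
confirmed_ra = {"ref_a", "ref_e"}  # made-up gold-standard labels

ranking = rank_referrals(scores)
print(ranking)                                   # referral IDs, most urgent first
print(recall_at_k(ranking, confirmed_ra, k=2))   # share of RA cases in the top 2
```

In practice the scores would come from one of the trained classifiers applied to vectorized referral letters, and k would reflect how many patients the clinic can see first.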
If you experience difficulties implementing the pipeline or have any other questions, feel free to email me at t.d.maarseveen@lumc.nl