A Python script that performs speech diarization (working out who spoke when) in the current REAPER project, automatically splitting a single mixed recording onto new, colour-coded tracks, one per unique speaker.
This script uses the senko library for diarization and reapy to interact with REAPER.
- REAPER Integration: Automatically connects to a running REAPER instance and targets the first media item on the first track.
- Format Check & Conversion: Uses ffmpeg-python to check whether the source audio is in the required 16 kHz, mono, 16-bit PCM WAV format. If not, it converts it to a temporary file before processing, ensuring compatibility.
- Individual Tracks: Creates a new, unique track for every detected speaker and assigns a distinct colour.
- Item Splitting: Splits the original media item and moves the resulting segments to the corresponding speaker tracks.
- Caching: Supports saving and loading diarization results to a JSON file to avoid re-running the heavy diarization process.
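The track-per-speaker and item-splitting features above come down to a little bookkeeping over the diarization output. The following is a minimal sketch of that bookkeeping only, not the script's actual implementation: the segment layout (dicts with `speaker`, `start`, `end`) and the helper names `group_by_speaker`, `distinct_colours`, and `split_points` are assumptions made for illustration, and no REAPER calls are made.

```python
import colorsys
from collections import defaultdict


def group_by_speaker(segments):
    """Map each speaker label to its (start, end) spans, sorted by start time."""
    grouped = defaultdict(list)
    for seg in segments:
        grouped[seg["speaker"]].append((seg["start"], seg["end"]))
    return {speaker: sorted(spans) for speaker, spans in sorted(grouped.items())}


def distinct_colours(n):
    """Return n RGB tuples (0-255) spaced evenly around the hue wheel."""
    colours = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.6, 0.9)
        colours.append((round(r * 255), round(g * 255), round(b * 255)))
    return colours


def split_points(segments):
    """Sorted, de-duplicated times at which the original item would be cut."""
    return sorted({t for seg in segments for t in (seg["start"], seg["end"])})


# Hypothetical diarization output (times in seconds).
segments = [
    {"speaker": "SPEAKER_01", "start": 0.0, "end": 4.2},
    {"speaker": "SPEAKER_02", "start": 4.2, "end": 9.8},
    {"speaker": "SPEAKER_01", "start": 9.8, "end": 12.5},
]

tracks = group_by_speaker(segments)                      # one entry per new track
colours = dict(zip(tracks, distinct_colours(len(tracks))))  # one colour per speaker
cuts = split_points(segments)                            # where the item is split
```

Spacing hues evenly is one simple way to keep track colours visually distinct however many speakers are detected; the real script may choose its colours differently.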
- REAPER, with reapy set up
- uv for Python virtual environment management
- ffmpeg installed in your $PATH
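ffmpeg has to be on the `$PATH` because the conversion step calls the ffmpeg binary. As a rough, standard-library-only illustration of the 16 kHz, mono, 16-bit PCM check, the sketch below uses Python's `wave` module rather than ffmpeg-python (which is what the script itself uses), and only builds the ffmpeg command line instead of running it. The function names are hypothetical.

```python
import shutil
import wave

TARGET_RATE = 16000    # Hz
TARGET_CHANNELS = 1    # mono
TARGET_WIDTH = 2       # bytes per sample, i.e. 16-bit


def ffmpeg_available():
    """True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


def is_target_wav(path):
    """True if path is an uncompressed PCM WAV at 16 kHz, mono, 16-bit."""
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getframerate() == TARGET_RATE
                and wav.getnchannels() == TARGET_CHANNELS
                and wav.getsampwidth() == TARGET_WIDTH
            )
    except (wave.Error, EOFError, OSError):
        # Not a readable PCM WAV (wrong container, compressed, or missing).
        return False


def conversion_command(src, dst):
    """ffmpeg arguments that resample src to 16 kHz mono 16-bit PCM at dst."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", src,
        "-ar", str(TARGET_RATE),
        "-ac", str(TARGET_CHANNELS),
        "-c:a", "pcm_s16le",   # signed 16-bit little-endian PCM
        dst,
    ]
```

A caller could pass the result of `conversion_command(...)` to `subprocess.run(..., check=True)` when `is_target_wav` returns `False`, writing to a temporary file so the original media is left untouched.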
- Clone the Repository:
      git clone https://github.com/atmosfar/reaper_speech_diarizer.git
      cd reaper_speech_diarizer
- Create and Activate Virtual Environment:
      uv venv --python 3.11.13
      source .venv/bin/activate
- Install Dependencies:
      uv pip install -r requirements.txt
The script must be run from your terminal/command line outside of REAPER, but while a REAPER project is open. It will automatically connect to REAPER using the reapy library.
1. Prepare Your REAPER Project:
- Ensure the audio you want to diarize is the first item on the first track of your active REAPER project.
- Save your REAPER project.
2. Run the Script:
The primary script is named reaper_speech_diarizer.py.
- To Run Full Diarization:
      python reaper_speech_diarizer.py
- To Run and Save Results to a JSON Cache:
      python reaper_speech_diarizer.py --json_output results/my_diarization_cache.json
- To Load Results from a JSON Cache (skips diarization):
      python reaper_speech_diarizer.py --json_input results/my_diarization_cache.json
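The two cache flags can be pictured as a thin wrapper around the expensive diarization call. This is a sketch under assumptions rather than the script's real code: only the flag names `--json_output` and `--json_input` come from the commands above, while the JSON layout (a list of segment dicts) and the helpers `save_cache`, `load_cache`, and `get_segments` are hypothetical.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Diarize the first REAPER item.")
    parser.add_argument("--json_output", help="write diarization results to this JSON file")
    parser.add_argument("--json_input", help="load results from this JSON file and skip diarization")
    return parser.parse_args(argv)


def save_cache(segments, path):
    """Write segments as JSON, creating parent folders such as results/ if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(segments, indent=2), encoding="utf-8")


def load_cache(path):
    """Read segments back from a JSON cache file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def get_segments(args, diarize):
    """Return cached segments if --json_input was given, else run diarize()."""
    if args.json_input:
        return load_cache(args.json_input)
    segments = diarize()  # the slow step; skipped entirely on a cache hit
    if args.json_output:
        save_cache(segments, args.json_output)
    return segments
```

Passing the diarization step in as a callable keeps the cache logic independent of the diarization library, so a cached run never has to load the model at all.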
