A fast, local pronunciation and speech synthesis assistant powered by piper-phonemize, optimized for language learners and anyone looking to improve their spoken fluency across dozens of languages. This GUI application supports IPA phoneme visualization and real-time audio playback using neural voice models.
- Text-to-IPA phonemization using eSpeak-ng
- Real-time audio synthesis with neural voice models (ONNX format)
- Adjustable length scale (speaking speed) and volume
- Multi-language support with auto-discovered voices
- Simple and responsive GUI built with wxWidgets
- Runs fully offline on Windows and Linux
- Download a voice model (see Voices below)
- Place both .onnx and .onnx.json files into the models/ directory
- Run the app, choose a model, type a phrase, and click Pronounce
Welcome to the world of speech synthesis!
→ wˈɛlkʌm tə ðə wˈɜːld ʌv spˈiːtʃ sˈɪnθəsˌɪs
Then listen to the synthesized audio with realistic pronunciation.
This application uses the same voices as the Piper TTS project, trained with VITS and exported for ONNX Runtime.
- Arabic (ar_JO)
- Catalan (ca_ES)
- Czech (cs_CZ)
- Welsh (cy_GB)
- Danish (da_DK)
- German (de_DE)
- Greek (el_GR)
- English (en_GB, en_US)
- Spanish (es_ES, es_MX)
- Finnish (fi_FI)
- French (fr_FR)
- Hungarian (hu_HU)
- Icelandic (is_IS)
- Italian (it_IT)
- Georgian (ka_GE)
- Kazakh (kk_KZ)
- Luxembourgish (lb_LU)
- Nepali (ne_NP)
- Dutch (nl_BE, nl_NL)
- Norwegian (no_NO)
- Polish (pl_PL)
- Portuguese (pt_BR, pt_PT)
- Romanian (ro_RO)
- Russian (ru_RU)
- Serbian (sr_RS)
- Swedish (sv_SE)
- Swahili (sw_CD)
- Turkish (tr_TR)
- Ukrainian (uk_UA)
- Vietnamese (vi_VN)
- Chinese (zh_CN)
📁 Each voice requires:
- A .onnx file (neural model)
- A corresponding .onnx.json configuration file
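The pairing rule above can be checked with a short script before launching the app. This is a minimal sketch in Python, not the application's own discovery code: the function name discover_voices is hypothetical, and it assumes the config file is named by appending .json to the model file name.

```python
from pathlib import Path


def discover_voices(models_dir):
    """Return sorted voice names that have both a .onnx and a .onnx.json file.

    Hypothetical helper: assumes the config sits next to the model and is
    named "<model file name>.json".
    """
    voices = []
    for model_path in sorted(Path(models_dir).glob("*.onnx")):
        # "voice.onnx" -> "voice.onnx.json"
        config_path = model_path.with_name(model_path.name + ".json")
        if config_path.is_file():
            voices.append(model_path.stem)
    return voices
```

Calling discover_voices("models") lists only the voices that are complete. A model without its JSON file is skipped, which makes a missing download easy to spot.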
You can download voices from the Piper Voices repository or see the VOICES.md list.
Review the MODEL_CARD file for each voice before use.
Download prebuilt releases from the Releases page, or build from source:
- Clone this repo
- Run cmake first, and then make, or open the project in your IDE of choice
Your voice models should be placed in a models/ folder next to the binary.
After launching the app:
- Select a language model from the dropdown
- Enter any word or phrase in the text box
- Click Look up to see the IPA symbols
- Click Pronounce to hear the voice synthesis
- Adjust speed and volume using the sliders
All synthesis and audio playback happen locally and offline.
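To illustrate what the two sliders control, here is a rough sketch in Python. It is not the app's implementation. It assumes the synthesized audio is 16-bit PCM and that the speed slider maps to the length scale by a simple inverse. Both function names are hypothetical. In VITS-style models a larger length scale stretches phoneme durations, so speech slows down.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def scale_volume(samples, volume):
    """Multiply each 16-bit sample by `volume`, clipping to the int16 range."""
    scaled = []
    for sample in samples:
        value = int(round(sample * volume))
        # Clip so loud passages saturate instead of wrapping around.
        scaled.append(max(INT16_MIN, min(INT16_MAX, value)))
    return scaled


def speed_to_length_scale(speed):
    """Map a speed multiplier to a length scale (assumed inverse relation)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return 1.0 / speed
```

Under these assumptions, a speed of 2.0 gives a length scale of 0.5, and a volume above 1.0 can clip loud samples at the int16 limits.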
- GUI: wxWidgets
- TTS Engine: Piper (ONNX Runtime, VITS)
- Phonemizer: espeak-ng via piper-phonemize
- Audio Output: PortAudio
- Language Support: Multilingual, via ONNX+JSON model pairs
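The JSON half of each model pair carries the settings the engine needs at load time. The sketch below, in Python, reads a few of them. The key names (audio.sample_rate, espeak.voice, inference.length_scale) are assumptions based on typical Piper voice configs and should be checked against the actual file. load_voice_settings is a hypothetical helper, not part of this app.

```python
import json
from pathlib import Path


def load_voice_settings(config_path):
    """Read a few settings from a voice's .onnx.json file.

    Key names are assumed, so missing keys fall back to None or a default.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return {
        "sample_rate": config.get("audio", {}).get("sample_rate"),
        "espeak_voice": config.get("espeak", {}).get("voice"),
        "length_scale": config.get("inference", {}).get("length_scale", 1.0),
    }
```

Using .get with defaults keeps the helper tolerant of configs that omit a section.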
This app is open source and intended for educational, research, and personal language learning use. See LICENSE for more information.

