Browser-based AI Speaking Practice
Speako is a local-first application for practicing exam-style English speaking tests. It prioritizes user privacy, low latency, and a premium user experience by running AI models directly in your browser.
- Privacy First: Voice data is processed locally on your device using Transformers.js.
- Premium Design: A beautiful, distraction-free "Dark Glass" interface built with Pure CSS.
- Smart Analysis:
  - CEFR Level Detection: ML-powered proficiency assessment using a fine-tuned DeBERTa model (robg/speako-cefr-deberta).
  - Grammar Check: Detects hedging, passive voice, and weak vocabulary.
  - Clarity Score: Real-time evaluation of speaking clarity.
  - Positive Reinforcement: Highlights strong vocabulary usage.
- Ultra-Low Latency: Instant feedback without server round-trips.
- WebGPU Optimized: Uses hardware acceleration for fast in-browser inference, with automatic WASM fallback.
- PWA Support: Installable as a Progressive Web App with offline model caching.
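
The WebGPU-with-WASM-fallback behaviour listed above could be sketched roughly as follows. This is a minimal illustration, not Speako's actual code: `selectDevice`, `InferenceDevice`, and `GpuHost` are hypothetical names, and the only browser API assumed is `navigator.gpu.requestAdapter()`.

```typescript
// Hypothetical helper: names and types are illustrative, not Speako's real API.
type InferenceDevice = "webgpu" | "wasm";

// Minimal structural type for the part of `navigator` this sketch needs.
interface GpuHost {
  gpu?: { requestAdapter(): Promise<unknown | null> };
}

// Resolve to "webgpu" only when an adapter is actually available;
// otherwise (no API, no adapter, or an error) fall back to "wasm".
async function selectDevice(host: GpuHost | undefined): Promise<InferenceDevice> {
  if (!host?.gpu) return "wasm";
  try {
    const adapter = await host.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    return "wasm";
  }
}
```

In a browser, the caller would pass `navigator` and could then hand the result to the model loader as its device setting. Taking the host as a parameter keeps the function easy to test outside a browser.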
Speako is a pure frontend application with no backend server.
- Frontend: Vite + Preact + TypeScript
- Styling: Zero-dependency Pure CSS
- AI Models:
  - Speech Recognition: Xenova/whisper-base (running locally via ONNX)
  - CEFR Classification: robg/speako-cefr-deberta (fine-tuned DeBERTa)
- NLP: Compromise for grammar analysis
- State Management: Preact Signals for high-performance reactivity
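
To give a feel for what the hedging part of the grammar analysis might involve, here is a minimal sketch. It deliberately uses a plain regular-expression scan as a stand-in rather than Compromise. The word list, `findHedges`, and `HedgeMatch` are all assumptions made for illustration; the phrases Speako actually flags are not documented here.

```typescript
// Illustrative word list only; not taken from Speako's source.
const HEDGES = ["maybe", "perhaps", "sort of", "kind of", "i think", "i guess"];

interface HedgeMatch {
  phrase: string; // the hedge phrase from HEDGES (lowercase)
  index: number;  // character offset of the match in the input text
}

// Scan the transcript for each hedge phrase, case-insensitively,
// and return all matches ordered by position.
function findHedges(text: string): HedgeMatch[] {
  const matches: HedgeMatch[] = [];
  for (const phrase of HEDGES) {
    // \b keeps "maybe" from matching inside a longer word.
    const pattern = new RegExp(`\\b${phrase}\\b`, "gi");
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(text)) !== null) {
      matches.push({ phrase, index: m.index });
    }
  }
  return matches.sort((a, b) => a.index - b.index);
}
```

A real implementation would likely rely on part-of-speech information from the NLP library rather than a fixed phrase list, which is what makes passive-voice detection feasible as well.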
speako/
├── src/
│   ├── components/                 # UI components (split by feature)
│   │   ├── session/                # Recording session components
│   │   └── validation/             # Validation interface components
│   ├── hooks/                      # Custom hooks (useSessionManager, useValidation, etc.)
│   ├── logic/                      # Pure TS business logic
│   │   ├── local-transcriber.ts    # Whisper integration
│   │   ├── model-loader.ts         # Model singleton with WebGPU/WASM
│   │   ├── cefr-classifier.ts      # CEFR ML prediction
│   │   ├── grammar-checker.ts      # Grammar analysis
│   │   └── metrics-calculator.ts   # Speaking metrics
│   └── types/                      # TypeScript type definitions
├── ml/                             # CEFR classifier training scripts
├── scripts/                        # Helper scripts
└── public/                         # Static assets and local models
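
The "model singleton" role noted for model-loader.ts in the tree above can be illustrated with a small, generic lazy-loading helper. This is a sketch under assumptions, not the contents of that file: `createSingletonLoader` is a hypothetical name, and the `load` callback stands in for whatever actually downloads and initialises a model.

```typescript
// Hypothetical helper; not taken from Speako's model-loader.ts.
// Wraps an expensive async loader so it runs at most once per successful load.
function createSingletonLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      // Cache the promise itself so concurrent callers share a single load.
      cached = load().catch((err) => {
        cached = undefined; // clear the cache so a failed load can be retried
        throw err;
      });
    }
    return cached;
  };
}
```

Caching the promise rather than the resolved value matters when several components request the model at the same time during startup: every caller awaits the same in-flight load instead of triggering a second download.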
- Node.js 20+ (check with node -v)
- Python 3.11+ with uv for ML training (optional)
# Install dependencies
npm install
# Start development server
npm run dev

Open http://localhost:5173.
| Script | Description |
|---|---|
| npm run dev | Start development server |
| npm run build | Build for production |
| npm run preview | Preview production build |
| npm run test | Run unit tests |
| npm run lint | Run ESLint |
| npm run format | Format code with Prettier |
| npm run prepare:models | Download models locally for offline testing |
| npm run prepare:data | Convert corpus audio to WAV for validation |
| npm run cefr:verify | Verify CEFR model is working |
| npm run deploy | Build and deploy to Cloudflare Pages |
For testing with real L2 learner audio, we use the Speak & Improve Corpus 2025 from Cambridge University Press & Assessment.
- Visit ELiT Datasets - Speak & Improve Corpus 2025
- Complete the free registration and accept the license
- Download and extract sandi-corpus-2025.zip
The audio files are hosted separately on S3. Download the dev set (smaller, for testing):
cd /path/to/sandi-corpus-2025
mkdir -p data && cd data
# Dev set (~2.7GB total)
curl -LO "https://speak-and-improve-corpus-2025.s3.eu-west-1.amazonaws.com/audio/data.flac.dev.01.zip"
curl -LO "https://speak-and-improve-corpus-2025.s3.eu-west-1.amazonaws.com/audio/data.flac.dev.02.zip"
# Unzip into data/flac/dev/
unzip data.flac.dev.01.zip
unzip data.flac.dev.02.zip

cd /path/to/speako
ln -s /path/to/sandi-corpus-2025 ./test-data

# Requires ffmpeg: brew install ffmpeg
npm run prepare:data

| Property | Value |
|---|---|
| Duration | ~315 hours of L2 learner audio |
| Format | 16kHz FLAC |
| CEFR Levels | A2–C1 |
| Manual Transcriptions | ~55 hours with disfluency annotations |
| License | Non-commercial research only |
Caution
Do not share the corpus publicly or include it in any repository. See the license agreement for full terms.
Validation is performed through the web interface:
- Start the development server: npm run dev
- Navigate to http://localhost:5173/#validate
- Use the validation controls to run tests on the corpus
Results are saved to validation-results.json.
For information on training the CEFR classifier, see docs/ml.md.
Note
The CEFR model is trained on UniversalCEFR (CC-BY-NC-4.0) to ensure license compliance. The S&I Corpus is used for validation only.
See AGENTS.md for coding standards and agent instructions.
To build for production:
npm run build

This produces a static output in dist/ which can be deployed to any static host (Cloudflare Pages, Vercel, Netlify).
To build and deploy to Cloudflare Pages:
npm run deploy

- Transformers.js - Run Transformers in the browser
- Preact - Fast 3kB React alternative
- Vite - Next Generation Frontend Tooling
- Compromise - Modest natural-language processing
- Xenova/whisper-base - Speech recognition model
- robg/speako-cefr-deberta - CEFR classification model
- WebGPU Implementation Status - Browser support tracker
- WebGPU Explainer - Introduction to WebGPU
- Speak & Improve Corpus 2025 - L2 learner speech corpus
- Corpus Paper (DOI) - Academic citation
MIT
