emiliacb/voice-agent

Voice Agent

A web-based voice agent application with a frontend interface and backend processing capabilities.

https://voice-agent-front.onrender.com/

Project Structure

voice-agent/
├── frontend/
│   ├── index.html
│   ├── script.js
│   ├── style.css
│   └── package.json
├── backend/
│   ├── index.mjs
│   ├── Dockerfile
│   ├── dev.sh
│   └── package.json
└── .gitignore

Architecture

The application consists of three main components:

1) Frontend (/frontend)

  • Vite-based vanilla web application
  • Pure CSS animations for lip-sync visualization
  • Real-time audio processing with Web Audio API
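
As a rough illustration of how lip-sync data could drive the CSS animation, the sketch below looks up the mouth shape that is active at a given playback time. It assumes cues in the shape produced by Rhubarb Lip Sync's JSON output (`start`, `end`, `value`). The function name `mouthShapeAt` and the sample data are hypothetical and are not taken from `script.js`.

```javascript
// Sample cues in the shape of Rhubarb Lip Sync's JSON output (illustrative data).
const sampleCues = [
  { start: 0.0, end: 0.25, value: "X" },
  { start: 0.25, end: 0.6, value: "B" },
  { start: 0.6, end: 1.1, value: "D" },
];

// Return the mouth shape active at `timeSec`. Falls back to "X" (Rhubarb's
// idle shape) when no cue covers that time. Assumes cues are sorted by start
// time and do not overlap, so a binary search is enough.
function mouthShapeAt(mouthCues, timeSec) {
  let lo = 0;
  let hi = mouthCues.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const cue = mouthCues[mid];
    if (timeSec < cue.start) {
      hi = mid - 1;
    } else if (timeSec >= cue.end) {
      lo = mid + 1;
    } else {
      return cue.value;
    }
  }
  return "X";
}
```

In the browser, one plausible way to use this is to call it once per animation frame with the audio element's `currentTime` and write the result to a `data-` attribute that the CSS selects on. Whether the project does it this way is an assumption.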

2) Backend (/backend)

  • Hono.js lightweight web framework
  • Docker for easy deployment
  • Rate limiting keyed by client IP and route
  • CORS configuration for secure cross-origin requests
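
To make the idea of IP and route-based rate limiting concrete, here is a minimal fixed-window limiter in plain JavaScript. It is a sketch only: the name `createRateLimiter`, the `limit` and `windowMs` options, and the in-memory `Map` are assumptions for illustration and do not describe what `index.mjs` actually does.

```javascript
// Create a fixed-window rate limiter. Each (ip, route) pair gets its own
// counter, which resets once `windowMs` milliseconds have elapsed.
function createRateLimiter({ limit, windowMs }) {
  const windows = new Map(); // key -> { startedAt, count }

  // Returns true if the request is allowed, false if it exceeds the limit.
  // `now` is injectable so the behavior can be checked without real timers.
  return function isAllowed(ip, route, now = Date.now()) {
    const key = `${ip}|${route}`;
    const current = windows.get(key);

    // First request for this key, or the previous window has expired.
    if (!current || now - current.startedAt >= windowMs) {
      windows.set(key, { startedAt: now, count: 1 });
      return true;
    }

    current.count += 1;
    return current.count <= limit;
  };
}
```

In a Hono app, a check like this would typically live in middleware registered with `app.use`, answering with HTTP 429 when the check fails. An in-memory map only works for a single process; a multi-instance deployment would need shared storage.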

3) Rhubarb (Custom Replicate Service)

  • A Replicate/Cog model that provides automatic lip synchronization analysis using Rhubarb Lip Sync by Daniel Wolf
  • Source: https://github.com/emiliacb/replicate-rhubarb
  • Automatic Lip Sync Analysis: Generates mouth cue data from audio input
  • Multiple Audio Format Support: Handles MP3, WAV, and other common audio formats
  • Chunked Processing: Automatically splits long audio files into manageable chunks
  • JSON Output: Returns structured mouth cue data in JSON format
  • Phonetic Recognition: Uses phonetic recognition for accurate lip sync
  • Cloud-Ready: Deployed on Replicate for easy API access
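
The chunked-processing idea can be sketched as two small steps: plan fixed-length time windows over the audio, then shift each chunk's cue timestamps by its offset when merging the results. How the Replicate model actually splits and recombines audio is not described here, so the function names, the `offsetSec` field, and the chunk length below are all hypothetical.

```javascript
// Split a total duration into consecutive [start, end) windows of at most
// `chunkSec` seconds. The last window may be shorter.
function planChunks(totalSec, chunkSec) {
  const chunks = [];
  for (let start = 0; start < totalSec; start += chunkSec) {
    chunks.push({ start, end: Math.min(start + chunkSec, totalSec) });
  }
  return chunks;
}

// Merge per-chunk results into one timeline. Each chunk's cues are relative
// to the start of that chunk, so they are shifted by `offsetSec`.
function mergeChunkCues(results) {
  return results.flatMap(({ offsetSec, mouthCues }) =>
    mouthCues.map((cue) => ({
      start: cue.start + offsetSec,
      end: cue.end + offsetSec,
      value: cue.value,
    }))
  );
}
```

For example, `planChunks(25, 10)` yields three windows covering 0 to 10, 10 to 20, and 20 to 25 seconds. A cue that starts at 0 in the second chunk ends up starting at 10 in the merged timeline.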

Local Development

Requirements

  • Node.js (v18+)
  • Docker (v20+)

Frontend

cd frontend
npm install
npm run dev

Backend

The backend includes a development script, dev.sh, that automatically rebuilds and restarts the container when you change the backend code:

cd backend

# Make the script executable (first time only)
chmod +x dev.sh

# Start development mode
./dev.sh

This will:

  • Build the Docker container
  • Start it in the foreground showing all logs
  • Automatically rebuild and restart when you make changes
  • Make the service available at http://localhost:8080

License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This means:

  • ✅ You are free to share and adapt this work for non-commercial purposes
  • ✅ You must give appropriate credit
  • ✅ You must share any derivative works under the same license
  • ❌ You cannot use this work for commercial purposes

For more information, see the Creative Commons BY-NC-SA 4.0 license: https://creativecommons.org/licenses/by-nc-sa/4.0/
