vic4code/meeting-assistant

Realtime Meeting Assistant

A local-first, real-time meeting transcription and summarization agent.

This project aims to build an assistant that listens to live conversations, transcribes them, detects trigger phrases (like "Hey Assistant"), and dynamically dispatches tasks such as summarization and post-transcription correction. The goal is for everything to run on your own machine, without relying on cloud services.


🚀 Project Goals

  • 🎙️ Real-time speech-to-text (multi-language, cross-lingual)
  • 🧠 Agent framework to dispatch tasks dynamically
  • 🗣️ Wake word detection ("Hey Assistant") to trigger specific actions
  • ✍️ Summarization and post-transcription correction
  • 💬 Seamless real-time text input into applications (via Chrome extension or native app)

📈 Development Stages

Phase 1 - Core MVP

  • Set up real-time transcription using RealtimeSTT or WhisperLive.
  • Build an agent that:
    • Receives live transcriptions
    • Detects trigger keywords
    • Buffers conversations
    • Calls summarization or correction functions
  • Output summarized or corrected text to console.
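
The agent loop described above can be sketched roughly as follows. This is a minimal illustration and not the code in agent/agent.py: the class name `MeetingAgent`, the `feed` method, the regular expression for the wake phrase, and the placeholder `naive_summary` handler are all assumptions made for the example. It assumes the ASR engine hands over finalized text segments one at a time.

```python
import re
from collections import deque
from typing import Callable, Deque, List, Optional

# Hypothetical wake phrase based on the README's example; case and a comma are tolerated.
WAKE_PATTERN = re.compile(r"\bhey[\s,]+assistant\b", re.IGNORECASE)


class MeetingAgent:
    """Buffers transcript segments and reacts when the wake phrase appears."""

    def __init__(self, on_trigger: Callable[[List[str], str], str], max_segments: int = 200):
        # A bounded deque keeps memory flat during long meetings.
        self.buffer: Deque[str] = deque(maxlen=max_segments)
        self.on_trigger = on_trigger

    def feed(self, segment: str) -> Optional[str]:
        """Process one finalized segment; return the handler result if triggered."""
        match = WAKE_PATTERN.search(segment)
        if match is None:
            text = segment.strip()
            if text:
                self.buffer.append(text)
            return None
        # Text spoken before the wake phrase still belongs to the conversation.
        before = segment[:match.start()].strip()
        if before:
            self.buffer.append(before)
        # Whatever follows the wake phrase is treated as the command.
        command = segment[match.end():].strip(" ,.!?")
        return self.on_trigger(list(self.buffer), command)


def naive_summary(history: List[str], command: str) -> str:
    """Placeholder for a real summarizer: echoes the last three segments."""
    return f"[{command or 'no command'}] " + " / ".join(history[-3:])


agent = MeetingAgent(on_trigger=naive_summary)
for line in ["We ship on Friday.", "QA needs two days.", "Hey Assistant, summarize that."]:
    result = agent.feed(line)
    if result is not None:
        print(result)  # [summarize that] We ship on Friday. / QA needs two days.
```

Matching the wake phrase on text keeps the sketch independent of the ASR engine; a dedicated audio wake-word model would be a different design with different trade-offs.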

Phase 2 - Frontend Integration

  • Set up a WebSocket server to broadcast live transcriptions.
  • Build a Chrome extension to:
    • Connect to the local WebSocket server
    • Autofill active input fields with live transcription
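
One way to structure the broadcast side is a small fan-out hub that is independent of the WebSocket library. The sketch below uses only the standard library; `TranscriptHub`, its methods, and the JSON message fields (`type`, `text`, `final`) are hypothetical names chosen for illustration, not the project's actual protocol. The assumption is that each WebSocket connection handler would call `subscribe()`, forward whatever arrives on its queue to the client, and call `unsubscribe()` on disconnect.

```python
import asyncio
import json
from typing import Set


class TranscriptHub:
    """Fans each transcript message out to every subscribed client queue."""

    def __init__(self, max_pending: int = 100):
        self._queues: Set[asyncio.Queue] = set()
        self._max_pending = max_pending

    def subscribe(self) -> asyncio.Queue:
        """Create a bounded queue for one client connection."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_pending)
        self._queues.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._queues.discard(queue)

    def publish(self, text: str, is_final: bool) -> int:
        """Queue one message for every subscriber; return how many received it."""
        message = json.dumps({"type": "transcript", "text": text, "final": is_final})
        delivered = 0
        for queue in list(self._queues):
            try:
                queue.put_nowait(message)
                delivered += 1
            except asyncio.QueueFull:
                # A client that cannot keep up is dropped instead of stalling the rest.
                self._queues.discard(queue)
        return delivered


async def demo() -> list:
    hub = TranscriptHub()
    first, second = hub.subscribe(), hub.subscribe()
    hub.publish("hello everyone", is_final=True)
    return [json.loads(await first.get()), json.loads(await second.get())]


received = asyncio.run(demo())
print(received)
```

Keeping a per-client queue means a slow browser tab cannot block the transcription loop; the `final` flag is there so a client can tell interim text from text that will no longer change, assuming the ASR engine exposes that distinction.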

Phase 3 - Post-processing Enhancements

  • Integrate local LLMs (e.g., Mistral 7B, OpenHermes) for summarization.
  • Implement post-transcription correction (grammar, spelling).
  • Fine-tune summarization prompts for meeting notes.
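
As a starting point for prompt tuning, a prompt builder might look like the sketch below. The template wording, the function name `build_summary_prompt`, and the character budget are assumptions for illustration; the right limits depend on the context window of whichever local model is used. The function only builds the prompt string and does not call any model.

```python
from typing import List

# Hypothetical prompt wording; tune it against the model you actually run.
SUMMARY_TEMPLATE = """You are taking notes for a meeting.
Summarize the transcript below in at most {max_bullets} bullet points.
List decisions first, then action items with an owner if one was named.
Do not add facts that are not in the transcript.

Transcript:
{transcript}

Notes:"""


def build_summary_prompt(segments: List[str], max_bullets: int = 5, max_chars: int = 6000) -> str:
    """Join segments into a prompt, keeping only the most recent text that fits."""
    kept: List[str] = []
    used = 0
    # Walk backwards so the newest segments win when the budget is tight.
    for segment in reversed(segments):
        cost = len(segment) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        kept.append(segment)
        used += cost
    transcript = "\n".join(reversed(kept))
    return SUMMARY_TEMPLATE.format(max_bullets=max_bullets, transcript=transcript)


prompt = build_summary_prompt(["Alice: ship Friday.", "Bob: I will update the docs."], max_bullets=3)
print(prompt)
```

Truncating from the oldest end is a simple choice that favors recent discussion; a rolling summary of earlier segments would be an alternative for long meetings.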

Phase 4 - System-level Integration

  • Build a macOS native input method (InputMethodKit) for system-wide text input.
  • Add speaker diarization to separate notes per speaker.
  • Reduce the latency of real-time correction during transcription.
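
For the per-speaker notes item, the grouping step could look roughly like this once a diarization component has labeled each segment. The `Segment` dataclass, the `SPEAKER_00` label format, and `notes_by_speaker` are assumptions for the example; the diarization itself is out of scope here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    speaker: str  # label from a diarization step, e.g. "SPEAKER_00" (assumed format)
    start: float  # seconds from the start of the recording
    text: str


def notes_by_speaker(segments: List[Segment]) -> Dict[str, List[str]]:
    """Group text per speaker, merging consecutive turns by the same speaker."""
    notes: Dict[str, List[str]] = defaultdict(list)
    previous: Optional[str] = None
    # Sort by start time so segments that arrive out of order are still merged correctly.
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.speaker == previous:
            notes[seg.speaker][-1] += " " + seg.text
        else:
            notes[seg.speaker].append(seg.text)
        previous = seg.speaker
    return dict(notes)


segments = [
    Segment("SPEAKER_01", 4.0, "I can take QA."),
    Segment("SPEAKER_00", 0.0, "We ship Friday."),
    Segment("SPEAKER_00", 2.1, "Docs are due Thursday."),
]
print(notes_by_speaker(segments))
# {'SPEAKER_00': ['We ship Friday. Docs are due Thursday.'], 'SPEAKER_01': ['I can take QA.']}
```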

🛠️ Tech Stack

  • ASR Engine: RealtimeSTT / WhisperLive (Whisper-based)
  • Programming Language: Python 3.9+
  • Agent Framework: Lightweight custom agent
  • Frontend: Chrome Extension (Manifest V3, WebSocket client)
  • Optional: OpenAI API (for early summarization tasks)
  • Future: Local LLM (Mistral 7B, OpenHermes, llama.cpp)

📦 Project Structure

realtime-meeting-assistant/
├── agent/                 # Core agent to manage transcription and tasks
│   ├── agent.py
│   └── ws_server.py
├── asr/                   # Setup and documentation for ASR runner
│   └── runner.md
├── frontend/               # Chrome extension for real-time input
│   └── chrome_extension/
├── requirements.txt        # Python dependencies
├── README.md               # Project introduction and guide
└── .gitignore              # Ignore temp files

🛠️ Installation

1. Clone the repo

git clone https://github.com/your-username/realtime-meeting-assistant.git
cd realtime-meeting-assistant

2. Install Python dependencies

pip install -r requirements.txt

3. Start your real-time transcription service

Use RealtimeSTT or WhisperLive, running locally. See asr/runner.md for setup notes.

4. Run the agent

python agent/agent.py

The agent will connect to the transcription server, detect keywords, and dispatch tasks.
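
The dispatch step can be pictured as a keyword-to-handler registry. The sketch below is illustrative only and does not reflect the actual contents of agent/agent.py: the `task` decorator, the `HANDLERS` table, and both stand-in handlers are hypothetical.

```python
from typing import Callable, Dict, List

Handler = Callable[[List[str]], str]
HANDLERS: Dict[str, Handler] = {}


def task(keyword: str) -> Callable[[Handler], Handler]:
    """Register a handler under the keyword that should trigger it."""
    def register(func: Handler) -> Handler:
        HANDLERS[keyword.lower()] = func
        return func
    return register


@task("summarize")
def summarize(history: List[str]) -> str:
    # Stand-in for a real summarizer.
    return f"{len(history)} segments to summarize"


@task("correct")
def correct(history: List[str]) -> str:
    # Stand-in for grammar and spelling correction: tidy whitespace, capitalize the last segment.
    last = " ".join(history[-1].split()) if history else ""
    return last[:1].upper() + last[1:]


def dispatch(command: str, history: List[str]) -> str:
    """Run the first registered handler whose keyword appears in the command."""
    words = [word.strip(".,!?") for word in command.lower().split()]
    for keyword, handler in HANDLERS.items():
        if keyword in words:
            return handler(history)
    return f"unknown command: {command!r}"


print(dispatch("please summarize that", ["We ship Friday.", "QA needs two days."]))
# 2 segments to summarize
```

A registry like this lets new tasks be added without touching the dispatch loop, which fits the goal of dispatching tasks dynamically.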

5. (Optional) Launch Chrome Extension

(Instructions will be added once the frontend MVP is ready.)


🧠 Future Improvements

  • Built-in speaker separation and diarization
  • Low-latency incremental summarization
  • Full offline summarization using local models
  • Mobile device integration
  • Windows/Linux support for system-wide typing

📄 License

This project is licensed under the MIT License.
Feel free to fork, contribute, or modify for personal or commercial use!


🤝 Contribution

Issues and pull requests are welcome!
Please submit detailed bug reports or feature suggestions via GitHub Issues.


Built with ❤️ to make meetings smarter and more efficient.
