Omi is an open-source AI wearable platform designed to capture, transcribe, and analyze conversations and screen data in real time. The ecosystem integrates custom hardware, cross-platform mobile and desktop applications, and a robust cloud backend to transform raw audio and visual data into structured memories, action items, and AI-driven insights (README.md:3-7).
The platform is organized into several key technical domains:
| Component | Location | Technology | Primary Entry Point |
|---|---|---|---|
| Backend | backend/ | Python (FastAPI) | backend/main.py (README.md:112) |
| Mobile App | app/ | Flutter (Dart) | app/lib/main.dart (README.md:111) |
| Desktop App | desktop/ | Swift (macOS) | desktop/Desktop/Sources/DesktopApp.swift (README.md:110) |
| Firmware (Omi) | omi/ | Zephyr RTOS (C) | omi/firmware/omi/src/main.c (omi/firmware/omi/README.md:11-13) |
| Firmware (Glass) | omiGlass/ | ESP32-S3 (Arduino/C++) | omiGlass/firmware/firmware.ino (README.md:114) |
| Web Apps | web/ | Next.js (TS) | web/app/, web/frontend/ (README.md:116-117) |
The core value proposition lies in its 100% open-source nature, allowing developers to customize everything from the PCB layouts and firmware to the LLM processing pipelines and third-party integrations via a modular plugin system (README.md:7-14).
The Omi ecosystem follows a distributed architecture where audio is captured at the edge (wearables, mobile, or desktop), streamed via Bluetooth Low Energy (BLE) or WebSockets to a gateway, and processed by a specialized FastAPI backend.
Diagram: Data Flow from Capture to AI Processing
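As a concrete illustration of the BLE leg of this pipeline, the sketch below splits one encoded audio frame into notification-sized packets with a small sequence header so the receiver can detect drops and reassemble in order. The 160-byte MTU and 2-byte header are illustrative assumptions for this sketch, not Omi's actual wire format.

```python
def chunk_for_ble(frame: bytes, mtu: int = 160) -> list[bytes]:
    """Split one encoded audio frame into BLE-notification-sized packets.

    Each packet is prefixed with a 2-byte little-endian sequence index.
    Packet size and header layout are hypothetical, chosen only to show
    the general shape of edge-side audio packetization.
    """
    payload = mtu - 2  # reserve 2 bytes for the sequence header
    packets = []
    for seq, start in enumerate(range(0, len(frame), payload)):
        packets.append(seq.to_bytes(2, "little") + frame[start:start + payload])
    return packets
```

On the receiving side, stripping the 2-byte header from each packet and concatenating the payloads in sequence order reconstructs the original frame.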
The repository is a monorepo containing all components of the Omi platform:
- app/: The Flutter mobile application. It manages device pairing, audio streaming, and state management via the Provider pattern (README.md:111).
- backend/: Python FastAPI services. Includes real-time transcription via WebSockets (/v4/listen), speaker identification via a dedicated diarizer service, and voice activity detection via vad (AGENTS.md:57-62).
- desktop/: Native macOS application. Features screen capture, OCR (Rewind feature), and a floating control bar for proactive AI assistance (README.md:110).
- omi/: Hardware designs and Zephyr-based firmware for the nRF5340-powered Omi wearable. Supports dual microphones, BLE, SD card storage, and haptic feedback (omi/firmware/omi/README.md:26-35).
- omiGlass/: Firmware and designs for the ESP32-S3 smart glasses variant, supporting a first-person camera and local AI integration (README.md:114).
- sdks/: Official SDKs for Python, Swift, and React Native for building custom integrations that connect to Omi devices (README.md:115).
- web/: Next.js applications, including the authenticated companion app, the public landing page, and AI persona hosting (README.md:116-117).
Omi differentiates itself through three main pillars:
The process_conversation function uses LLMs to extract action items, events, and memories from raw transcripts (docs/doc/developer/backend/backend_deepdive.mdx:140-146).
The system utilizes a streaming architecture. Audio is captured by the wearable's microphones, encoded into the Opus format (omi/firmware/omi/README.md:46), and transmitted over BLE. The backend's main.py exposes a REST API and relays audio streams to a pusher service via WebSockets (AGENTS.md:57).
Diagram: Audio Processing Service Map
When a recording session ends, the backend executes process_conversation() (docs/doc/developer/backend/backend_deepdive.mdx:52). This function triggers an LLM-based extraction pipeline that populates structured fields in Firestore:
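A hedged sketch of that extraction step: the field names below (title, action_items, events, memories) follow the description above but are illustrative, not the actual Firestore schema, and a production pipeline would validate the LLM's output (e.g. with Pydantic) and retry on malformed JSON.

```python
import json
from dataclasses import dataclass, field

@dataclass
class ConversationInsights:
    # Hypothetical structured fields mirroring what the extraction
    # pipeline is described as producing; not the real schema.
    title: str = ""
    action_items: list = field(default_factory=list)
    events: list = field(default_factory=list)
    memories: list = field(default_factory=list)

def parse_llm_extraction(raw: str) -> ConversationInsights:
    """Parse the JSON an extraction prompt asks the LLM to return.

    Tolerates missing keys by falling back to empty defaults, so a
    partially filled response still yields a usable record.
    """
    data = json.loads(raw)
    return ConversationInsights(
        title=data.get("title", ""),
        action_items=data.get("action_items", []),
        events=data.get("events", []),
        memories=data.get("memories", []),
    )
```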
Omi uses a multi-tier storage strategy: