Omi is an open-source AI wearable platform designed to capture, transcribe, and analyze conversations and screen data in real time. The ecosystem integrates custom hardware, cross-platform mobile and desktop applications, and a robust cloud backend to transform raw audio and visual data into structured memories, action items, and AI-driven insights (README.md:3-7).
The platform is organized into several key technical domains:
| Component | Location | Technology | Primary Entry Point |
|---|---|---|---|
| Backend | backend/ | Python (FastAPI) | backend/main.py (README.md:112) |
| Mobile App | app/ | Flutter (Dart) | app/lib/main.dart (README.md:111) |
| Desktop App | desktop/ | Swift (macOS) | desktop/Desktop/Sources/DesktopApp.swift (README.md:110) |
| Firmware (Omi) | omi/ | Zephyr RTOS (C) | omi/firmware/omi/src/main.c (omi/firmware/omi/README.md:11-13) |
| Firmware (Glass) | omiGlass/ | ESP32-S3 (Arduino/C++) | omiGlass/firmware/firmware.ino (README.md:114) |
| Web Apps | web/ | Next.js (TS) | web/app/, web/frontend/ (README.md:116-117) |
The core value proposition lies in its 100% open-source nature, allowing developers to customize everything from the PCB layouts and firmware to the LLM processing pipelines and third-party integrations via a modular plugin system (README.md:7-14).
The Omi ecosystem follows a distributed architecture where audio is captured at the edge (wearables, mobile, or desktop), streamed via Bluetooth Low Energy (BLE) or WebSockets to a gateway, and processed by a specialized FastAPI backend.
Diagram: Data Flow from Capture to AI Processing
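As a concrete illustration of the BLE leg of this pipeline, the sketch below splits one encoded audio frame into notification-sized packets with a small sequence header so the receiver can detect drops and reassemble in order. The 160-byte MTU and 2-byte header are illustrative assumptions for this sketch, not Omi's actual wire format.

```python
def chunk_for_ble(frame: bytes, mtu: int = 160) -> list[bytes]:
    """Split one encoded audio frame into BLE-notification-sized packets.

    Each packet is prefixed with a 2-byte little-endian sequence index.
    Packet size and header layout are hypothetical, chosen only to show
    the general shape of edge-side audio packetization.
    """
    payload = mtu - 2  # reserve 2 bytes for the sequence header
    packets = []
    for seq, start in enumerate(range(0, len(frame), payload)):
        packets.append(seq.to_bytes(2, "little") + frame[start:start + payload])
    return packets
```

On the receiving side, stripping the 2-byte header from each packet and concatenating the payloads in sequence order reconstructs the original frame.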
The repository is a monorepo containing all components of the Omi platform:
- app/: The Flutter mobile application. It manages device pairing, audio streaming, and state management via the Provider pattern (README.md:111).
- backend/: Python FastAPI services. Includes real-time transcription via WebSockets (/v4/listen), speaker identification via a dedicated diarizer service, and voice activity detection via vad (AGENTS.md:57-62).
- desktop/: Native macOS application. Features screen capture, OCR (Rewind feature), and a floating control bar for proactive AI assistance (README.md:110).
- omi/: Hardware designs and Zephyr-based firmware for the nRF5340-powered Omi wearable. Supports dual microphones, BLE, SD card storage, and haptic feedback (omi/firmware/omi/README.md:26-35).
- omiGlass/: Firmware and designs for the ESP32-S3 smart glasses variant, supporting a first-person camera and local AI integration (README.md:114).
- sdks/: Official SDKs for Python, Swift, and React Native for building custom integrations that connect to Omi devices (README.md:115).
- web/: Next.js applications, including the authenticated companion app, the public landing page, and AI persona hosting (README.md:116-117).
Omi differentiates itself through three main pillars:
The process_conversation function uses LLMs to extract action items, events, and memories from raw transcripts (docs/doc/developer/backend/backend_deepdive.mdx:140-146).
The system utilizes a streaming architecture. Audio is captured by the wearable's microphones, encoded into the Opus format (omi/firmware/omi/README.md:46), and transmitted over BLE. The backend's main.py exposes a REST API and relays audio streams to a pusher service via WebSockets (AGENTS.md:57).
Diagram: Audio Processing Service Map
When a recording session ends, the backend executes process_conversation() (docs/doc/developer/backend/backend_deepdive.mdx:52). This function triggers an LLM-based extraction pipeline that populates structured fields in Firestore:
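A hedged sketch of that extraction step: the field names below (title, action_items, events, memories) follow the description above but are illustrative, not the actual Firestore schema, and a production pipeline would validate the LLM's output (e.g. with Pydantic) and retry on malformed JSON.

```python
import json
from dataclasses import dataclass, field

@dataclass
class ConversationInsights:
    # Hypothetical structured fields mirroring what the extraction
    # pipeline is described as producing; not the real schema.
    title: str = ""
    action_items: list = field(default_factory=list)
    events: list = field(default_factory=list)
    memories: list = field(default_factory=list)

def parse_llm_extraction(raw: str) -> ConversationInsights:
    """Parse the JSON an extraction prompt asks the LLM to return.

    Tolerates missing keys by falling back to empty defaults, so a
    partially filled response still yields a usable record.
    """
    data = json.loads(raw)
    return ConversationInsights(
        title=data.get("title", ""),
        action_items=data.get("action_items", []),
        events=data.get("events", []),
        memories=data.get("memories", []),
    )
```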
Omi uses a multi-tier storage strategy: