Turn your Ray-Ban Meta glasses into a real-time AI assistant.
Your AI sees what you see. Talks back instantly. Remembers everything.
Open-source iOS app + backend runtime for agents that live in the real world — not just in chat.
Top 10 at the Mistral Worldwide Hackathon 2026 · Winner of Giant Ventures' "Future Unicorn" prize
Look at something → ask a question → get an answer instantly
Walk through the world → your AI remembers what mattered
Talk naturally → no screen, no typing, no app switching
Get started in minutes
```sh
pip install portworld
```

Imagine walking through a city with an AI that can:
- Identify buildings, objects, and landmarks in real time
- Answer follow-up questions by voice
- Remember what you saw earlier in the day
- Act through tools, APIs, and external agents
"What is this building?" "Where should I eat nearby?" "What was the name of that church we passed earlier?"
Port:World is a runtime for AI agents in the physical world.
Instead of building chatbots, you build agents that:
- See through a camera
- Hear through a microphone
- Speak back in real time
- Remember across sessions
- Act through tools
This is not an app — it's a platform. You bring the agent logic. Port:World handles streaming, model routing, memory, and the glasses.
- AI engineers building agent systems that go beyond text
- Developers exploring wearable + AI integration
- Builders who want to ship real-world AI — not another chat wrapper
Time to first interaction: ~5 minutes
```sh
pip install portworld
portworld init
portworld doctor --target local
```

Or clone and run with Docker:

```sh
git clone https://github.com/portworld/PortWorld.git && cd PortWorld
cp backend/.env.example backend/.env
# Set at least one provider key — see "Minimum config" below
docker compose up --build
```

Verify:

```sh
curl http://127.0.0.1:8080/livez
# → {"status":"ok","service":"portworld-backend"}
```

For the iOS app:

```sh
open IOS/PortWorld.xcodeproj
```

- Let Xcode resolve Swift Package dependencies
- Build the PortWorld scheme
- Enter your backend URL in Settings and validate the connection
The iOS app exposes a Siri shortcut that opens PortWorld and attempts to start an assistant session.
Available phrases:
- "Start PortWorld session"
- "Start assistant in PortWorld"
- "Launch PortWorld assistant session"
If onboarding is incomplete, or the backend or glasses are not ready, Siri still opens the app, but the session will not start until those requirements are met.
If your OpenClaw gateway runs on a different VM/VPS/cloud than PortWorld, use the openclaw-gateway-bridge agent skill for provider-agnostic setup and validation:
```sh
npx skills add portworld/PortWorld --skill openclaw-gateway-bridge -a codex -y
```

It guides secure endpoint exposure, auth wiring, and PortWorld OPENCLAW_* configuration end to end.
You only need one API key to start:
| Provider | backend/.env |
|---|---|
| OpenAI Realtime | REALTIME_PROVIDER=openai + OPENAI_API_KEY=sk-... |
| Gemini Live | REALTIME_PROVIDER=gemini_live + GEMINI_LIVE_API_KEY=... |
Vision, search, memory consolidation, and tool integrations are all off by default — enable them as you need them.
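As an example, a minimal `backend/.env` for the OpenAI Realtime path could be as small as the sketch below. The key value is a placeholder; only the two variables from the table above are set, and all optional features stay off:

```sh
# Pick one realtime provider and supply its key — everything else stays off.
REALTIME_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
```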
| Use case | How it works |
|---|---|
| Real-time travel guide | Walk through a city — the agent identifies landmarks, translates signs, suggests restaurants based on what it sees |
| Hands-free field assistant | Mechanics, surgeons, or technicians get step-by-step guidance while keeping both hands free |
| Accessibility companion | Describe scenes, read text aloud, identify objects and people for visually impaired users |
| Personal memory engine | "What was the name of that restaurant we passed?" — the agent remembers what you saw |
| Live coding pair | Point your glasses at a whiteboard or screen — the agent reads, reasons, and discusses |
| Security / inspection | Walk a site — the agent logs observations, flags anomalies, and generates reports |
| Your idea here | Port:World is a runtime, not a single app. Build whatever you want on top of it. |
┌──────────────┐ WebSocket (audio + control) ┌──────────────────┐
│ │ ◄──────────────────────────────────────► │ │
│ Ray-Ban │ │ FastAPI │
│ Meta │ HTTP (vision frames) │ Backend │
│ Glasses │ ──────────────────────────────────────► │ │
│ │ │ ┌────────────┐ │
│ ↕ DAT SDK │ │ │ Realtime │ │
│ │ │ │ Bridge │─┼──► OpenAI / Gemini
│ iPhone │ │ ├────────────┤ │
│ (bridge) │ │ │ Vision │─┼──► Mistral / Claude / GPT-4o / ...
│ │ │ ├────────────┤ │
│ │ │ │ Memory │ │
│ │ │ ├────────────┤ │
│ │ │ │ Tools │─┼──► Web search, MCP, OpenClaw, ...
│ │ │ └────────────┘ │
└──────────────┘ └──────────────────┘
Glasses capture audio and camera frames via Meta's DAT SDK. iPhone bridges glasses I/O to the backend over WebSocket (audio) and HTTP (vision). Backend routes audio to a realtime AI provider, processes vision frames through pluggable analyzers, manages persistent memory, and executes tools during the conversation.
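As a toy illustration of the bridge's audio path, the sketch below splits a PCM buffer into fixed-size frames of the kind a client might stream over the WebSocket. The frame duration, sample rate, and function name are illustrative assumptions, not Port:World's actual wire format:

```python
# Toy sketch: split a PCM16 mono buffer into fixed-duration frames
# suitable for streaming over a WebSocket. All parameters here are
# assumptions for illustration, not Port:World's actual wire format.

SAMPLE_RATE = 16_000   # samples per second (assumed)
BYTES_PER_SAMPLE = 2   # PCM16
FRAME_MS = 20          # frame duration in milliseconds (assumed)

def frame_pcm16(buffer: bytes) -> list[bytes]:
    """Split raw PCM16 audio into fixed-size frames, zero-padding the tail."""
    frame_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * FRAME_MS // 1000  # 640 bytes
    frames = []
    for start in range(0, len(buffer), frame_bytes):
        chunk = buffer[start:start + frame_bytes]
        if len(chunk) < frame_bytes:
            chunk = chunk + b"\x00" * (frame_bytes - len(chunk))  # pad last frame
        frames.append(chunk)
    return frames

# One second of silence → 50 frames of 640 bytes each
frames = frame_pcm16(b"\x00" * SAMPLE_RATE * BYTES_PER_SAMPLE)
```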
| Surface | What it does |
|---|---|
| backend/ | FastAPI server — realtime voice relay, vision pipeline, memory, tools, auth |
| IOS/ | SwiftUI app — glasses integration (DAT), audio capture, wake word, WebSocket transport |
| portworld_cli/ | Developer CLI — init, doctor, deploy, status, logs, providers, extensions |
| portworld_shared/ | Shared Python contracts between CLI and backend |
| Provider | ID | Key |
|---|---|---|
| OpenAI Realtime | openai | OPENAI_API_KEY |
| Gemini Live | gemini_live | GEMINI_LIVE_API_KEY |
| Provider | ID | Key(s) |
|---|---|---|
| Mistral | mistral | VISION_MISTRAL_API_KEY |
| OpenAI | openai | VISION_OPENAI_API_KEY |
| Gemini | gemini | VISION_GEMINI_API_KEY |
| Claude | claude | VISION_CLAUDE_API_KEY |
| Groq | groq | VISION_GROQ_API_KEY |
| NVIDIA | nvidia_integrate | VISION_NVIDIA_API_KEY |
| Azure OpenAI | azure_openai | VISION_AZURE_OPENAI_API_KEY + endpoint |
| AWS Bedrock | bedrock | VISION_BEDROCK_REGION (+ IAM) |
| Provider | ID | Key |
|---|---|---|
| Tavily | tavily | TAVILY_API_KEY |
```sh
portworld providers list       # see all available providers
portworld providers show <id>  # inspect a specific provider
```

Port:World is designed to be extended. Here's how developers plug into the system:
Tools are async functions the AI can call mid-conversation. Register a definition + executor in the tool catalog:

```python
registry.register(
    ToolDefinition(
        name="my_tool",
        description="Does something useful",
        parameters={"type": "object", "properties": { ... }},
    ),
    executor=my_tool_executor,
)
```

Implement a vision analyzer and register it in the vision factory. Your analyzer receives camera frames and returns semantic descriptions that feed into the agent's memory.
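An executor like my_tool_executor in the tool registration above is just an async callable. A minimal sketch follows; the argument and return shapes are assumptions for illustration, not the exact Port:World signature (see backend/README.md for that):

```python
import asyncio

# Hypothetical executor shape: an async callable that receives the
# tool's arguments and returns a JSON-serializable result. The exact
# signature Port:World expects may differ — check backend/README.md.
async def my_tool_executor(arguments: dict) -> dict:
    query = arguments.get("query", "")
    # ... do something useful here: an API call, a lookup, a computation ...
    return {"result": f"processed: {query}"}

result = asyncio.run(my_tool_executor({"query": "nearby restaurants"}))
```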
Implement the realtime bridge interface and register it in the provider registry. The bridge handles upstream audio streaming and tool dispatch for any new model API.
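The shape of such a bridge might resemble the abstract base class below. The method names and the trivial concrete subclass are illustrative assumptions to show the contract, not the actual interface from the provider registry:

```python
import abc
import asyncio

class RealtimeBridge(abc.ABC):
    """Illustrative shape of a realtime provider bridge.

    Method names here are assumptions for the sketch — the real
    interface lives in the backend's provider registry.
    """

    @abc.abstractmethod
    async def send_audio(self, pcm_frame: bytes) -> None:
        """Forward a client audio frame upstream to the model API."""

    @abc.abstractmethod
    def receive_events(self):
        """Yield transcript, audio, and tool-call events from the model."""

    @abc.abstractmethod
    async def dispatch_tool_result(self, call_id: str, result: dict) -> None:
        """Return a tool execution result to the in-flight conversation."""

class EchoBridge(RealtimeBridge):
    """Trivial concrete bridge, used here only to exercise the contract."""

    def __init__(self):
        self.sent: list[bytes] = []

    async def send_audio(self, pcm_frame: bytes) -> None:
        self.sent.append(pcm_frame)

    async def receive_events(self):
        for frame in self.sent:
            yield {"type": "audio", "bytes": len(frame)}

    async def dispatch_tool_result(self, call_id: str, result: dict) -> None:
        pass  # no upstream to notify in this toy sketch

bridge = EchoBridge()

async def _demo():
    await bridge.send_audio(b"\x00" * 4)
    return [event async for event in bridge.receive_events()]

events = asyncio.run(_demo())
```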
The backend supports Model Context Protocol (MCP) extensions. Drop a server config into the extensions system and expose new capabilities to the agent without touching core code.
Use the OpenClaw delegation layer to offload long-running or tool-heavy tasks to external agent runtimes, while Port:World stays the live conversational orchestrator.
See backend/README.md for the full API and extension reference.
Run locally:

```sh
docker compose up --build
```

Or deploy to the cloud:

```sh
portworld deploy gcp-cloud-run --project <id> --region <region>
portworld deploy aws-ecs-fargate --region <region>
portworld deploy azure-container-apps --subscription <sub> --resource-group <rg> --region <region>
```

See portworld_cli/README.md for readiness checks, log streaming, and redeployment.
Port:World is evolving from a hackathon winner into a full wearable AI platform:
- Agentic delegation — OpenClaw integration for heavy multi-step tasks
- Richer memory — identity, routines, social graph, preferences, with confidence tracking
- Passive context — ambient scene understanding even outside active conversations
- Proactive assistance — timely suggestions earned through context quality and user trust
- Siri / App Shortcuts — launch a session with your voice, no app interaction needed
- Android support — bring the same glasses-first experience to Android
- Multi-agent coordination — orchestrate multiple specialized agents in parallel
Full roadmap: docs/roadmap/AGENTIC_PERSONAL_ASSISTANT_ROADMAP.md
Port:World is in its first stable release phase. Core surfaces are release-ready.
- Stable: backend self-hosting, CLI bootstrap/deploy, iOS app with Meta glasses
- Shipping: first public PyPI + GHCR releases with v0.2.x
- Hardening: managed cloud defaults, operator docs, production security posture
- Provider API keys required — no keyless demo mode
- AWS/Azure one-click deploys use public DB access by default (tighten before production)
- Full glasses features require Meta hardware + Meta AI app
- Xcode test schemes are not yet maintained
| Doc | What's inside |
|---|---|
| backend/README.md | Backend API, config reference, storage, auth |
| portworld_cli/README.md | CLI install, commands, deploy workflows |
| IOS/README.md | iOS setup, Meta DAT, permissions, architecture |
| GETTING_STARTED.md | Extended onboarding for all setup paths |
| CHANGELOG.md | Release history |
Contributions are welcome. Please read CONTRIBUTING.md before opening a PR.
- Bug reports & features: open an issue
- Security: SECURITY.md
- Code of conduct: CODE_OF_CONDUCT.md
Built during the Mistral Worldwide Hackathon 2026 by Pierre Haas, Vassili de Rosen, and Arman Artola.
MIT — see LICENSE.