Port:World

Turn your Ray-Ban Meta glasses into a real-time AI assistant.

Your AI sees what you see. Talks back instantly. Remembers everything.
Open-source iOS app + backend runtime for agents that live in the real world — not just in chat.

_{Top 10 at the Mistral Worldwide Hackathon 2026 · Winner of Giant Ventures' "Future Unicorn" prize}

Look at something → ask a question → get an answer instantly
Walk through the world → your AI remembers what mattered
Talk naturally → no screen, no typing, no app switching

Get started in minutes

pip install portworld

Demo

Imagine walking through a city with an AI that can:

Identify buildings, objects, and landmarks in real time
Answer follow-up questions by voice
Remember what you saw earlier in the day
Act through tools, APIs, and external agents

"What is this building?" "Where should I eat nearby?" "What was the name of that church we passed earlier?"

What Is Port:World?

Port:World is a runtime for AI agents in the physical world.

Instead of building chatbots, you build agents that:

See through a camera
Hear through a microphone
Speak back in real time
Remember across sessions
Act through tools

This is not an app — it's a platform. You bring the agent logic. Port:World handles streaming, model routing, memory, and the glasses.

Who Is This For?

AI engineers building agent systems that go beyond text
Developers exploring wearable + AI integration
Builders who want to ship real-world AI — not another chat wrapper

Quickstart

Time to first interaction: ~5 minutes

Option A: Install the CLI (no clone required)

pip install portworld
portworld init
portworld doctor --target local

Option B: Clone and run with Docker

git clone https://github.com/portworld/PortWorld.git && cd PortWorld
cp backend/.env.example backend/.env
# Set at least one provider key — see "Minimum config" below
docker compose up --build

Verify:

curl http://127.0.0.1:8080/livez
# → {"status":"ok","service":"portworld-backend"}

Connect the iOS app

open IOS/PortWorld.xcodeproj

Let Xcode resolve Swift Package dependencies
Build the PortWorld scheme
Enter your backend URL in Settings and validate the connection

Siri shortcuts on iOS

The iOS app exposes a Siri shortcut that opens PortWorld and attempts to start an assistant session.

Available phrases:

Start PortWorld session
Start assistant in PortWorld
Launch PortWorld assistant session

If onboarding is incomplete or backend or glasses readiness is blocked, Siri still opens the app but the session will not start until those requirements are met.

OpenClaw (cross-cloud) setup

If your OpenClaw gateway runs on a different VM/VPS/cloud than PortWorld, use the openclaw-gateway-bridge agent skill for provider-agnostic setup and validation:

npx skills add portworld/PortWorld --skill openclaw-gateway-bridge -a codex -y

It guides secure endpoint exposure, auth wiring, and PortWorld OPENCLAW_* configuration end-to-end.

Minimum config

You only need one API key to start:

Provider	`backend/.env`
OpenAI Realtime	`REALTIME_PROVIDER=openai` + `OPENAI_API_KEY=sk-...`
Gemini Live	`REALTIME_PROVIDER=gemini_live` + `GEMINI_LIVE_API_KEY=...`

Vision, search, memory consolidation, and tool integrations are all off by default — enable them as you need them.

What You Can Build

Use case	How it works
Real-time travel guide	Walk through a city — the agent identifies landmarks, translates signs, suggests restaurants based on what it sees
Hands-free field assistant	Mechanics, surgeons, or technicians get step-by-step guidance while keeping both hands free
Accessibility companion	Describe scenes, read text aloud, identify objects and people for visually impaired users
Personal memory engine	"What was the name of that restaurant we passed?" — the agent remembers what you saw
Live coding pair	Point your glasses at a whiteboard or screen — the agent reads, reasons, and discusses
Security / inspection	Walk a site — the agent logs observations, flags anomalies, and generates reports
Your idea here	Port:World is a runtime, not a single app. Build whatever you want on top of it.

How It Works

┌──────────────┐       WebSocket (audio + control)       ┌──────────────────┐
│              │ ◄──────────────────────────────────────► │                  │
│  Ray-Ban     │                                          │   FastAPI        │
│  Meta        │       HTTP (vision frames)               │   Backend        │
│  Glasses     │ ──────────────────────────────────────►  │                  │
│              │                                          │   ┌────────────┐ │
│  ↕ DAT SDK   │                                          │   │ Realtime   │ │
│              │                                          │   │ Bridge     │─┼──► OpenAI / Gemini
│  iPhone      │                                          │   ├────────────┤ │
│  (bridge)    │                                          │   │ Vision     │─┼──► Mistral / Claude / GPT-4o / ...
│              │                                          │   ├────────────┤ │
│              │                                          │   │ Memory     │ │
│              │                                          │   ├────────────┤ │
│              │                                          │   │ Tools      │─┼──► Web search, MCP, OpenClaw, ...
│              │                                          │   └────────────┘ │
└──────────────┘                                          └──────────────────┘

Glasses capture audio and camera frames via Meta's DAT SDK. iPhone bridges glasses I/O to the backend over WebSocket (audio) and HTTP (vision). Backend routes audio to a realtime AI provider, processes vision frames through pluggable analyzers, manages persistent memory, and executes tools during the conversation.

Component map

Surface	What it does
`backend/`	FastAPI server — realtime voice relay, vision pipeline, memory, tools, auth
`IOS/`	SwiftUI app — glasses integration (DAT), audio capture, wake word, WebSocket transport
`portworld_cli/`	Developer CLI — init, doctor, deploy, status, logs, providers, extensions
`portworld_shared/`	Shared Python contracts between CLI and backend

Supported Providers

Realtime (voice)

Provider	ID	Key
OpenAI Realtime	`openai`	`OPENAI_API_KEY`
Gemini Live	`gemini_live`	`GEMINI_LIVE_API_KEY`

Vision (opt-in)

Provider	ID	Key(s)
Mistral	`mistral`	`VISION_MISTRAL_API_KEY`
OpenAI	`openai`	`VISION_OPENAI_API_KEY`
Gemini	`gemini`	`VISION_GEMINI_API_KEY`
Claude	`claude`	`VISION_CLAUDE_API_KEY`
Groq	`groq`	`VISION_GROQ_API_KEY`
NVIDIA	`nvidia_integrate`	`VISION_NVIDIA_API_KEY`
Azure OpenAI	`azure_openai`	`VISION_AZURE_OPENAI_API_KEY` + endpoint
AWS Bedrock	`bedrock`	`VISION_BEDROCK_REGION` (+ IAM)

Search & tools (opt-in)

Provider	ID	Key
Tavily	`tavily`	`TAVILY_API_KEY`

portworld providers list          # see all available providers
portworld providers show <id>     # inspect a specific provider

Extending Port:World

Port:World is designed to be extended. Here's how developers plug into the system:

Add a tool

Tools are async functions the AI can call mid-conversation. Register a definition + executor in the tool catalog:

registry.register(
    ToolDefinition(
        name="my_tool",
        description="Does something useful",
        parameters={"type": "object", "properties": { ... }},
    ),
    executor=my_tool_executor,
)

Add a vision provider

Implement a vision analyzer and register it in the vision factory. Your analyzer receives camera frames and returns semantic descriptions that feed into the agent's memory.

Add a realtime provider

Implement the realtime bridge interface and register it in the provider registry. The bridge handles upstream audio streaming and tool dispatch for any new model API.

Connect MCP servers

The backend supports Model Context Protocol (MCP) extensions. Drop a server config into the extensions system and expose new capabilities to the agent without touching core code.

Delegate to external agents

Use the OpenClaw delegation layer to offload long-running or tool-heavy tasks to external agent runtimes, while Port:World stays the live conversational orchestrator.

See backend/README.md for the full API and extension reference.

Deploy

Local (Docker Compose)

docker compose up --build

Cloud (one command)

portworld deploy gcp-cloud-run       --project <id> --region <region>
portworld deploy aws-ecs-fargate     --region <region>
portworld deploy azure-container-apps --subscription <sub> --resource-group <rg> --region <region>

See portworld_cli/README.md for readiness checks, log streaming, and redeployment.

Roadmap

Port:World is evolving from a hackathon winner into a full wearable AI platform:

Agentic delegation — OpenClaw integration for heavy multi-step tasks
Richer memory — identity, routines, social graph, preferences, with confidence tracking
Passive context — ambient scene understanding even outside active conversations
Proactive assistance — timely suggestions earned through context quality and user trust
Siri / App Shortcuts — launch a session with your voice, no app interaction needed
Android support — bring the same glasses-first experience to Android
Multi-agent coordination — orchestrate multiple specialized agents in parallel

Full roadmap: docs/roadmap/AGENTIC_PERSONAL_ASSISTANT_ROADMAP.md

Project Status

Port:World is in its first stable release phase. Core surfaces are release-ready.

Stable: backend self-hosting, CLI bootstrap/deploy, iOS app with Meta glasses
Shipping: first public PyPI + GHCR releases with v0.2.x
Hardening: managed cloud defaults, operator docs, production security posture

Known limitations

Provider API keys required — no keyless demo mode
AWS/Azure one-click deploys use public DB access by default (tighten before production)
Full glasses features require Meta hardware + Meta AI app
Xcode test schemes are not yet maintained

Documentation

Doc	What's inside
backend/README.md	Backend API, config reference, storage, auth
portworld_cli/README.md	CLI install, commands, deploy workflows
IOS/README.md	iOS setup, Meta DAT, permissions, architecture
GETTING_STARTED.md	Extended onboarding for all setup paths
CHANGELOG.md	Release history

Contributing

Contributions are welcome. Please read CONTRIBUTING.md before opening a PR.

Bug reports & features: open an issue
Security: SECURITY.md
Code of conduct: CODE_OF_CONDUCT.md

Origin

Built during the Mistral Worldwide Hackathon 2026 by Pierre Haas, Vassili de Rosen, and Arman Artola.

License

MIT — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 78 Commits
.github		.github
IOS		IOS
backend		backend
docs		docs
portworld_cli		portworld_cli
portworld_shared		portworld_shared
skills		skills
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
GETTING_STARTED.md		GETTING_STARTED.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
Port World logo.png		Port World logo.png
README.md		README.md
SECURITY.md		SECURITY.md
docker-compose.yml		docker-compose.yml
install.sh		install.sh
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

Port:World

Demo

What Is Port:World?

Who Is This For?

Quickstart

Option A: Install the CLI (no clone required)

Option B: Clone and run with Docker

Connect the iOS app

Siri shortcuts on iOS

OpenClaw (cross-cloud) setup

Minimum config

What You Can Build

How It Works

Component map

Supported Providers

Realtime (voice)

Vision (opt-in)

Search & tools (opt-in)

Extending Port:World

Add a tool

Add a vision provider

Add a realtime provider

Connect MCP servers

Delegate to external agents

Deploy

Local (Docker Compose)

Cloud (one command)

Roadmap

Project Status

Known limitations

Documentation

Contributing

Origin

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages