
pyraxo/clicky

 
 


Hi, this is Clicky.

It's an AI teacher that lives as a buddy next to your cursor. It can see your screen, talk to you, and even point at stuff. Kinda like having a real teacher next to you.

Clicky — an ai buddy that lives on your mac

This is the open-source version of Clicky for those who want to hack on it, build their own features, or just see how it works under the hood.

Get started with Claude Code

The fastest way to get this running is with Claude Code.

Once you get Claude running, paste this:

Hi Claude.

Clone https://github.com/farzaa/clicky.git into my current directory.

Then read the CLAUDE.md. I want to get Clicky running locally on my Mac.

Help me set up the local API keys in Info.plist and get it building in Xcode. Walk me through it.

That's it. It'll clone the repo, read the docs, and walk you through the whole setup. Once you're running you can just keep talking to it — build features, fix bugs, whatever. Go crazy.

Manual setup

If you want to do it yourself, here's the deal.

Prerequisites

1. Add your API keys locally

Populate these Info.plist values in leanring-buddy/Info.plist before running the app:

  • AnthropicAPIKey
  • AssemblyAIAPIKey
  • ElevenLabsAPIKey
  • ElevenLabsVoiceID
  • Optional: OpenAIAPIKey
  • Optional: OpenAITranscriptionModel
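For a sense of how those values can be consumed at runtime, here is a minimal sketch of reading them from an Info.plist-style dictionary. The key names come from the list above; the `APIKeys` struct, `APIKeyError`, and `loadAPIKeys(from:)` are hypothetical names for illustration and are not necessarily how the app itself does it.

```swift
import Foundation

/// API credentials read from Info.plist.
/// Hypothetical type: only the plist key names come from the README.
struct APIKeys {
    let anthropic: String
    let assemblyAI: String
    let elevenLabs: String
    let elevenLabsVoiceID: String
    let openAI: String?                    // optional key
    let openAITranscriptionModel: String?  // optional key
}

enum APIKeyError: Error, Equatable {
    case missing(String)
}

/// Builds APIKeys from an Info.plist-style dictionary.
/// Empty or whitespace-only strings are treated as missing.
func loadAPIKeys(from info: [String: Any]) throws -> APIKeys {
    func value(for key: String) -> String? {
        guard let raw = info[key] as? String else { return nil }
        let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
        return trimmed.isEmpty ? nil : trimmed
    }
    func requiredValue(for key: String) throws -> String {
        guard let found = value(for: key) else { throw APIKeyError.missing(key) }
        return found
    }
    return APIKeys(
        anthropic: try requiredValue(for: "AnthropicAPIKey"),
        assemblyAI: try requiredValue(for: "AssemblyAIAPIKey"),
        elevenLabs: try requiredValue(for: "ElevenLabsAPIKey"),
        elevenLabsVoiceID: try requiredValue(for: "ElevenLabsVoiceID"),
        openAI: value(for: "OpenAIAPIKey"),
        openAITranscriptionModel: value(for: "OpenAITranscriptionModel")
    )
}
```

In an app target this would typically be called as `try loadAPIKeys(from: Bundle.main.infoDictionary ?? [:])`, so a forgotten key fails early with its name instead of surfacing later as a failed API call.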

2. Open in Xcode and run

open leanring-buddy.xcodeproj

In Xcode:

  1. Select the leanring-buddy scheme (yes, the typo is intentional, long story)
  2. Set your signing team under Signing & Capabilities
  3. Hit Cmd + R to build and run

The app will appear in your menu bar (not the dock). Click the icon to open the panel, grant the permissions it asks for, and you're good.

Permissions the app needs

  • Microphone — for push-to-talk voice capture
  • Accessibility — for the global keyboard shortcut (Control + Option)
  • Screen Recording — for taking screenshots when you use the hotkey
  • Screen Content — for ScreenCaptureKit access
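If you want to see which of these grants are still missing without triggering a system prompt, a rough sketch could look like the following. `PermissionSnapshot`, `missingPermissions(in:)`, and `currentPermissions()` are hypothetical names. The macOS calls used (`AVCaptureDevice.authorizationStatus(for:)`, `AXIsProcessTrusted()`, `CGPreflightScreenCaptureAccess()`) are real system APIs, but this sketch does not check Screen Content separately, and it is not necessarily how the app itself checks permissions.

```swift
import Foundation
#if os(macOS)
import AVFoundation
import ApplicationServices
import CoreGraphics
#endif

/// Snapshot of the grants the app depends on (hypothetical type).
struct PermissionSnapshot {
    var microphone: Bool
    var accessibility: Bool
    var screenRecording: Bool
}

/// Returns the names of permissions that are not yet granted,
/// in the order they are listed in the README.
func missingPermissions(in snapshot: PermissionSnapshot) -> [String] {
    var missing: [String] = []
    if !snapshot.microphone { missing.append("Microphone") }
    if !snapshot.accessibility { missing.append("Accessibility") }
    if !snapshot.screenRecording { missing.append("Screen Recording") }
    return missing
}

#if os(macOS)
/// Reads the current grants. None of these calls shows a permission dialog.
func currentPermissions() -> PermissionSnapshot {
    PermissionSnapshot(
        microphone: AVCaptureDevice.authorizationStatus(for: .audio) == .authorized,
        accessibility: AXIsProcessTrusted(),
        screenRecording: CGPreflightScreenCaptureAccess()
    )
}
#endif
```

On a Mac, `missingPermissions(in: currentPermissions())` returns an empty array once everything in the sketch has been granted in System Settings.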

Architecture

If you want the full technical breakdown, read CLAUDE.md. But here's the short version:

Menu bar app (no dock icon) with two NSPanel windows — one for the control panel dropdown, one for the full-screen transparent cursor overlay. Push-to-talk streams audio over a websocket to AssemblyAI, sends the transcript + screenshot to Claude via streaming SSE, and plays the response through ElevenLabs TTS. Claude can embed [POINT:x,y:label:screenN] tags in its responses to make the cursor fly to specific UI elements across multiple monitors. API calls go directly to the vendors using keys configured locally in Info.plist.
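To make the pointing tags concrete, here is a minimal sketch of pulling `[POINT:x,y:label:screenN]` tags out of a response string. It assumes that `x` and `y` are plain numbers, that the label contains no `:` or `]`, and that `N` is an integer screen index. None of that is spelled out above, so treat the pattern as a guess at the format. `PointTag` and `parsePointTags(in:)` are hypothetical names.

```swift
import Foundation

/// One parsed pointing instruction. Field types are assumptions.
struct PointTag: Equatable {
    let x: Double
    let y: Double
    let label: String
    let screen: Int
}

/// Extracts every [POINT:x,y:label:screenN] tag found in `text`,
/// in the order the tags appear.
func parsePointTags(in text: String) -> [PointTag] {
    // Capture groups: 1 = x, 2 = y, 3 = label, 4 = screen number.
    let pattern = #"\[POINT:(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?):([^:\]]+):screen(\d+)\]"#
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return []
    }
    let source = NSString(string: text)
    let fullRange = NSRange(location: 0, length: source.length)
    return regex.matches(in: text, options: [], range: fullRange).compactMap { match -> PointTag? in
        guard
            let x = Double(source.substring(with: match.range(at: 1))),
            let y = Double(source.substring(with: match.range(at: 2))),
            let screen = Int(source.substring(with: match.range(at: 4)))
        else { return nil }
        let label = source.substring(with: match.range(at: 3))
        return PointTag(x: x, y: y, label: label, screen: screen)
    }
}
```

For example, `parsePointTags(in: "Click [POINT:120,340:Save button:screen1]")` yields a single tag with the label "Save button" on screen 1. Text outside the tags is ignored.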

Project structure

leanring-buddy/          # Swift source (yes, the typo stays)
  CompanionManager.swift    # Central state machine
  CompanionPanelView.swift  # Menu bar panel UI
  ClaudeAPI.swift           # Claude streaming client
  ElevenLabsTTSClient.swift # Text-to-speech playback
  OverlayWindow.swift       # Blue cursor overlay
  AssemblyAI*.swift         # Real-time transcription
  BuddyDictation*.swift     # Push-to-talk pipeline
CLAUDE.md                # Full architecture doc (agents read this)
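For orientation on what a Claude streaming client has to handle, here is a small sketch that extracts text fragments from server-sent event lines. It assumes the `content_block_delta` / `text_delta` event shape from Anthropic's Messages streaming format. `textDelta(fromSSELine:)` is a hypothetical helper and is not taken from ClaudeAPI.swift.

```swift
import Foundation

/// Returns the text fragment carried by one SSE line,
/// or nil for any line that is not a text delta.
func textDelta(fromSSELine line: String) -> String? {
    // Only "data:" lines carry a JSON payload; "event:" lines and blanks are skipped.
    guard line.hasPrefix("data:") else { return nil }
    let payload = line.dropFirst("data:".count).trimmingCharacters(in: .whitespaces)
    guard
        let data = payload.data(using: .utf8),
        let object = try? JSONSerialization.jsonObject(with: data, options: []) as? [String: Any],
        object["type"] as? String == "content_block_delta",
        let delta = object["delta"] as? [String: Any],
        delta["type"] as? String == "text_delta"
    else { return nil }
    return delta["text"] as? String
}
```

Feeding each received line through this helper and joining the non-nil results rebuilds the response text as it streams in.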

Contributing

PRs welcome. If you're using Claude Code, point it at CLAUDE.md so it can pick up the codebase, then just tell it what you want to build.

About

Clicky without telemetry


Languages

  • Swift 96.1%
  • Shell 3.9%