An iOS app that uses AI to generate on-demand audiobook snippets for Kindle.
Read the blog post for a detailed writeup on the technical challenges and how it works.
- On-demand audiobook generation – Generate audio for any Kindle book from your current reading position
- Multiple TTS providers – Choose between ElevenLabs and Cartesia voices
- Configurable duration – Specify how many minutes of audio to generate (1-8 at a time)
- Auto-seek playback – When resuming, audio automatically seeks to match your current Kindle position
- Bidirectional progress sync – Listening progress syncs back to Kindle in real-time
- LLM text preprocessing – Optional GPT-based cleanup of OCR text before synthesis, including custom pauses for Cartesia
- Audiobook library – Browse and replay previously generated audiobooks
- Background playback – Lock screen controls and background audio support
I usually read on a Kindle Paperwhite, but when I don't have it with me I use the Kindle app on my phone. Sometimes I want to continue reading after I have to stop using my phone (e.g. I get off the subway and walk to my destination); an audiobook would be perfect for this.
However, I don't normally listen to audiobooks, so I wouldn't want to purchase the audiobook version of my book just for a few minutes of listening when I can't use my phone. Audiobooks on Kindle also only sync with the printed book when Whispersync-for-Voice is enabled. These two factors led me to build this app.
- ios-app: SwiftUI client that captures Kindle session info and drives audiobook generation, listening, and management. See ios-app-architecture.md.
- server: Fastify backend orchestrating Kindle fetches and text extraction. Uses my fork of kindle-api for Kindle interactions. See server-architecture.md.
- text-extraction-stubs: Stub interfaces for the text extraction module (implementation not included).
- tls-client-api: Submoduled TLS proxy binary used by the backend for Amazon requests (repo).
Clone the repository with submodules:
git clone --recursive https://github.com/ryanbbrown/kindle-storyteller.git
There are three ways to test the app (iPhone simulator on Mac, iPhone on local network, or iPhone against production), so you need to tell the iOS app which server to connect to. The API base URL is stored in ios-app/KindleAudioApp/Config.xcconfig (git-ignored). API_BASE_HOST should be one of:
- localhost:3000 – local development with simulator
- <your-mac-ip>:3000 – testing on a physical iPhone connected to the same network
- <your-fly-app>.fly.dev – production Fly.io deployment, works for simulator or actual iPhone
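As a small illustration, the three options can be captured in a shell helper that prints the value to put in API_BASE_HOST. The function name `api_base_host` and its mode names (`simulator`, `device`, `fly`) are hypothetical and not part of the repository; only the host formats come from the list above.

```shell
# Hypothetical helper: print the API_BASE_HOST value for a testing environment.
# $1 = simulator | device | fly
# $2 = your Mac's LAN IP (device) or your Fly app name (fly)
api_base_host() {
  case "$1" in
    simulator) echo "localhost:3000" ;;
    device)    echo "${2}:3000" ;;
    fly)       echo "${2}.fly.dev" ;;
    *)         echo "usage: api_base_host simulator|device|fly [value]" >&2
               return 1 ;;
  esac
}
```

For example, `api_base_host device 192.168.1.20` prints `192.168.1.20:3000`. On macOS, `ipconfig getifaddr en0` usually prints the Wi-Fi address, though the interface name may differ on your machine.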
For running on a physical iPhone, see 4.0 iOS App Installation.
Copy the example config files and fill in your values:
- server/.env.example → server/.env
- text-extraction/.env.example → text-extraction/.env
- ios-app/KindleAudioApp/Config.xcconfig.example → ios-app/KindleAudioApp/Config.xcconfig
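A minimal sketch of that copy step, run from the repository root. The `copy_example` helper is hypothetical (not part of the repo); it skips sources that are missing, such as text-extraction/ if you have not added your own extraction module, and never overwrites a file you have already filled in.

```shell
# Hypothetical helper: copy an example config to its real location.
# $1 = example file, $2 = destination. Existing destinations are left alone.
copy_example() {
  if [ ! -f "$1" ]; then
    echo "skip: $1 not found"
  elif [ -e "$2" ]; then
    echo "keep: $2 already exists"
  else
    cp "$1" "$2" && echo "created: $2"
  fi
}

copy_example server/.env.example server/.env
copy_example text-extraction/.env.example text-extraction/.env
copy_example ios-app/KindleAudioApp/Config.xcconfig.example ios-app/KindleAudioApp/Config.xcconfig
```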
Server (server/.env):
- SERVER_API_KEY – Secret key for authenticating iOS client requests. Set to any secure random string, then configure the same value in Config.xcconfig.
- TLS_SERVER_API_KEY – API key for the TLS proxy (can be any string).
- ELEVENLABS_API_KEY – ElevenLabs API key for text-to-speech.
- CARTESIA_API_KEY – Cartesia API key for text-to-speech.
- OPENAI_API_KEY – OpenAI API key for LLM-based text preprocessing.
Text extraction (text-extraction/.env):
- OCRSPACE_API_KEY – OCR.space API key for text extraction from images.
iOS app (ios-app/KindleAudioApp/Config.xcconfig):
- API_BASE_HOST – Server hostname (see Testing Environments above).
- SERVER_API_KEY – Must match the server's SERVER_API_KEY.
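One way to produce a "secure random string" for SERVER_API_KEY is to hex-encode 32 random bytes. This is only a sketch: the `generate_api_key` helper is hypothetical, and the exact line format expected by the example files (for instance whether values are quoted) is an assumption, so check the .example files before pasting.

```shell
# Hypothetical helper: print a 64-character hex secret from 32 random bytes.
generate_api_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

key="$(generate_api_key)"
# The same value must go into both files (line formats are assumed).
echo "server/.env:      SERVER_API_KEY=$key"
echo "Config.xcconfig:  SERVER_API_KEY = $key"
```

The same helper can be reused for TLS_SERVER_API_KEY, which the list above says can be any string.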
The TLS proxy requires Go to build from source:
- Build the binaries:
cd tls-client-api/cmd/tls-client-api && ./build.sh
- Copy the config template and add your API key:
cp tls-client-api/cmd/tls-client-api/config.dist.yml tls-client-api/dist/config.yml
- Edit tls-client-api/dist/config.yml and add your TLS_SERVER_API_KEY to the api_auth_keys array.
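A mismatch between the proxy's key and the server's key is easy to miss, so a quick check can help. The sketch below uses a hypothetical `tls_key_configured` function and assumes server/.env uses plain `NAME=value` lines. It only confirms that the key's value appears somewhere in the proxy config, not that it sits inside api_auth_keys specifically.

```shell
# Hypothetical check: succeed if the TLS_SERVER_API_KEY value from an env file
# appears verbatim in the TLS proxy config.
# $1 = env file (e.g. server/.env)
# $2 = proxy config (e.g. tls-client-api/dist/config.yml)
tls_key_configured() {
  key="$(grep '^TLS_SERVER_API_KEY=' "$1" | head -n 1 | cut -d= -f2-)"
  # Strip optional surrounding double quotes from the .env value.
  key="${key#\"}"
  key="${key%\"}"
  # Fail on an empty key; otherwise do a fixed-string search in the config.
  [ -n "$key" ] && grep -qF -- "$key" "$2"
}
```

Usage: `tls_key_configured server/.env tls-client-api/dist/config.yml && echo "proxy key matches"`.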
The text-extraction-stubs directory contains interface definitions for text extraction functionality.
This application requires a working text extraction module that is not included in this repository.
See text-extraction-stubs/README.md for the expected interface and return types.
- Start the TLS proxy:
cd tls-client-api/dist && ./tls-client-api-darwin-arm64
- Run the Fastify backend:
cd server && pnpm dev
- With both services running, open the iOS project in Xcode and build + run it.
The repository includes a multi-stage Dockerfile, process supervisor (start.sh), and fly.toml to run the Fastify backend, TLS proxy, and text-extraction pipeline in a single Fly machine.
- Create the Fly app:
fly launch --no-deploy --copy-config
Adjust the generated app name/region inside fly.toml if needed.
- Configure secrets – set the server API key, TLS proxy key, TTS keys, LLM key, and OCR key:
fly secrets set \
  SERVER_API_KEY="your-key" \
  TLS_SERVER_API_KEY="your-key" \
  ELEVENLABS_API_KEY="your-key" \
  CARTESIA_API_KEY="your-key" \
  OPENAI_API_KEY="your-key" \
  OCRSPACE_API_KEY="your-key"
Or, if you've already configured server/.env, set all secrets from it:
grep -v '^#' server/.env | grep '=' | xargs fly secrets set
All Kindle cookies/tokens must come from the iOS client; there's no server-side fallback.
- Deploy:
fly deploy --remote-only
The container installs Node 20, Python 3.12 + uv, and builds the Go TLS proxy. The start.sh entrypoint boots the TLS proxy first, then starts the Fastify server.
- Point the iOS app at Fly – update API_BASE_HOST in your Config.xcconfig to your Fly hostname.
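Before piping server/.env into fly secrets set (the secrets step above), it can be useful to preview which variables would be sent. The sketch below reuses the same two grep filters and prints only the names, never the values. The `env_secret_names` function is hypothetical and not part of the repository.

```shell
# Hypothetical dry-run helper: list the variable names that the
# grep | xargs pipeline would pass to `fly secrets set`.
# $1 = env file (e.g. server/.env)
env_secret_names() {
  grep -v '^#' "$1" | grep '=' | cut -d= -f1
}
```

Usage: `env_secret_names server/.env`. Per the configuration section, OCRSPACE_API_KEY lives in text-extraction/.env rather than server/.env, so it may need to be set separately if it does not show up in this list.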
To run the app on a physical iPhone instead of the simulator, follow the steps below on your Mac.
Not all of these are necessary if you've done iOS app development on your Mac before. Note that I don't use a paid developer account, so the app has to be re-signed every 7 days.
Expand to see steps
- Open Xcode
- Click "Xcode" -> "Settings"
- Click "Apple Accounts"
- Add your Apple account
- Click on the account, then "Personal Team", then "Manage Certificates..."
- If a certificate doesn't exist, click the "+" in the bottom left, then "Apple Development" to create a new certificate
Expand to see steps
- Open the KindleAudioApp project in Xcode
- In the navigation bar on the left, select the top-level "KindleAudioApp"
- Go to the "Signing & Capabilities" tab
- Under "Team", select your "Personal Team" (your Apple ID) that you added.
- Ensure "Automatically manage signing" is enabled.
- Set "Bundle Identifier" to something unique; I used com.example.KindleAudioApp (probably should change lol)
- Near the top left, click "+ Capability", then click on the "Background Modes" capability to add it
- Expand it and check the "Audio, AirPlay, and Picture in Picture" box
Expand to see steps
- Plug your iPhone into your Mac
- Click "Trust" if you haven't already
- In the top center bar of Xcode, click on the device selector and select your iPhone
- Click the Run (▶) button
- You'll receive a pop-up that says "Developer Mode disabled"; go to Settings -> Privacy & Security, scroll all the way down, click on "Developer Mode", then enable it, restart your phone, and accept any prompts
- Click the Run button again if needed
This project is a personal proof-of-concept for generating short audio snippets from books you own. It's intended for temporary, on-the-go listening when you can't look at a screen—not as a replacement for purchasing audiobooks.
If you enjoy a book in audio form, please support the author and narrator by buying the official audiobook. Professional narrators bring craft and interpretation that AI-generated speech can't replicate.
Use this tool only with content you've legitimately purchased, and at your own risk.