"Your phone becomes a complete AI workstation: Chat, Voice, Images, and Knowledge—all running offline with zero compromises."
The most comprehensive privacy-focused AI ecosystem for mobile devices. Run LLMs, generate images, use voice AI, inject custom knowledge—all completely offline. Or seamlessly connect to 100+ cloud models when you need more power. No subscriptions. No data harvesting. Complete control.
GitHub Releases • Join Discord • Documentation
"Sometimes the bravest thing you can do is step back to move forward stronger."
Development temporarily paused as of December 2025.
After careful consideration, I've made the difficult decision to temporarily pause active development on ToolNeuron. This decision comes down to a few key factors:
Why I'm pausing:
- My internship requires my full attention right now, and I need to prioritize it financially
- Managing both my internship and this project simultaneously has become unsustainable
- The mental load of juggling two complex projects has been taking a toll on my health and well-being
- The project needs more maintainers than I can currently provide alone
What this means:
- The repository will remain available, and anyone interested in forking and continuing development is more than welcome to do so
- Once my internship situation stabilizes and I have the bandwidth to give this project the attention it deserves, I plan to resume development
- The Discord community remains open for discussions, collaboration, and support
A heartfelt thank you:
I'm genuinely grateful to everyone who has supported this project and stuck around. Your interest and contributions have meant a lot to me. I'm sorry I couldn't deliver the complete product as I had hoped, but I believe this pause is necessary for both the project's future and my own health.
If you're interested in maintaining or contributing to a fork, please feel free to reach out via Discord. I'd love to see this project continue in some form.
Thank you for understanding.
ToolNeuron is the first Android application to combine Chat AI (LLMs), Image Generation (Stable Diffusion), Voice AI (TTS/STT), and Custom Knowledge Injection (RAG) in a single, privacy-first package. Everything runs entirely on-device with zero internet dependency, or connect to cloud models for maximum flexibility.
Stop choosing between privacy and power. ToolNeuron gives you both.
Three Operating Modes:
- 🔒 Privacy Mode — Execute GGUF models (Llama 3, Mistral, Gemma), generate images with Stable Diffusion 1.5, use voice AI, and inject custom knowledge—all completely offline. Your data never leaves your phone.
- ⚡ Power Mode — Access 100+ premium cloud models (GPT-4, Claude 3.5, Gemini, DALL-E) via OpenRouter for complex tasks requiring maximum capability.
- 🔄 Hybrid Intelligence — Seamlessly switch between offline and cloud modes mid-conversation while preserving full context and conversation history.
🤖 Complete AI Suite Offline
The only mobile app that combines chat, image generation, voice, and knowledge injection—all running on-device without internet.
🎨 On-Device Image Generation
Run Stable Diffusion 1.5 (censored & uncensored) completely offline. Generate images on flights, in remote areas, anywhere.
🧠 RAG Data-Packs
Inject Wikipedia dumps, coding documentation, personal notes, or any custom knowledge directly into AI context—no model retraining required.
🔌 Extensible Plugin System
Add web search, content scraping, document analysis, and more. Build your own plugins for unlimited extensibility.
🎙️ Premium Offline Voice
11 professional TTS voices + Whisper STT—all running on-device with zero cloud dependencies and near-instant processing.
🌐 100+ Cloud Models
When you need maximum power, seamlessly access GPT-4, Claude, Gemini, and 100+ other models via OpenRouter integration.
Local Execution
Native support for GGUF model formats using llama.cpp. Run models like Llama 3, Mistral, Gemma, Phi, and more entirely on-device with optimized quantization for mobile hardware.
Cloud Orchestration
Unified API integration through OpenRouter provides instant access to 100+ state-of-the-art models without vendor lock-in or multiple subscriptions.
Intelligent Streaming
Real-time token generation with context-aware memory management ensures smooth performance whether running locally or in the cloud.
Stable Diffusion 1.5
Full SD 1.5 implementation running completely offline on your phone. Generate high-quality images in 30-90 seconds depending on your device.
Censored & Uncensored Options
Choose between SFW (censored) or uncensored models for artistic freedom and research applications.
Optimized for Mobile
Specially quantized and optimized to run on phones with 6GB+ RAM while maintaining image quality.
Dynamic Knowledge Injection
Mount custom datasets (JSON, text, markdown) to enhance AI responses with specialized knowledge without retraining models.
Use Cases:
- Inject Wikipedia dumps for educational queries
- Load coding documentation for development assistance
- Add personal notes or company data for context-aware responses
- Import research papers or domain-specific knowledge
Plugin Integration
Data-Packs work seamlessly with both local GGUF models and cloud models for maximum flexibility.
Text-to-Speech (TTS)
Powered by Sherpa-ONNX, includes 11 professional-grade voices (5 American Female, 2 American Male, 2 British Female, 2 British Male) running entirely on CPU/NPU with zero cloud dependencies.
Speech-to-Text (STT)
Offline Whisper-powered speech recognition for hands-free AI interaction. Perfect for driving, multitasking, or accessibility needs.
Zero Latency
All voice processing happens on-device with near-instantaneous synthesis and recognition.
Available Now:
- Web Search — Real-time information retrieval with search engine integration
- Web Scraper — Extract and inject content from any URL into conversation context
- DataHub — Mount and manage custom knowledge bases dynamically
- Document Viewer — Analyze and discuss PDF/text documents with AI
Coming Soon:
- Code execution environments
- Advanced image processing pipelines
- Multi-document analysis
- Custom plugin marketplace
- Conversation Persistence — Full chat history with efficient SQLite storage
- Dynamic Datasets — Attach custom knowledge without model retraining
- Context Preservation — Switch models mid-conversation without losing thread
- Export Options — Save conversations, code snippets, and generated images
- Multi-Session — Manage multiple conversation threads simultaneously
| Chat Interface Multi-modal conversations |
Model Hub 100+ models available |
Code Canvas Syntax highlighting & export |
Settings Complete customization |
![]() |
![]() |
![]() |
![]() |
| Feature | ToolNeuron | ChatGPT Mobile | Other AI Apps |
|---|---|---|---|
| Offline Chat (LLMs) | ✅ Full GGUF support | ❌ Cloud only | |
| Offline Image Generation | ✅ Stable Diffusion 1.5 | ❌ | ❌ |
| Offline Voice (TTS/STT) | ✅ 11 voices + Whisper | ❌ Cloud only | |
| Custom Knowledge (RAG) | ✅ Data-Packs system | ❌ | ❌ |
| Plugin Extensibility | ✅ Open architecture | ❌ | ❌ |
| Cloud Model Access | ✅ 100+ via OpenRouter | ✅ 1 model | |
| Uncensored Options | ✅ User choice | ❌ Heavily filtered | ❌ Restricted |
| Privacy Architecture | ✅ Local-first, zero logging | ❌ Server logging | ❌ Data harvesting |
| Pricing Model | ✅ Free (BYOK optional) | ❌ $20/month | ❌ $10-60/month |
| Source Code | ✅ Apache 2.0 | ❌ Proprietary | ❌ Closed source |
| Works Without Internet | ✅ Full functionality | ❌ Useless offline |
Visit ToolNeuron on APKPure for the latest stable release with automatic update notifications.
Download the latest release from GitHub Releases and install ToolNeuron-Beta-5.1.apk on Android 8.0+ devices.
# Clone repository
git clone https://github.com/Siddhesh2377/NeuroVerse.git
cd NeuroVerse
# Open in Android Studio (Ladybug or newer)
# Sync Gradle dependencies
./gradlew assembleDebug
# Install on connected device
./gradlew installDebug1. Load a Chat Model (GGUF)
- Download a GGUF model from HuggingFace
- Recommended:
Llama-3-8B-Q4_K_M.gguf(4.5GB) - Budget:
TinyLlama-1.1B-Q4_K_M.gguf(669MB)
- Recommended:
- Navigate to Settings → Local Models → Import Model
- Select your downloaded GGUF file
- Wait for model to load, then start chatting offline!
2. Load Image Generation (Stable Diffusion)
- Download SD 1.5 model (censored or uncensored version)
- Navigate to Settings → Image Models → Import SD Model
- Select model file
- Generate images completely offline!
3. Enable Voice AI
- TTS voices are included by default (no download needed)
- For STT: Download Whisper model from Settings → Voice Models
- Enable voice input in chat interface
4. Create RAG Data-Packs
- Prepare your knowledge in JSON/text format
- Navigate to DataHub → Create New Pack
- Import your data files
- Attach to conversations for enhanced context
- Visit OpenRouter.ai and create account
- Generate an API key (free tier available)
- In ToolNeuron: Settings → API Configuration
- Enter your OpenRouter API key
- Access 100+ models instantly (GPT-4, Claude, Gemini, etc.)
Simply switch between local and cloud models mid-conversation:
- Use offline LLM for privacy-sensitive queries
- Switch to GPT-4 for complex reasoning tasks
- Return to offline for continued privacy
- All context is preserved automatically!
- Operating System: Android 8.0+ (API 26)
- RAM: 4GB
- Storage: 2GB available space
- Use Case: Cloud models + basic TTS only
- Operating System: Android 10+
- RAM: 6GB+ (8GB preferred)
- Processor: Snapdragon 8 Gen 1 / Dimensity 8100 or equivalent
- Storage: 5GB+ available space
- NPU: Optional but improves performance significantly
- Operating System: Android 11+
- RAM: 8GB minimum (12GB preferred)
- Processor: Snapdragon 8 Gen 2 or equivalent flagship
- Storage: 8GB+ available space (for SD models)
- Generation Time: 30-90 seconds depending on device
- RAM: 12GB+
- Processor: Snapdragon 8 Gen 3 or equivalent
- Storage: 10GB+ free space
- Experience: Smooth chat + image generation + voice AI
- ✅ GGUF model support with llama.cpp
- ✅ 11 offline TTS voices via Sherpa-ONNX
- ✅ OpenRouter cloud integration (100+ models)
- ✅ Plugin system (Web Search, Scraper, DataHub)
- ✅ RAG Data-Packs for knowledge injection
- 🚧 Stable Diffusion 1.5 offline image generation
- 🚧 Offline Whisper STT integration
- Multi-voice TTS conversations (different voices for different characters)
- Advanced code export with syntax highlighting
- Desktop companion app (Windows/Linux sync)
- Enhanced plugin marketplace
- Vector database for long-term memory
- Multi-modal vision models (LLaVA, GPT-4V integration)
- TFLite and ONNX runtime support
- On-device video analysis
- Collaborative AI sessions
- Custom model fine-tuning tools
- Cross-platform synchronization (phone ↔ desktop)
- Community plugin marketplace
- Advanced RAG with semantic search
- Enterprise deployment options
- API for third-party integration
- Test prompts and APIs without cloud costs during development
- Run coding assistants offline on flights or with poor connectivity
- Inject documentation into RAG for context-aware code help
- Generate UI mockups and diagrams with SD
- Privacy-first development environment
- Zero data leaves your device in offline mode
- Verify privacy claims (open-source Apache 2.0)
- No tracking, no telemetry, no server logging
- Full control over your AI interactions
- Uncensored options for research and legitimate use
- Full AI capability on flights (no WiFi needed)
- Works in remote areas with no connectivity
- No roaming data costs for AI queries
- Generate travel content (images, itineraries) offline
- Voice translations without cloud latency
- Generate images for social media posts anywhere
- Brainstorm content ideas with AI offline
- Create variations and iterations without API limits
- No subscription costs eating into creator budgets
- Uncensored artistic freedom
- Free access to cutting-edge AI models
- Study AI without expensive subscriptions
- Load research papers into RAG for analysis
- Generate diagrams and visualizations
- Privacy for sensitive academic work
ToolNeuron implements modern Android development patterns with a hybrid native/Kotlin architecture:
Core Technologies:
- Language: Kotlin (UI/Logic) + C++ (Inference engines)
- UI Framework: Jetpack Compose (declarative, reactive UI)
- Local Inference: llama.cpp (GGUF models) + JNI bindings
- Image Generation: Stable Diffusion C++ implementation
- TTS Engine: Sherpa-ONNX (neural voices)
- STT Engine: Whisper via Sherpa-ONNX
- API Layer: Retrofit + OkHttp (cloud models)
- Database: Room (SQLite wrapper) for conversations
- Async Operations: Kotlin Coroutines + Flow
- Dependency Injection: Hilt/Dagger
Performance Optimizations:
- Quantized model support (Q4_K_M, Q5_K_S, etc.)
- Context caching for faster inference
- Memory-mapped model loading
- NPU acceleration where available
- Efficient token streaming
- Background processing with WorkManager
We welcome contributions from developers, researchers, AI enthusiasts, and privacy advocates!
- Fork the repository
- Create a feature branch (
git checkout -b feature/AmazingFeature) - Commit your changes with descriptive messages
- Push to your branch (
git push origin feature/AmazingFeature) - Open a Pull Request with detailed description
High Priority:
- 🐛 Bug reports and fixes (especially device-specific issues)
- 📚 Documentation improvements and translations
- 🧪 Testing on various Android devices and chipsets
- 🔌 New plugin development
- 🎨 UI/UX enhancements
Medium Priority:
- 🌍 Internationalization (i18n) - help us support more languages
- ♿ Accessibility improvements
- 📊 Performance optimizations
- 🎓 Tutorial content and guides
Feature Requests:
- Check existing issues before creating new ones
- Provide clear use cases and examples
- Be patient - we're a small team!
- Follow Kotlin coding conventions
- Write meaningful commit messages
- Test on real devices when possible
- Document new features
- Respect user privacy in all contributions
Distributed under the Apache 2.0 License. See LICENSE for complete terms.
What this means:
- ✅ Commercial use - Use in commercial products
- ✅ Modification - Modify and create derivatives
- ✅ Distribution - Distribute freely
- ✅ Patent use - License includes patent rights
- ✅ Private use - Use privately without restrictions
Requirements:
- 📄 Include license and copyright notice
- 📝 Document any changes made
- 🔓 Make source available if distributing
"If I have seen further, it is by standing on the shoulders of giants." — Isaac Newton
ToolNeuron would not be possible without these exceptional open-source projects:
- llama.cpp by Georgi Gerganov — Efficient LLM inference in pure C/C++
- Sherpa-ONNX — Premium offline speech synthesis and recognition
- Stable Diffusion — Revolutionary text-to-image generation
- OpenRouter — Unified API gateway for 100+ AI models
- Jetpack Compose — Modern declarative UI for Android
- HuggingFace — Community and models that make AI accessible
Special thanks to the open-source AI community for making privacy-respecting AI possible.
- 💬 Discord Community — Real-time chat, support, and discussions
- 🐛 Issue Tracker — Report bugs or request features
- 💡 GitHub Discussions — Technical questions and ideas
- 📧 Email: Support — For private inquiries
- ⭐ Star this repository to show support and get updates
- 👀 Watch releases for new features and updates
- 🐦 Follow on Twitter @ToolNeuron — News and announcements
- 📱 APKPure — Automatic update notifications
Q: Will this drain my battery?
A: Local inference is power-intensive. For long sessions, keep your phone plugged in. Cloud mode uses minimal battery.
Q: How big are the model files?
A: GGUF models: 0.5GB-8GB depending on model size. SD 1.5: ~2GB. TTS/STT: 50-500MB.
Q: Can I use my own API keys?
A: Yes! BYOK (Bring Your Own Key) for OpenRouter. You control costs and usage.
Q: Is my data really private?
A: In offline mode, absolutely nothing leaves your device. Verify in our open-source code.
Q: Why not just use ChatGPT?
A: ToolNeuron gives you choice, privacy, offline capability, uncensored options, and zero subscriptions.
Q: Does it support iOS?
A: Not currently. Android only due to technical constraints of iOS.
Q: Can I monetize apps built with this?
A: Yes! Apache 2.0 license allows commercial use.
Built with ❤️ by Siddhesh2377 and the Open Source Community
Privacy-first AI for everyone, everywhere
If ToolNeuron empowers your AI journey, please ⭐ star the repository!
Download • Report Bug • Request Feature • View Roadmap • Join Discord
Made possible by llama.cpp • Sherpa-ONNX • Stable Diffusion • OpenRouter • Jetpack Compose



