To use TT-Studio's deployment features, you need access to a Tenstorrent AI accelerator.
Alternatively, you can connect to remote endpoints running models on Tenstorrent cards without local hardware.
TL;DR: TT-Studio is an easy-to-use web interface for running AI models on Tenstorrent hardware. It handles all the technical setup automatically and gives you a simple GUI to deploy models, chat with models, and more.
TT-Studio combines TT Inference Server's packaging, containerization, and deployment automation with TT-Metal's model execution framework, which is optimized for Tenstorrent hardware, and adds an intuitive GUI for model management and interaction.
Before you start, make sure you have:
⚠️ IMPORTANT: Complete the base Tenstorrent software installation first: follow the Tenstorrent Getting Started Guide.
This guide covers hardware setup, driver installation, and system configuration. You must complete this before using TT-Studio.
Also ensure you have:
- Python 3.8+ (Download here)
- Docker (Installation guide)
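Before running the setup, it can save time to confirm both prerequisites from your shell. A minimal check (adjust for your platform):

```shell
# Check that Python 3.8+ and Docker are available before running run.py
python3 -c 'import sys; assert sys.version_info >= (3, 8), sys.version'
echo "python3 OK: $(python3 --version)"
if command -v docker >/dev/null 2>&1; then
  echo "docker OK: $(docker --version)"
else
  echo "docker missing: install it before continuing"
fi
```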
Want to start using AI models right away on your Tenstorrent hardware? This is for you!
Quick Setup:
```bash
git clone https://github.com/tenstorrent/tt-studio.git && cd tt-studio && python3 run.py
```

What happens step by step:
- Downloads TT-Studio - Gets the code from GitHub
- Enters the directory - Changes to the tt-studio folder
- Runs the setup script - Automatically configures everything
- Initializes submodules - Downloads TT Inference Server and dependencies
- Prompts for configuration - Asks for your Hugging Face token and generates security keys
- Builds containers - Sets up Docker environments for frontend and backend
- Starts all services - Launches the web interface and backend server
After Setup:
- Go to http://localhost:3000 to use TT-Studio
- The backend runs at http://localhost:8001
- Individual AI models run on ports 7000+ (e.g., 7001, 7002, etc.)
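Once the services are up, a quick way to confirm they respond is to probe the default ports with curl (the exact HTTP codes returned depend on each service's routes):

```shell
# Probe the frontend (3000) and backend (8001)
for port in 3000 8001; do
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:$port" || true)
  echo "port $port -> HTTP $code"   # 000 means nothing is listening on that port
done
```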
To Stop TT-Studio:
```bash
python3 run.py --cleanup
```

Note: This command stops and removes all running Docker containers, including any currently deployed models. It cleans up containers and networks but preserves your data and configuration files.
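To confirm the cleanup worked, you can count any containers that are still running. This sketch assumes the containers carry a tt-studio prefix in their names:

```shell
# Count containers whose names mention tt-studio; expect 0 after --cleanup
if command -v docker >/dev/null 2>&1; then
  remaining=$(docker ps --format '{{.Names}}' | grep -ci 'tt-studio' || true)
else
  remaining="unknown (docker not installed)"
fi
echo "tt-studio containers still running: $remaining"
```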
🎯 What Can You Do Next?
Once TT-Studio is running:
- Deploy a Model - Go to the Model Deployment page and deploy a model to start using AI features
- Use AI Features:
- 💬 Chat with AI models - Upload documents and ask questions
- 🖼️ Generate images - Create art with Stable Diffusion
- 🎤 Process speech - Convert speech to text with Whisper
- 👁️ Analyze images - Detect objects with YOLO models
- 📚 RAG (Retrieval-Augmented Generation) - Query your documents with AI-powered search
- 🤖 AI Agent - Autonomous AI assistant for complex tasks
📖 Learn More: Check out our Model Interface Guide for detailed tutorials.
🆘 Need Help?
- No Tenstorrent hardware? → Remote Endpoint Setup - Connect to remote Tenstorrent cards
- Issues during setup? → Troubleshooting Guide
- Questions? → FAQ
- Remote server setup? → See Remote Access Guide below
- Technical support? → Submit issues on GitHub
Want to contribute to TT-Studio or modify it?
Development Mode Setup:
```bash
git clone https://github.com/tenstorrent/tt-studio.git
cd tt-studio
python3 run.py --dev
```

Development Features:
- Hot Reload: Code changes automatically trigger rebuilds
- Container Mounting: Local files mounted for real-time development
- Automatic Setup: All submodules and dependencies handled automatically
Get Started:
- Contributing Guide - How to contribute code
- Development Setup - Set up your dev environment
- Frontend Development - React frontend
- Backend API - Django backend
Running TT-Studio on a remote server? Use SSH port forwarding to access it from your local browser:
```bash
ssh -L 3000:localhost:3000 -L 8001:localhost:8001 -L 7001:localhost:7001 -L 7002:localhost:7002 username@your-server
```

Note: OpenSSH does not accept port ranges in `-L`, so add one `-L` flag for each model inference port (7000+) you need to reach; the example above forwards 7001 and 7002.
Then open http://localhost:3000 in your local browser.
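If you connect to the same server often, the forwards can be captured once in your SSH client config instead of retyping them. A sketch (the host alias, user, and hostname below are placeholders):

```
# ~/.ssh/config
Host tt-studio-remote
    HostName your-server
    User username
    LocalForward 3000 localhost:3000
    LocalForward 8001 localhost:8001
    # One LocalForward line per model port; repeat for each 7000+ port you use
    LocalForward 7001 localhost:7001
    LocalForward 7002 localhost:7002
```

Then `ssh tt-studio-remote` brings up all the forwards at once.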
Hardware Requirements: Tenstorrent AI accelerator hardware is automatically detected when available. You can also connect to remote endpoints if you don't have direct hardware access.
TT-Studio combines TT Inference Server and TT-Metal to provide:
- Modern Web Interface: React-based UI for easy model interaction
- Django Backend: Robust backend service for model management and deployment
- Vector Database: ChromaDB for document storage and semantic search
- Multiple AI Models: Chat, vision, speech, and image generation
- Model Isolation: Each AI model runs on separate ports (7000+) for better resource management
- Hardware Optimization: Specifically optimized for Tenstorrent devices
- Docker Containers: Isolated environments for frontend, backend, and inference services
- Language Models (LLMs): Chat, Q&A, text completion
- Computer Vision: Object detection with YOLO
- Speech Processing: Speech-to-text with Whisper
- Image Generation: Create images with Stable Diffusion
Resources:
- FAQ - Quick answers to common questions
- Troubleshooting Guide - Fix common setup issues
- Model Interface Guide - Detailed tutorials for using AI models
- Complete run.py Guide - Advanced usage and command-line options
- Having issues? Check our Troubleshooting Guide
- Want to contribute? See our Contributing Guide
- Need specific models? Follow our vLLM Models Guide
⚠️ Note: The `startup.sh` script is deprecated. Always use `python3 run.py` for setup and management.