Adobe
Bangalore, India
Stars
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Stable Diffusion web UI
A feature-rich command-line audio/video downloader
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Real-time face swap and one-click video deepfake with only a single image
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
The official gpt4free repository | various collection of powerful language models | opus 4.6 gpt 5.3 kimi 2.5 deepseek v3.2 gemini 3
Clone a voice in 5 seconds to generate arbitrary speech in real-time
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Get your documents ready for gen AI
Unified web UI for training and running open models like Qwen, DeepSeek, and Gemma locally.
No fortress, purely open ground. OpenManus is Coming.
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
🎨 Diagram as Code for prototyping cloud system architectures
Instant voice cloning by MIT and MyShell. Audio foundation model.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
Best and simplest tool for website change detection, web page monitoring, and website change alerts. Perfect for tracking content changes, price drops, restock alerts, and website defacement monito…
AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
Open-Sora: Democratizing Efficient Video Production for All
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
DeepSeek Coder: Let the Code Write Itself
Automate browser based workflows with AI
A TTS model capable of generating ultra-realistic dialogue in one pass.
A list of free LLM inference resources accessible via API.
