Stars
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Stable Diffusion web UI
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A generative speech model for daily dialogue.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Official inference framework for 1-bit LLMs
TradingAgents: Multi-Agents LLM Financial Trading Framework
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - 基于 AI 完整保留排版的 PDF 文档全文双语翻译,支持 Google/DeepL/Ollama/OpenAI 等服务,提供 CLI/GUI/MCP/Docker/Zotero
A modular graph-based Retrieval-Augmented Generation (RAG) system
A generative world for general-purpose robotics & embodied AI learning.
SGLang is a high-performance serving framework for large language models and multimodal models.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
🤖 可 DIY 的 多模态 AI 聊天机器人 | 🚀 快速接入 微信、 QQ、Telegram、等聊天平台 | 🦈支持DeepSeek、Grok、Claude、Ollama、Gemini、OpenAI | 工作流系统、网页搜索、AI画图、人设调教、虚拟女仆、语音对话 |
Tongyi Deep Research, the Leading Open-source Deep Research Agent
fay是一个帮助数字人(2.5d、3d、移动、pc、网页)或大语言模型(openai兼容、deepseek)连通业务系统的agent框架。
SQL Native Memory Layer for LLMs, AI Agents & Multi-Agent Systems
Kronos: A Foundation Model for the Language of Financial Markets
Agent S: an open agentic framework that uses computers like a human
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-re…
🤖 wukong-robot 是一个简单、灵活、优雅的中文语音对话机器人/智能音箱项目,支持ChatGPT多轮对话能力,还可能是首个支持脑机交互的开源智能音箱项目。
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation