Stars
📚 Freely available programming books
Stable Diffusion web UI
Command-line program to download videos from YouTube.com and other video sites
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
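Provides a practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for paper reading, polishing, and writing. Modular design, with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…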
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
An AI SKILL that provides design intelligence for building professional UI/UX across multiple platforms
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
No fortress, purely open ground. OpenManus is Coming.
🚀🚀 "Large Models": 🌏 Train a small 64M-parameter GPT completely from scratch in just 2h!
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
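OCR software, free, open source, and offline. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multilingual libraries.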
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
Official Code for DragGAN (SIGGRAPH 2023)
Convert PDF to markdown + JSON quickly with high accuracy
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/MCP/Docker/Zotero
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme
ChatGLM2-6B: An Open Bilingual Chat LLM
pix2code: Generating Code from a Graphical User Interface Screenshot
Implementation of Nougat Neural Optical Understanding for Academic Documents
Translate manga/image: one-click translation of text inside all kinds of images https://cotrans.touhou.ai/ (no longer working)






