Stars
📚 Freely available programming books
A feature-rich command-line audio/video downloader
Command-line program to download videos from YouTube.com and other video sites
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Robust Speech Recognition via Large-Scale Weak Supervision
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
No fortress, purely open ground. OpenManus is Coming.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
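Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments
OCR software, free, open source, and offline. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in multilingual libraries.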
A generative speech model for daily dialogue.
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
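Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing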
Instant voice cloning by MIT and MyShell. Audio foundation model.
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
We have made you a wrapper you can't refuse
Deezer source separation library including pretrained models.
🍰 Desktop utility to download images/videos/music/text from various websites, and more.
Universal LLM Deployment Engine with ML Compilation
Rembg is a tool to remove image backgrounds
⚡️HivisionIDPhotos: a lightweight and efficient AI tool and algorithm for making ID photos.
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
A Deep Learning based project for colorizing and restoring old images (and video!)
