Lists (32)
3DPose
3D object detection
ai toy
AIGame
AI image generation
GitHub proxy
GL_DX_InterOP
Live2D
Mocap
NDI
NERF
TensorRT
Text->Image
tracking
TRT Plugin
UE_Plugin
Unity
VRoid
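Face detection
Sports detection
Image change detection
Image stitching
Multi-object tracking
Slow motion
Hand detection
Data visualization
Streaming media
Depth estimation
Video matting
Video encoding and decoding
Speech
Super-resolution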
Starred repositories
Models and examples built with TensorFlow
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
The world's simplest facial recognition api for Python and the command line
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A generative speech model for daily dialogue.
Official Code for DragGAN (SIGGRAPH 2023)
Instant voice cloning by MIT and MyShell. Audio foundation model.
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
OpenMMLab Detection Toolbox and Benchmark
Deezer source separation library including pretrained models.
State-of-the-art 2D and 3D Face Analysis Project
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Rembg is a tool to remove image backgrounds
Realtime Voice Changer
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
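Llama Chinese community: a real-time roundup of the latest Llama learning materials, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use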
Image augmentation for machine learning experiments.
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Super Resolution for images using deep learning.