Starred repositories
The Best Agent Harness. Meet Sisyphus: The Batteries-Included Agent that codes like you.
"DeepTutor: AI-Powered Personalized Learning Assistant"
A set of ready-to-use scientific skills for Claude
Official code for StoryMem: Multi-shot Long Video Storytelling with Memory
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars
The official repo for "Vidi: Large Multimodal Models for Video Understanding and Editing"
A high-performance, 100% client-side tool for removing Gemini AI watermarks. Built with pure JavaScript, it leverages a mathematically precise Reverse Alpha Blending algorithm rather than unpredict…
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Official Implementation of SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using autoregressive diffusion.
HunyuanVideo-1.5: A leading lightweight video generation model
We present FlashPortrait, an end-to-end video diffusion transformer capable of synthesizing ID-preserving, infinite-length videos while achieving up to 6× acceleration in inference speed.
A refactored codebase for Gaussian Splatting. Training 3DGS in 50 seconds!
tukuaiai / vibe-coding-cn
Forked from EnzeD/vibe-coding. My development experience + prompt dictionary = Vibecoding workstation…
Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
Performs Bazel Target Diffing between two revisions in Git, allowing for Test Target Selection and Selective Building
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
🚀 AI Fully Automated Short Video Engine
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)