Starred repositories
🚀 AI Fully Automated Short Video Engine
NextFlow🚀: Unified Sequential Modeling Activates Multimodal Understanding and Generation
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
The Context Graph Factory for AI. Build, manage, and deploy AI-optimized Context Graphs.
We present FlashPortrait, an end-to-end video diffusion transformer capable of synthesizing ID-preserving, infinite-length videos while achieving up to 6× faster inference.
Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks
Official project page for "From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing" (X-Dub).
BISHENG is an open LLM DevOps platform for next-generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, unified model management, evaluation,…
HY-Motion model for 3D character animation generation.
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
Fun-ASR is an end-to-end large speech recognition model launched by Tongyi Lab.
SoulX-FlashTalk is the first 14B model to achieve sub-second start-up latency (0.87s) while maintaining a real-time throughput of 32 FPS on an 8xH800 node.
BuildingAI is an enterprise-grade open-source intelligent agent platform designed for AI developers, AI entrepreneurs, and forward-thinking organizations. Through a visual configuration interface (…
AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework
PersonaLive!: Expressive Portrait Image Animation for Live Streaming
A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech
ComfyUI node for SDPose-OOD. Implementation of SDPose-OOD in ComfyUI.
[AAAI 2026] Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback
MotionGPT3: Human Motion as a Second Modality, a MoT-based framework for unified motion understanding and generation
✨ WithAnyone is capable of generating high-quality, controllable, and ID-consistent images
Lumos-Custom Project: research on customized video generation within the Lumos Project.
Mixture-of-Groups Attention for End-to-End Long Video Generation
