- Hangzhou, China
Stars
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
FlashMLA: Efficient Multi-head Latent Attention Kernels
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
Train transformer language models with reinforcement learning.
ByteCheckpoint: A Unified Checkpointing Library for LFMs
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
verl: Volcano Engine Reinforcement Learning for LLMs
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
📚 A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉
A PyTorch native platform for training generative AI models
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
A QoS-based scheduling system that brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, etc.
An industrial deep learning framework for high-dimensional sparse data
Kubernetes-native Deep Learning Framework
DLRover: An Automatic Distributed Deep Learning System
Policy based networking for cloud native applications
flannel is a network fabric for containers, designed for Kubernetes
