Starred repositories
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
The most concise PyTorch implementation of the Transformer model, with detailed comments
A complete computer science study plan to become a software engineer.
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux, Android, iOS and Web
Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.
A simple C++11 Thread Pool implementation
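workspace is a lightweight asynchronous execution framework based on C++11. It supports general-purpose asynchronous concurrent task execution, priority-based task scheduling, an adaptive dynamic thread pool, an efficient static thread pool, exception handling, and more.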
revyos / sg2044-vendor-kernel
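Forked from sophgo/linux-riscv
Linux kernel stable tree
AIInfra (AI infrastructure) covers the AI system stack from low-level hardware such as chips up to the upper software stack that supports training and inference of large AI models.
AISystem mainly covers AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.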
YiRage (Yield Revolutionary AGile Engine) - Multi-Backend LLM Inference Optimization. Extends Mirage with comprehensive support for CUDA, MPS, CPU, Triton, NKI, cuDNN, and MKL backends.
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
nanomsg-next-generation -- light-weight brokerless messaging
ZeroMQ core engine in C++, implements ZMTP/3.1
Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
An optimized neural network operator library for chips based on Xuantie CPUs.
An annotated nano_vllm repository that also adds MiniCPM4 support and the ability to register new models
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
The simplest, fastest repository for training/finetuning medium-sized GPTs.
