Stars
Hands-On System Programming with C++, published by Packt
Patterns and resources for low-latency programming.
A curated list of resources on operating system design and implementation.
An implementation of the Raft distributed consensus protocol using the Tokio framework.
Accelerating MoE with IO and Tile-aware Optimizations
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
FlashInfer: Kernel Library for LLM Serving
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
Sandboxes for every agent. Embeddable, stateful, snapshots, and hardware isolation.
Fast and memory-efficient exact attention
A course to build the SQL layer of a distributed database.
A course to build a distributed key-value service based on the TiKV model.
High-performance wait-free memory reclamation for wait-free data structures (ASMR). Bounded memory usage, predictable latency.
Raft distributed consensus algorithm implemented in Rust.
Implementation of the Chandy–Lamport snapshot algorithm for recording a consistent global state of an asynchronous distributed system.
Chandy-Lamport distributed snapshot implementation
My UCLA ECE M116C Fall 2022 Resources
A circular buffer written in C using POSIX calls to create a contiguously mapped memory space. BSD licensed.