Organizations

@laincloud

Starred repositories

A framework for efficient model inference with omni-modality models

Python 4,270 743 Updated Apr 13, 2026

HAMi-core compiles libvgpu.so, which enforces a hard limit on GPU resources inside containers

C 293 147 Updated Apr 3, 2026

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 3,703 615 Updated Apr 13, 2026

Fast and memory-efficient exact attention

Python 23,332 2,612 Updated Apr 13, 2026

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,776 682 Updated Apr 13, 2026

A collection of modern C++ libraries, including coro_http, coro_rpc, compile-time reflection, struct_pack, struct_json, struct_xml, struct_pb, easylog, async_simple, etc.

C++ 2,117 313 Updated Apr 11, 2026

Material for gpu-mode lectures

Jupyter Notebook 5,948 600 Updated Feb 1, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 25,741 5,324 Updated Apr 13, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,381 891 Updated Apr 13, 2026

KV cache store for distributed LLM inference

C++ 405 37 Updated Nov 13, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,649 3,644 Updated Apr 13, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,546 1,015 Updated Apr 13, 2026

gopy generates a CPython extension module from a go package.

Go 2,302 132 Updated Apr 10, 2026

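AIInfra (AI infrastructure) covers the AI system stack, from low-level hardware such as chips up to the upper-layer software stack that supports training and inference of large AI models.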

Jupyter Notebook 6,705 882 Updated Dec 22, 2025

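AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.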

Jupyter Notebook 16,600 2,353 Updated Sep 3, 2025

KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale

Go 1,229 180 Updated Apr 13, 2026

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication

Go 693 144 Updated Apr 13, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,357 2,281 Updated Apr 13, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,971 286 Updated May 15, 2025

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,947 442 Updated Mar 5, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,804 1,031 Updated Mar 30, 2026

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 4,719 548 Updated Apr 13, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,082 672 Updated Apr 13, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,427 15,531 Updated Apr 13, 2026

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 3,747 312 Updated May 21, 2025

Serving multiple LoRA fine-tuned LLMs as one

Python 1,152 62 Updated May 8, 2024

Development repository for the Triton language and compiler

MLIR 18,932 2,760 Updated Apr 13, 2026

Kubernetes WithOut Kubelet - Simulates thousands of Nodes and Clusters.

Smarty 3,081 243 Updated Apr 13, 2026

DLRover: An Automatic Distributed Deep Learning System

Python 1,644 211 Updated Apr 2, 2026