monkey2000
🎯
Focusing

Starred repositories

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 2,633 251 Updated Dec 29, 2025

An unofficial CUDA assembler, for all generations of SASS, hopefully :)

Python 564 96 Updated Apr 20, 2023

SystemVerilog 165 28 Updated Nov 19, 2024

High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU [to appear in SIGMOD'26]

Cuda 21 4 Updated Sep 26, 2025

Simulator code of the paper "Dissecting and Modeling the Architecture of Modern GPU Cores"

HTML 53 7 Updated Oct 15, 2025

Unofficial description of the CUDA assembly (SASS) instruction sets.

Python 190 19 Updated Jul 18, 2025

Smart pointers for the (GNU) C programming language

CMake 1,708 146 Updated Nov 2, 2022

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,972 298 Updated Dec 22, 2025

A Coverage Explorer for Reverse Engineers

Python 2,477 329 Updated Jul 18, 2024

Implementation of "Beyond Classification: Inferring Function Names in Stripped Binaries via Domain Adapted LLMs" (NDSS'25)

Python 43 3 Updated Jun 5, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 9,105 898 Updated Dec 24, 2025

Playwright MCP server

TypeScript 24,893 2,029 Updated Dec 27, 2025

Verilog library for ASIC and FPGA designers

Verilog 1,379 299 Updated May 8, 2024

Tinymist [ˈtaɪni mɪst] is an integrated language service for Typst [taɪpst].

Rust 2,658 116 Updated Dec 29, 2025

💫 Toolkit to help you get started with Spec-Driven Development

Python 58,460 5,105 Updated Dec 4, 2025

NixOS module for NVIDIA vGPU

Nix 23 7 Updated Jul 19, 2025

A toy implementation of Qwen3 inference

Python 3 Updated Jul 13, 2025

An extremely fast Python type checker and language server, written in Rust.

Python 15,890 172 Updated Dec 29, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,151 107 Updated Dec 28, 2025

This patch removes the restriction on the maximum number of simultaneous NVENC video encoding sessions imposed by Nvidia on consumer-grade GPUs.

Python 4,438 350 Updated Dec 25, 2025

Deploy headless browsers in Docker. Run on our cloud or bring your own. Free for non-commercial uses.

TypeScript 12,098 934 Updated Dec 29, 2025

Hook library that enables screen sharing with Tencent Wemeet on Linux Wayland, without the need for virtual cameras.

C++ 500 17 Updated Sep 12, 2025

A fast type checker and language server for Python

Rust 5,121 233 Updated Dec 29, 2025

NixOS MicroVMs

Nix 2,059 169 Updated Dec 25, 2025

GDB-compatible RISC-V Debugger for CH32V003 that runs on a Raspberry Pi Pico

C 238 22 Updated Feb 1, 2025

Open Source Inventory Management System

Python 6,152 1,190 Updated Dec 26, 2025

A fault-injection framework using Chisel and FIRRTL

Scala 36 14 Updated Sep 17, 2025

FPGA implementation of a CDR targeting a Xilinx Kintex-7 for data rates up to 250 MHz

VHDL 18 7 Updated Nov 15, 2021

Test of the USB3 IP Core from Daisho on a Xilinx device

Verilog 100 32 Updated Oct 3, 2019