Enigmatisms

Starred repositories

qqr is an RL training framework for open-ended agents.

Python 167 14 Updated Jan 16, 2026

Guide to prepare for HFT interviews (SWEs)

202 25 Updated Dec 19, 2025

A tool for bandwidth measurements on NVIDIA GPUs.

C++ 607 68 Updated Apr 15, 2025

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 2,995 193 Updated Jan 14, 2026

Machine Learning Engineering Open Book

Python 16,404 1,014 Updated Jan 11, 2026

High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI

C++ 192 21 Updated Jan 5, 2026

Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

Python 1,608 191 Updated Aug 12, 2020

CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning

Cuda 342 20 Updated Jan 8, 2026

Open-source release accompanying Gao et al. 2025

Python 494 52 Updated Dec 11, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,523 3,064 Updated Jan 20, 2026

Accelerating MoE with IO and Tile-aware Optimizations

Python 550 44 Updated Jan 19, 2026

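Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, thought, mathematics, and biographies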

11,243 3,300 Updated Jun 20, 2025

vLLM Kunlun (vllm-kunlun) is a community-maintained hardware plugin designed to seamlessly run vLLM on the Kunlun XPU.

Python 230 39 Updated Jan 20, 2026

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 534 26 Updated Dec 23, 2025

C++ Python bytecode disassembler and decompiler

C++ 4,244 800 Updated Aug 30, 2025

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,837 99 Updated Jan 20, 2026

Helpful kernel tutorials and examples for tile-based GPU programming

Python 573 33 Updated Jan 20, 2026

NVIDIA cuTile learn

Python 150 1 Updated Dec 9, 2025

An experimental implementation of compiler-driven automatic sharding of models across a given device mesh.

Python 50 13 Updated Jan 19, 2026

An early research stage expert-parallel load balancer for MoE models based on linear programming.

Python 490 32 Updated Nov 19, 2025

Open MPI main development repository

C 2,511 941 Updated Jan 16, 2026

A cinematic Git commit replay tool for the terminal, turning your Git history into a living, animated story.

Rust 3,970 91 Updated Jan 19, 2026

Enjoy the magic of Diffusion models!

Python 11,507 1,100 Updated Jan 20, 2026

Large-Area Fabrication-Aware Computational Diffractive Optics (SIGGRAPH Asia & TOG 2025)

Python 17 3 Updated Nov 20, 2025

The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…

Python 2,506 253 Updated Dec 19, 2025

Unifying 3D Mesh Generation with Language Models

Python 1,134 75 Updated Mar 28, 2025

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,326 182 Updated Dec 17, 2025

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 619 36 Updated Jan 20, 2026

Core Functional Library for Distributed Training

Python 10 48 Updated Jan 20, 2026