
[ASPLOS 2026] CLM: Removing the GPU Memory Barrier for 3D Gaussian Splatting with CPU Offloading

Python 147 8 Updated Dec 8, 2025

Quantized LLM training in pure CUDA/C++.

C++ 230 14 Updated Jan 6, 2026

Accelerating MoE with IO and Tile-aware Optimizations

Python 511 38 Updated Jan 5, 2026

DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images

Python 364 32 Updated Dec 11, 2025

A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node

C 61 5 Updated Dec 19, 2025

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 512 24 Updated Dec 23, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,217 84 Updated Aug 28, 2025

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Python 801 51 Updated Oct 15, 2025

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 382 16 Updated Jun 13, 2025

DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Python 182 15 Updated Dec 29, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 822 195 Updated Jan 6, 2026

Collection of kernels written in Triton language

174 9 Updated Apr 5, 2025

A lightweight profiler for NCCL

C 5 2 Updated May 21, 2025

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 5,701 312 Updated Jan 5, 2026

Learning to Drive via Real-World Simulation at Scale

112 4 Updated Dec 31, 2025

A NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.

C++ 77 7 Updated Dec 17, 2025

A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your ML and analytics workloads.

Python 263 44 Updated Dec 23, 2025

Accelerated Computer Vision Lab (ACCV-Lab) is a systematic collection of packages with the common goal to facilitate end-to-end efficient training in the ADAS domain, each package offering tools & …

Python 41 8 Updated Dec 16, 2025

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 243 40 Updated Dec 23, 2025

SAM 3D Objects

Python 5,301 525 Updated Dec 31, 2025

Python 172 10 Updated Nov 26, 2025

Discover Unknown Unsafe Events via Generative Simulation

Python 186 20 Updated Dec 26, 2025

High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale

Rust 5,091 377 Updated Jan 6, 2026

DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving

Python 424 36 Updated Sep 17, 2025

Depth Anything 3

Python 3,842 335 Updated Dec 12, 2025

[ECCV 2024] DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

Python 523 25 Updated Nov 29, 2024

[CVPR 2025] OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities

Python 31 Updated Jun 6, 2025