  • San Francisco Bay Area

🎯 An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality degradation across Weight-Only Quantization, MXFP4, NVFP4, GGUF, and adaptive schemes.

Python 849 78 Updated Feb 25, 2026

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda 726 156 Updated Feb 18, 2026

SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs

C++ 67 81 Updated Feb 25, 2026

Puzzles for learning Triton

Jupyter Notebook 2,313 199 Updated Nov 18, 2024

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 767 109 Updated Feb 25, 2026

[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection

Python 155 14 Updated Feb 20, 2025

Curated collection of papers in MoE model inference

343 12 Updated Oct 20, 2025

Perplexity GPU Kernels

C++ 564 75 Updated Nov 7, 2025

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Python 595 32 Updated Aug 12, 2025

Reproducing R1 for Code with Reliable Rewards

Python 289 17 Updated May 5, 2025

Tile primitives for speedy kernels

Cuda 3,185 244 Updated Feb 24, 2026

s1: Simple test-time scaling

Python 6,634 765 Updated Jun 25, 2025
Python 2 Updated Apr 5, 2024

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 32,638 3,930 Updated Feb 18, 2026

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,752 269 Updated Jul 18, 2025

oneAPI - Data Parallel C++ course for students

C++ 44 12 Updated Nov 4, 2024

Fast Matrix Multiplications for Lookup Table-Quantized LLMs

C++ 388 18 Updated Apr 13, 2025

A batched offline inference oriented version of segment-anything

Python 1,322 81 Updated Aug 22, 2025

Applied AI experiments and examples for PyTorch

Python 319 30 Updated Aug 22, 2025

Helpful tools and examples for working with flex-attention

Python 1,136 71 Updated Feb 8, 2026

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,625 247 Updated Sep 10, 2025

PyTorch media decoding and encoding

Python 968 97 Updated Feb 25, 2026

A PyTorch native platform for training generative AI models

Python 5,086 718 Updated Feb 25, 2026

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Python 23,759 9,794 Updated Sep 1, 2025

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python 1,698 163 Updated Feb 23, 2026

Train transformer language models with reinforcement learning.

Python 17,452 2,509 Updated Feb 25, 2026

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 13,183 1,404 Updated Feb 22, 2026

Go ahead and axolotl questions

Python 11,332 1,257 Updated Feb 25, 2026

PyTorch native post-training library

Python 5,690 706 Updated Feb 25, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 85,979 13,032 Updated Feb 19, 2026