Stars
Fast, Sharp & Reliable Agentic Intelligence
Breakthrough Method for Agile AI-Driven Development
Open-source framework for the research and development of foundation models.
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
Official JAX implementation of End-to-End Test-Time Training for Long Context
Accelerating MoE with IO and Tile-aware Optimizations
Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models
Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
Efficient End2End Compiler for Mixed-Precision Deep Learning
An intuitive and low-overhead instrumentation tool for Python
Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …
Official Repo for Open-Reasoner-Zero
Small tool to disable macOS 15's annoying new screencapture nag popups
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
Bucketed top-k for PyTorch using a priority queue


