Stars
Tooling for optimized, validated, and reproducible GPU-accelerated AI runtime in Kubernetes
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
A bridge to use LangChain output as an OpenAI-compatible API
The interaction control harness for customer-facing AI agents - optimized for building controlled, consistent, and predictable customer interactions with LLMs.
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
verl: Volcano Engine Reinforcement Learning for LLMs
NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
A Datacenter Scale Distributed Inference Serving Framework
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
macOS packaging for ungoogled-chromium
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
Community-maintained Kubernetes config and Helm chart for Langfuse
Disaggregated serving system for Large Language Models (LLMs).
Open source Loom alternative. Beautiful, shareable screen recordings.




