Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Python SDK and Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
SGLang is a high-performance serving framework for large language models and multimodal models.
A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
verl: Volcano Engine Reinforcement Learning for LLMs
The interaction control harness for customer-facing AI agents - optimized for building controlled, consistent, and predictable customer interactions with LLMs.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Running large language models on a single GPU for throughput-oriented scenarios.
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
3D ResNets for Action Recognition (CVPR 2018)
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
Video classification tools using 3D ResNet




