Model Optimizations
Supervised Fine-tuning (SFT)
Supervised Fine-Tuning (SFT) trains a language model on curated examples of good behavior, resulting in a custom model that performs better on your specific use case. TensorZero has fine-tuning integrations with OpenAI, GCP Vertex AI, Fireworks AI, and Together AI. See the Supervised Fine-Tuning Guide to learn more. Additionally, we provide recipes for self-hosted fine-tuning withaxolotl, torchtune, and unsloth.
Reinforcement Fine-tuning (RFT)
Guide coming soon…Direct Preference Optimization (DPO)
A direct preference optimization (DPO) — also known as preference fine-tuning — recipe fine-tunes an LLM on a dataset of preference pairs. You can use demonstration feedback collected with TensorZero to curate a dataset of preference pairs and fine-tune an LLM on it. We provide a recipe for DPO (Preference Fine-tuning) with OpenAI.Prompt Optimization
GEPA
GEPA is an automated prompt optimization method that evolves prompts through iterative evaluation, analysis, and mutation. It uses LLMs to analyze inference results and propose prompt improvements, then filters variants using Pareto frontier selection to balance multiple objectives. See the GEPA Guide to learn more.MIPRO
MIPRO (Multi-prompt Instruction PRoposal Optimizer) is a method for automatically improving system instructions and few-shot demonstrations in LLM applications — including ones with multiple LLM functions or calls.Inference-Time Optimization
Best-of-N & Mixture-of-N Sampling
Best-of-N Sampling generates multiple candidate responses from a single model and selects the best one based on a scoring function or verifier. Mixture-of-N Sampling extends this by generating candidates across multiple models or variants, combining diversity with quality selection. Both techniques improve output quality at inference time without requiring additional training or fine-tuning. See Inference-Time Optimizations to learn more.Dynamic In-Context Learning

Custom Recipes
You can also create your own recipes. Put simply, a recipe takes inference and feedback data collected by TensorZero and generates a new set of variants for your functions. You should should be able to use virtually any LLM engineering workflow with TensorZero, ranging from automated prompt engineering to advanced RLHF workflows. For example, see our recipes for self-hosted supervised fine-tuning (SFT) withaxolotl, torchtune, and unsloth.
Examples
We are working on a series of complete runnable examples illustrating TensorZero’s data & learning flywheel.- Optimizing Data Extraction (NER) with TensorZero — This example shows how to use TensorZero to optimize a data extraction pipeline. We demonstrate techniques like fine-tuning and dynamic in-context learning (DICL). In the end, an optimized GPT-4o Mini model outperforms GPT-4o on this task — at a fraction of the cost and latency — using a small amount of training data.
- Agentic RAG — Multi-Hop Question Answering with LLMs — This example shows how to build a multi-hop retrieval agent using TensorZero. The agent iteratively searches Wikipedia to gather information, and decides when it has enough context to answer a complex question.
- Writing Haikus to Satisfy a Judge with Hidden Preferences — This example fine-tunes GPT-4o Mini to generate haikus tailored to a specific taste. You’ll see TensorZero’s “data flywheel in a box” in action: better variants leads to better data, and better data leads to better variants. You’ll see progress by fine-tuning the LLM multiple times.
- Image Data Extraction — Multimodal (Vision) Fine-tuning — This example shows how to fine-tune multimodal models (VLMs) like GPT-4o to improve their performance on vision-language tasks. Specifically, we’ll build a system that categorizes document images (screenshots of computer science research papers).
- Improving LLM Chess Ability with Best/Mixture-of-N Sampling — This example showcases how best-of-N sampling and mixture-of-N sampling can significantly enhance an LLM’s chess-playing abilities by selecting the most promising moves from multiple generated options.