CVlization: Curated AI Training and Inference Recipes That Just Work

A curated collection of 195+ state-of-the-art open source AI capabilities, packaged in self-contained Docker environments. Find a recipe, test it, copy what you need.

CVlization stands on the shoulders of giants - we package and test amazing open source projects so you can use them with confidence.

Quick Start

git clone --depth 1 https://github.com/kungfuai/CVlization
cd CVlization

# Install CLI (optional - or just use bash scripts)
pip install .

# Optional: install with remote execution support
pip install .[remote]    # SSH runner
pip install .[aws]       # SageMaker runner
pip install .[deploy]    # Serverless deployment (Cerebrium)

# Browse examples
cvl list                  # compact grouped overview
cvl list -k gpt           # search by keyword
cvl list --format list    # detailed view
# or browse examples/ on GitHub

# Run any example
cvl run nanogpt build
cvl run nanogpt train

# Copy into your project (bundled with cvlization)
cvl export perception/image_classification/torch -o your-project/

That's it! Each example is self-contained with its own Dockerfile and dependencies. (We battled CUDA versions and dependency conflicts so you don't have to.)

Examples

Our examples/ directory is organized by capability (what the model does) rather than modality (what data it processes).

Directory Structure

examples/
  analytical/          # Prediction & forecasting (time series, tabular ML)
  perception/          # Understand signals (vision, speech, multimodal)
  generative/          # Create content (text, images, video, audio, avatars)
  physical/            # Robotics & embodied AI (vision-language-action models)
  agentic/             # AI agents (RAG, tool use, optimization, workflows)

Catalog of Examples

Legend: ✅ = Tested and maintained | 🧪 = Experimental

🔍 Perception (Understanding Signals)

Capability	Example Directory	Implementations	Status
Image Classification	`examples/perception/image_classification`	torch, cifar10-speedrun	✅
Object Detection	`examples/perception/object_detection`	mmdet, torchvision, rt-detr, yolov13	✅
Segmentation	`examples/perception/segmentation`	instance (mmdet, torchvision), semantic (mmseg, torchvision), panoptic (detectron2, mmdet, torchvision), sam, sam_lora_finetuning, sam3, sam3_finetuning, flowrvs	✅
Pose Estimation	`examples/perception/pose_estimation`	dwpose, mmpose	✅
Tracking	`examples/perception/tracking`	global_tracking_transformer, soccer_visual_tracking	✅
Line Detection	`examples/perception/line_detection`	torch	✅
Document AI	`examples/perception/doc_ai`	OCR (chandra_ocr, deepseek_ocr, docling, doctr, dots_ocr, nanonets_ocr, olmocr_2, paddleocr_vl, surya), VLMs (donut_doc_classification, donut_doc_parse, granite_docling, granite_docling_finetune), Layout (doclayout_yolo), Parsing (dolphin_v2, churro_3b, extract0, nvidia_nemotron_parse), Leaderboard (leaderboard)	✅
Vision-Language Models	`examples/perception/vision_language`	florence_2 (+ finetune), gemma3_vision (+ grpo, sft), internvl3, joycaption_llava, kosmos2_grounded_ocr, lighton_ocr, llama3_vision, llava_next_video, minicpm_v_2_6, molmoe_1b, moondream2 (+ finetune), moondream3, owl_vit, paligemma2 (detection, segmentation), phi_3_5_vision_instruct, phi_4_multimodal_instruct, pixtral_12b, qwen3_vl	✅
3D Reconstruction	`examples/perception/3d_reconstruction`	dust3r, mast3r, monst3r, hunyuanworld_mirror, map_anything, nerf_tf (experimental)	✅

✨ Generative (Creating Content)

Capability	Example Directory	Implementations	Status
LLMs (text generation)	`examples/generative/llm`	Pretraining (nanogpt, modded_nanogpt, nanomamba), Full pipeline (nanochat: pretrain, sft, rl), Fine-tuning (peft_mistral7b_sft, trl_sft, miles_grpo, unsloth: gpt_oss_grpo, gpt_oss_sft, llama_3b_sft, qwen_7b_sft, gemma3_4b_sft), Inference (mixtral8x7b, sglang, vllm, dllm, nanbeige4_3b_thinking, nomos_1, rnj_1_instruct), Diffusion LLM (semicat), Interpretability (gemma_scope_2_270m_it)	✅
Image Generation	`examples/generative/image_generation`	cfm, ddpm, diffuser_unconditional, dit, dreambooth, edm2, flux, next_scene_qwen, qwen_image_layered, rae, repa, stable_diffusion, uva_energy (experimental), vqgan	✅
Video Generation	`examples/generative/video_generation`	animate_diff, animate_diff_cog, animate_x, cogvideox, ctrl_world, deforum, dove, flashvsr, framepack, hunyuan_video_1_5, kandinsky_5, krea_realtime_scope (experimental), longcat_video, ltx2, mimic_motion, minisora, phantom (experimental), propainter, real_video, reward_forcing, skyreals, svd_cog, svd_comfy, turbodiffusion, vace, vace_comfy (experimental), video2x, video_enhancement, video_in_between, wan2gp, wan2gp_wan, wan_animate, wan_comfy, worldcanvas	✅
Text-to-Speech (TTS)	`examples/generative/audio`	cosyvoice3, vibevoice_realtime, voxcpm	✅
Avatar & Talking Head	`examples/generative/video_generation/avatar`	anytalker, egstalker, fastavatar, flashportrait, hunyuanvideo_avatar, imtalker, lite_avatar, live_avatar, livetalk, longcat_video_avatar, personalive, wan_s2v	✅

📊 Analytical (Prediction & Forecasting)

Capability	Example Directory	Implementations	Status
Time Series Forecasting	`examples/analytical/time_series`	Foundation models (chronos_zero_shot, moirai_zero_shot, uni2ts_finetune (experimental)), Supervised (patchtst_supervised), Statistical baselines (statsforecast_baselines), Hierarchical (hierarchical_reconciliation), Anomaly detection (anomaly_transformer, merlion_anomaly_dashboard)	✅
Tabular ML - AutoML	`examples/analytical/tabular/automl`	autogluon_structured, pycaret_structured	✅
Tabular ML - Causal Inference	`examples/analytical/tabular/causal`	causalml_campaign_optimization, dowhy_berkeley_bias, dowhy_policy_uplift, econml_heterogeneous_effects	✅
Tabular ML - Uncertainty Quantification	`examples/analytical/tabular/uncertainty`	Conformal (conformal_lightgbm, mapie_conformal), Quantile (catboost_quantile, quantile_lightgbm), Bayesian (pymc_bayesian_regression)	✅
Tabular ML - Business Use Cases	`examples/analytical/tabular`	Customer analytics (gbt_telco_churn), Marketing (gbt_upsell_propensity), Risk scoring (gbt_credit_default), Regression (gbt_housing_prices), Recommendation (ranking_lightgbm), Survival (pycox_retention), Anomaly detection (pyod_fraud_detection), Feature engineering (autofe_structured)	✅

🦾 Physical (Robotics & Embodied AI) - Experimental

Capability	Example Directory	Implementations	Status
Vision-Language-Action Models	`examples/physical`	openvla_single_step, openvla_simplerenv	🧪
World Models	`examples/physical`	stable_worldmodel_dmc	🧪

🤖 Agentic (AI Agents & Workflows) - Experimental

Capability	Example Directory	Implementations	Status
RAG & Knowledge	`examples/agentic/rag`	langgraph_helpdesk, clara	🧪
LlamaIndex Agents	`examples/agentic/llamaindex`	graph_rag_cognee, jsonalyze_structured_qa, react_finance_query_agent	🧪
Prompt Optimization	`examples/agentic/optimization`	dspy_gepa_promptops, mcts_prompt_agent	🧪
Tool Use & Coding	`examples/agentic`	Code (autogen_pair_programmer), Data analysis (smolagents_data_analyst), Data preparation (physio_signal_prep), Local AI (llamacpp_assistant)	🧪
Formal Reasoning	`examples/agentic/formal`	nanoproof (Lean theorem proving), qed-nano (Olympiad level math proof generation)	🧪
Long Context (RLM)	`examples/agentic/long_context`	rlm_needle (needle-in-haystack), rlm_doc_qa (document QA), rlm_claude_code (Claude Code native)	🧪

Note: These examples are regularly updated and tested to ensure compatibility with the latest dependencies. We recommend starting with the nanogpt example.

Browse on GitHub: perception/ • generative/ • analytical/ • physical/ • agentic/

Browse via CLI: cvl list for a compact overview, or cvl list -k <keyword> to search

Running an Example

You can run examples using the cvl CLI or directly with bash scripts.

Option 1: Using cvl CLI (recommended)

cvl run nanogpt build
cvl run nanogpt train

Option 2: Using bash scripts directly

cd examples/generative/llm/nanogpt
bash build.sh
bash train.sh

More examples:

# RL post-training with GRPO (cutting-edge reasoning model training)
cvl run miles_grpo train

# Generate video from text prompt (Tencent HunyuanVideo)
cvl run hunyuan-video-1-5 predict --prompt "A cat playing piano"

# Document AI extraction (IBM Granite-Docling)
cvl run granite-docling predict -i input_pdf=@document.pdf

# RAG agent with LangGraph
cvl run agentic-rag-langgraph-helpdesk predict --question "How do I list examples?"

For detailed instructions and available options, see the README.md in each example directory.

License Note: Each example may reference projects with different licenses. Check the license file in each example directory.

Remote Execution (Experimental)

Run examples on cloud infrastructure instead of locally:

# AWS SageMaker (managed training)
cvl run nanogpt train --runner sagemaker --spot --output-path s3://bucket/outputs

# SkyPilot (any cloud: AWS, GCP, Azure, Lambda Labs)
cvl run nanogpt train --runner skypilot --gpu A100:1 --cloud aws

# SSH (existing GPU server)
cvl run nanogpt train --runner ssh user@gpu-host

Manage remote jobs:

cvl jobs list --runner sagemaker
cvl jobs logs <job-id>
cvl jobs kill <job-id>

See cvl/runners/README.md for setup instructions.

Serverless Deployment (Experimental)

Deploy inference endpoints to serverless GPU platforms:

# Deploy to Cerebrium
cvl deploy ltx2 --gpu L40

# Manage deployed services
cvl services list
cvl services status ltx2
cvl services logs ltx2
cvl services delete ltx2

See cvl/deployers/README.md for setup and supported platforms.

Benefits of Using CVlization

Centralized Caching: All examples use ~/.cache/cvlization/ for models and datasets, avoiding re-downloads across examples:

Automatic caching for HuggingFace Hub, PyTorch, and custom downloads
Managed by build scripts - no manual setup required
Shared across all examples to save disk space and bandwidth

Self-Contained Docker Environments: Each example is isolated with pinned dependencies:

CUDA and dependency conflicts already resolved - saves hours or days of setup
Source code is mounted at runtime (not baked into images) for easy iteration
Dependency versions are pinned where possible for reproducibility

Production-Ready Patterns: Copy what works into your projects:

Consistent build/train/predict script structure across all examples
Battle-tested configurations for 190+ AI capabilities
Examples regularly updated and tested for compatibility

Requirements

Docker (Install Docker)
NVIDIA GPU (most examples require 16GB+ VRAM; A10 or better recommended)

nvidia-docker for GPU access

# Ubuntu
sudo apt-get install -y nvidia-container-toolkit

Use CVlization on Colab

No installation needed - run examples directly in Google Colab: Colab notebook: CIFAR-10 classification

For Contributors

CVlization includes Claude Code skills for AI-assisted development and automated verification:

verify-training-pipeline - Validates training examples are properly structured, build successfully, train without errors, and log appropriate metrics
verify-inference-example - Validates inference examples build correctly and run inference successfully

These skills enable Claude to automatically verify examples end-to-end, helping maintain code quality across the repository.

Contributing Guidelines:

See CONTRIBUTING.md for standardization patterns and best practices
All examples should follow the build/train/predict script pattern
Use verification metadata in example.yaml to track testing status

Project Structure

examples/: Dockerized AI examples (perception, generative, analytical, physical, agentic)
cvl/: CLI tool source code
cvlization/: Optional reusable library components
doc/: Project documentation, including doc/runners/ for cloud setup guides
tests/: Unit and integration tests

CVlization Library

The cvlization/ directory provides optional reusable components:

Available:

Training pipeline abstractions (image classification, object detection, LLMs, diffusion)
Dataset builders with caching (PyTorch, TensorFlow, HuggingFace)
Model factories (Torchvision, MMDetection, MMSegmentation)
Utilities (metrics, logging, download helpers)

Installation: pip install -e .

Note: Examples are self-contained and don't require the library. For production use, copying example code directly is often simpler than depending on the library package.

Documentation

Detailed documentation can be found in the doc/ directory:

Licenses

CVlization Library & CLI: MIT License

The cvlization package and cvl CLI tool are released under the MIT License
Safe for commercial use

Examples Directory: Mixed Licenses

Examples may reference projects with various licenses (copyleft, non-commercial, etc.)
Examples are NOT included when you pip install cvlization
Always check the license file in each example directory before using in production

Note: Each example packages different open-source projects with their own licenses. Review licenses carefully for your use case.

Name		Name	Last commit message	Last commit date
Latest commit History 1,650 Commits
.claude/skills		.claude/skills
.github/workflows		.github/workflows
archived_examples		archived_examples
benchmarks		benchmarks
bin		bin
cvl		cvl
cvlization		cvlization
datasets/omr		datasets/omr
doc		doc
docs		docs
examples		examples
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
.pylintrc		.pylintrc
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Notice.md		Notice.md
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CVlization: Curated AI Training and Inference Recipes That Just Work

Quick Start

Table of Contents

Examples

Directory Structure

Catalog of Examples

🔍 Perception (Understanding Signals)

✨ Generative (Creating Content)

📊 Analytical (Prediction & Forecasting)

🦾 Physical (Robotics & Embodied AI) - Experimental

🤖 Agentic (AI Agents & Workflows) - Experimental

Running an Example

Remote Execution (Experimental)

Serverless Deployment (Experimental)

Benefits of Using CVlization

Requirements

Use CVlization on Colab

For Contributors

Project Structure

CVlization Library

Documentation

Licenses

About

Uh oh!

Releases 24

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

CVlization: Curated AI Training and Inference Recipes That Just Work

Quick Start

Table of Contents

Examples

Directory Structure

Catalog of Examples

🔍 Perception (Understanding Signals)

✨ Generative (Creating Content)

📊 Analytical (Prediction & Forecasting)

🦾 Physical (Robotics & Embodied AI) - Experimental

🤖 Agentic (AI Agents & Workflows) - Experimental

Running an Example

Remote Execution (Experimental)

Serverless Deployment (Experimental)

Benefits of Using CVlization

Requirements

Use CVlization on Colab

For Contributors

Project Structure

CVlization Library

Documentation

Licenses

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 24

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages