Hi — I'm a builder of production ML systems & LLM tooling 👋
I design and ship full-stack AI/ML products: fine-tuning and deploying LLMs, agent workflows, scalable data pipelines, and production inference stacks (Docker / Kubernetes / on-prem GPU). I like projects that connect research-y models to real-world systems and measurable outcomes.
🔭 Current focus
Post-training / PEFT (QLoRA, LoRA) for LLMs and agentic systems
Fast, reproducible model deployment (vLLM, Ollama, LangGraph, containerized stacks)
Production data pipelines and feature engineering for high-throughput environments
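The core idea behind the LoRA/QLoRA work above can be sketched in a few lines: freeze the pretrained weight matrix and train only a low-rank update. This is a minimal numpy illustration of the math (shapes, names, and the init scheme here are illustrative assumptions, not code from any of the repos below):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 8, 16
W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, rank r
B = np.zeros((d_out, r))                   # trainable, zero-initialized

def lora_forward(x):
    # Base path plus low-rank adapter path, scaled by alpha / r.
    # With B = 0 the adapter starts as a no-op, so the adapted model
    # initially matches the frozen model exactly.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at init

print(f"trainable params: {A.size + B.size} vs full {W.size}")
```

The payoff is the parameter count: here the adapter trains 2,048 values instead of 4,096, and the gap widens dramatically at real model sizes.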
🛠 Tech highlights
Python · PyTorch · TensorFlow · PEFT/QLoRA · vLLM · LangGraph · Docker · Kubernetes / OpenShift · Go · Angular · Spark · Polars · Snowflake · REST / gRPC · CV (real-time inference)
⭐ Selected projects

MiniMind — Chinese → English Translation Agent
A LangGraph + vLLM translation agent that performs multi-pass translation, quality validation, and automatic fixes — built to produce a high-quality zh→en dataset and a production translation pipeline.
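The translate → validate → fix loop described above can be sketched as a plain-Python control flow. Everything here is a stub to show the shape of the loop — the function names and return types are assumptions, not MiniMind's actual API (which is built on LangGraph + vLLM):

```python
def translate(text: str) -> str:
    # Stand-in for an LLM call; real code would hit a vLLM endpoint.
    return f"<en>{text}</en>"

def validate(src: str, draft: str) -> list[str]:
    # Stand-in for quality checks (e.g. length ratio, terminology).
    return [] if draft.startswith("<en>") else ["missing translation"]

def fix(src: str, draft: str, issues: list[str]) -> str:
    # Stand-in for a repair pass that feeds issues back to the model.
    return translate(src)

def translate_with_validation(src: str, max_passes: int = 3) -> str:
    draft = translate(src)
    for _ in range(max_passes):
        issues = validate(src, draft)
        if not issues:
            break
        draft = fix(src, draft, issues)
    return draft

print(translate_with_validation("你好，世界"))
```

The bounded `max_passes` loop is the important design choice: validation failures trigger targeted repairs instead of blind re-translation, and the cap keeps worst-case latency predictable.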
PEFT Fine-Tuning Recipes — Classification
Hands-on PEFT recipes and scripts (QLoRA / LoRA) for classification tasks: reproducible examples, training configs and evaluation scripts.
Stable Diffusion WebUI — Docker
One-click Docker setup for Stable Diffusion WebUI, with GPU support and common model installers — intended for local image generation and experimentation.
SMB File Server
Lightweight SMB file server templates and deployment scripts for self-hosted file sharing.
Attendance System
A simple attendance / check-in system (web + API) built for easy deployment and integration with internal tools.
TabularML
Experiments and templates for tabular machine learning workflows: feature engineering, model baselines, and model evaluation scripts.
People-Counting in Real-Time
Real-time computer-vision pipeline for estimating crowd/people counts from video streams with lightweight inference optimizations.
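Two common lightweight optimizations of the kind hinted at above are frame skipping (run the expensive detector only every Nth frame) and exponential smoothing of the per-frame count. This sketch uses a stubbed detector and assumed parameter names, not the repo's actual pipeline:

```python
def detect_count(frame) -> int:
    # Stand-in for a CV model's per-frame person count.
    return frame["people"]

def smoothed_counts(frames, skip=3, alpha=0.5):
    """Run detection every `skip` frames; smooth with an EMA."""
    ema, last, out = None, 0, []
    for i, frame in enumerate(frames):
        if i % skip == 0:  # inference only on every `skip`-th frame
            last = detect_count(frame)
        # EMA damps single-frame detector noise between updates.
        ema = last if ema is None else alpha * last + (1 - alpha) * ema
        out.append(ema)
    return out

frames = [{"people": n} for n in (4, 4, 5, 9, 9, 8)]
print(smoothed_counts(frames))
```

Skipping frames trades a little responsiveness for a large cut in inference cost, and the EMA keeps the reported count from jumping on detector glitches.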
📂 How I organize repos
Reproducible examples (Docker + scripts) so others can run locally or on a GPU node
Clear README with quickstart + minimal config for common dev environments (macOS / Linux)
Small, focused notebooks and tests for core functionality