Organizations

@CNTT-UTH

ductaip/README.md

Hi there, I'm Phan Duc Tai 👋

AI Engineer | MLOps Specialist | Kaggle Competitor

Email LinkedIn Kaggle


🚀 About Me

I am an award-winning AI researcher and software engineer specializing in high-performance inference, large language models (LLMs), and agentic RAG architectures. With a strong foundation in algorithmic problem-solving, I bridge the gap between state-of-the-art machine learning research and scalable, production-ready backend systems.

  • 🔭 Currently working on:
    • AIMO 3 (Kaggle): Engineering SC-TIR inference pipelines on H100 GPUs for IMO-level mathematical reasoning.
    • NanoReason: Distilling Verifiable Chain-of-Thought (CoT) into Small Language Models via Step-Aware LoRA.
  • 🌱 Deep diving into: Speculative Decoding, Contextual Retrieval, and Kubernetes-based MLOps.
  • 🏆 Achievements: ICPC National Honorable Mention, OLP Consolation Prize (IT Majors), IBM AI Engineering Professional.
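
The self-consistency half of SC-TIR can be illustrated with a small voting routine. This is a minimal sketch, not the actual AIMO 3 pipeline: `self_consistent_answer` and `sample_fn` are hypothetical names, and `sample_fn` stands in for one full tool-integrated generation (model call plus code execution) that returns an integer answer or `None` on failure.

```python
from collections import Counter
from typing import Callable, Optional


def self_consistent_answer(
    sample_fn: Callable[[str], Optional[int]],
    problem: str,
    n_samples: int = 8,
) -> Optional[int]:
    """Sample several candidate answers and return the most common one."""
    votes: Counter = Counter()
    for _ in range(n_samples):
        answer = sample_fn(problem)  # hypothetical: one reasoning + tool run
        if answer is not None:       # discard samples whose run failed
            votes[answer] += 1
    if not votes:
        return None                  # every sample failed
    return votes.most_common(1)[0][0]


# Usage with a stub sampler that replays fixed answers.
canned = iter([42, 7, 42, None, 42, 7, 42, 13])
print(self_consistent_answer(lambda _: next(canned), "toy problem"))  # → 42
```

Majority voting keeps the final answer stable when individual samples disagree or fail, at the cost of running the sampler `n_samples` times.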

🏆 Featured Projects & Research

  • Architecture: Agentic GraphRAG utilizing a ReAct loop to autonomously route queries across Neo4j Knowledge Graphs and Hybrid Vector spaces.
  • MLOps: Fault-tolerant, event-driven pipeline orchestrated with ZenML and Apache Kafka.
  • Inference: Deployed Llama-3.1 8B via vLLM with PagedAttention in a Docker/K8s environment to maximize GPU throughput.

🔬 NanoReason-3B (Step-Aware LoRA)

  • Research: Engineered a custom HierarchicalStepLoss function in PyTorch to enforce a strict 4-step CoT format (Understand → Plan → Execute → Verify).
  • Impact: Achieved 78% Zero-Shot accuracy on the GSM8K benchmark by effectively transferring reasoning capabilities from a 32B teacher to a 3B student model.
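
To make the 4-step format concrete, here is a minimal sketch of a checker that tests whether a generated answer contains the four steps exactly once and in order. It is not the HierarchicalStepLoss itself, which is a training loss; this is only an evaluation-time check. The function name `follows_step_format` and the `Step:` line-header convention are assumptions made for illustration.

```python
import re

STEPS = ("Understand", "Plan", "Execute", "Verify")

# Assumed convention (hypothetical): each step starts a line as "<Step>:".
HEADER_RE = re.compile(r"^(Understand|Plan|Execute|Verify):", re.MULTILINE)


def follows_step_format(text: str) -> bool:
    """Return True if each step header appears exactly once, in order."""
    headers = HEADER_RE.findall(text)
    return tuple(headers) == STEPS


sample = (
    "Understand: We need the sum of 2 and 3.\n"
    "Plan: Add the two numbers.\n"
    "Execute: 2 + 3 = 5.\n"
    "Verify: 5 - 3 = 2, so the answer is 5.\n"
)
print(follows_step_format(sample))  # → True
```

A check like this could be used to measure how often a student model's outputs respect the format, independently of whether the final answer is correct.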

🌍 Kaggle Competitions

  • CSIRO Biomass Estimation: Deployed DINOv2-Large (ViT-L/14) with multi-model ensembling, Dihedral-3 TTA, and constraint-aware post-processing to handle highly skewed distributions.

💻 Tech Stack & Tools

AI & Machine Learning:
PyTorch, Hugging Face, vLLM, Agentic RAG, Computer Vision

System Design & Backend:
Python, Go, C++, FastAPI

MLOps & Infrastructure:
Docker, Kubernetes, Kafka, ZenML

Databases:
Neo4j, Qdrant, MongoDB, Redis, Postgres

Pinned

  1. HKUDS/nanobot (Public)

    "🐈 nanobot: The Ultra-Lightweight OpenClaw"

    Python 33.3k 5.5k

  2. NeuralTwin-Enterprise-RAG-System-MLOps-Pipeline (Public)

    Advanced Agentic GraphRAG system with ZenML MLOps, HyDE, cross-encoder reranking, Small-to-Big hierarchical retrieval, and full cloud finetuning/evaluation on SageMaker.

    Python 4

  3. Notebook-Algorithm (Public)

    C++ 2

  4. E2E-Cloud-Native-DevOps-Pipeline (Public)

    Shell 2

  5. Restaurant-Management-System (Public)

    Restaurant Management System is a web application built with Next.js and TypeScript. It offers features for restaurant management, including table reservations, order tracking, menu management, and…

    TypeScript 15 1