I am an Award-winning AI Researcher and Software Engineer specializing in High-Performance Inference, Large Language Models (LLMs), and Agentic RAG architectures. With a strong foundation in algorithmic problem-solving, I bridge the gap between SOTA machine learning research and scalable, production-ready backend systems.
- 🔭 Currently working on:
  - AIMO 3 (Kaggle): Engineering SC-TIR inference pipelines on H100 GPUs for IMO-level mathematical reasoning.
  - NanoReason: Distilling verifiable Chain-of-Thought (CoT) into Small Language Models via step-aware LoRA.
- 🌱 Deep diving into: Speculative Decoding, Contextual Retrieval, and Kubernetes-based MLOps.
- 🏆 Achievements: ICPC National Honorable Mention, OLP Consolation Prize (IT Majors), IBM AI Engineering Professional Certificate.
- Architecture: Agentic GraphRAG utilizing a ReAct loop to autonomously route queries across Neo4j Knowledge Graphs and Hybrid Vector spaces.
- MLOps: Fault-tolerant, event-driven pipeline orchestrated with ZenML and Apache Kafka.
- Inference: Deployed Llama-3.1 8B via vLLM with PagedAttention in a Docker/K8s environment to maximize GPU throughput.
- Research: Engineered a custom `HierarchicalStepLoss` function in PyTorch to enforce a strict 4-step CoT format (Understand → Plan → Execute → Verify).
- Impact: Achieved 78% Zero-Shot accuracy on the GSM8K benchmark by effectively transferring reasoning capabilities from a 32B teacher to a 3B student model.
- CSIRO Biomass Estimation: Deployed DINOv2-Large (ViT-L/14) with multi-model ensembling, Dihedral-3 TTA, and constraint-aware post-processing to handle highly skewed distributions.
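The query-routing step of the Agentic GraphRAG architecture above can be sketched in plain Python. Everything here (`route_query`, the keyword cue list, the stubbed retrieval results) is an illustrative assumption, not the project's actual API; a production ReAct loop would let the LLM's "Thought" step choose the action rather than keyword heuristics.

```python
# Illustrative sketch of per-query routing in a ReAct-style agent loop:
# the agent decides whether to traverse a knowledge graph (Neo4j) or
# query a hybrid vector index. Names and cues are hypothetical.

RELATIONAL_CUES = ("related to", "connected", "depends on", "who works with")

def route_query(query: str) -> str:
    """Pick a retrieval backend from simple query cues.

    A real router would use an LLM 'Thought' step; this only
    illustrates the control flow of action selection.
    """
    q = query.lower()
    if any(cue in q for cue in RELATIONAL_CUES):
        return "graph"   # multi-hop / relational -> graph traversal
    return "vector"      # semantic similarity -> hybrid vector search

def react_step(query: str) -> dict:
    """One Thought -> Action -> Observation iteration (retrieval stubbed)."""
    backend = route_query(query)                   # action selection
    observation = f"stub results from {backend}"   # action execution (stub)
    return {"action": backend, "observation": observation}
```

The point of the routing layer is that relational, multi-hop questions and fuzzy semantic questions fail in opposite retrieval regimes, so the agent picks the backend per query instead of always fusing both.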
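A deployment of the kind described in the Inference bullet might be launched roughly as below. This is a hedged configuration sketch, not the project's actual manifest: the model tag and flag values are illustrative, and PagedAttention is vLLM's default KV-cache manager, so no flag is needed to enable it.

```shell
# Illustrative vLLM launch (values are assumptions, tune for your GPU):
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --gpu-memory-utilization 0.90 \
  --max-model-len 8192 \
  --tensor-parallel-size 1
```

In a Docker/K8s setup this command typically becomes the container entrypoint, with the served port exposed behind a Service.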
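The step-aware weighting idea behind the custom loss can be sketched in dependency-free Python. The real `HierarchicalStepLoss` is a PyTorch module operating on token-level losses; here the step names come from the 4-step format above, while the weights and the missing-step penalty are invented for illustration.

```python
# Sketch of step-aware CoT loss weighting (plain Python, no PyTorch).
# STEP_WEIGHTS and missing_penalty are illustrative assumptions.

STEPS = ("Understand", "Plan", "Execute", "Verify")
# Later steps weighted higher: errors there corrupt the final answer most.
STEP_WEIGHTS = {"Understand": 0.5, "Plan": 1.0, "Execute": 1.5, "Verify": 2.0}

def split_into_steps(cot: str) -> dict:
    """Split 'Understand: ...\\nPlan: ...' text into a step -> content map."""
    sections, current = {}, None
    for line in cot.splitlines():
        head, _, rest = line.partition(":")
        if head.strip() in STEPS:
            current = head.strip()
            sections[current] = rest.strip()
        elif current:  # continuation line belongs to the current step
            sections[current] += " " + line.strip()
    return sections

def hierarchical_step_loss(step_losses: dict) -> float:
    """Weighted mean of per-step losses; absent steps incur a fixed penalty."""
    missing_penalty = 5.0  # illustrative constant for format violations
    total, weight_sum = 0.0, 0.0
    for step in STEPS:
        w = STEP_WEIGHTS[step]
        total += w * step_losses.get(step, missing_penalty)
        weight_sum += w
    return total / weight_sum
```

The key design choice the sketch shows: a response that skips a mandated step is penalized at the loss level, so the strict 4-step format is enforced by training rather than by output post-processing alone.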


