Deepanshu deepanshut041

Hi, I'm Deepanshu Tyagi 👋

🚀 Software Engineer / AI Engineer (5+ years) focused on CUDA/C++, ML Systems / Distributed Training, and LLM infrastructure.
I like building systems that connect low-level performance with real-world ML products — from GPU kernels to scalable AI pipelines.

🔥 What I'm Working On (May 2026)

⚡ NanoTorch (C++/CUDA + Python)

A minimalist deep learning framework built from scratch:

Dynamic autograd + tensor ops (CPU/CUDA)
Core layers + optimizers
Serialization + checkpointing
ONNX export + benchmarks vs PyTorch
🔗 https://github.com//nanotorch
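
To illustrate what "dynamic autograd" means here, below is a minimal scalar sketch in Python. It is not NanoTorch's code: the `Value` class and its methods are hypothetical names made up for this example, and a real framework would operate on tensors with CPU/CUDA kernels. The graph is built while the expression runs, then walked in reverse topological order to accumulate gradients.

```python
class Value:
    """Hypothetical scalar node in a computation graph built at run time."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(2.0)
y = Value(3.0)
z = x * y + x  # dz/dx = y + 1, dz/dy = x
z.backward()
print(z.data, x.grad, y.grad)  # 8.0 4.0 2.0
```

Gradients are accumulated with `+=` so that a value used more than once (like `x` above) receives the sum of its contributions.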

🧠 CUDA + ML Systems

CUDA kernel optimization (matmul/elementwise/fused ops)
Profiling + memory efficiency
Distributed training experiments + benchmarking

🤖 LLM Agents + RAG

LangGraph/LangChain multi-agent workflows
Hybrid retrieval (pgvector + keyword)
Enterprise ingestion (Microsoft Graph / Google Drive)
Evaluation + grounding for reliability
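
One common way to combine vector and keyword results in a hybrid retriever is reciprocal rank fusion (RRF). The sketch below assumes that technique; nothing above says which fusion method these pipelines actually use. The function name, the document IDs, and the constant `k=60` are illustrative assumptions, and the two input lists stand in for results that would come from a pgvector similarity query and a keyword search.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    k damps the influence of any single list's top positions.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers (best match first).
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF only needs ranks, not raw scores, so it avoids having to normalize cosine distances against keyword relevance scores.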

🏆 Highlights

Built production systems: microservices, Kubernetes, PostgreSQL, AWS
AI systems: agents, RAG pipelines, retrieval + evaluation
Strong GPU/C++ focus (CUDA, performance engineering)
Ontario Graduate Certificate (16 months) — Wireless Information Networking
GPA: 3.33/4.0 (4 semesters)

🛠️ Tech Stack

📌 Featured

NanoTorch — C++/CUDA DL framework
🔗 https://github.com//nanotorch
RAG / Multi-Agent Pipelines
🔗 https://github.com//rag-agents
CUDA Kernels / Experiments
🔗 https://github.com//cuda-kernels

📫 Connect

📧 deepanshut041@gmail.com
💼 LinkedIn:
🐙 GitHub: https://github.com/deepanshut041
