- ML Infra Intern @ Danggeun Market (Karrot) (2025.07. ~ 2025.10.)
- Built Python-based LLM serving benchmark workflow with automated load testing and GitHub Actions deployment
- Optimized vLLM serving on AWS + Kubernetes with Tensor Parallelism & Quantization
- Achieved 50% serving cost reduction while meeting latency & throughput targets
- Served & evaluated Gemma 3, GPT-OSS, Llama-4, Qwen 3, OCR/Embedding models
- LLM Serving Optimization
- Kubernetes-based LLM serving infra
- Containerized benchmark pipeline via GitHub Actions
- AI Service Support
- Supported multimodal LLM, OCR, and embedding models
- Evaluated latency and throughput under real workloads
- B.S. in Electrical and Computer Engineering, Seoul National University (2010.03. ~ 2014.02.)
- M.S. in Electrical and Computer Engineering, Seoul National University (2014.03. ~ 2025.08.)
- Thesis: Named Entity Recognition with Strong and Weak Supervisions
- Research areas: LLMs, NLP, Weak Supervision, Distributed Algorithms
- THUNDER: Named Entity Recognition Using a Teacher-Student Model with Dual Classifiers
- Parallel Algorithms for Regret Minimization Queries using Spark
- KCC 2019 · Outstanding Paper Award
- Distributed & Parallel Skyline Algorithm with Balanced Quadtrees
- KCC 2018
-
More on Google Scholar
Python | PyTorch | HuggingFace | TensorFlow
AWS | Kubernetes | Docker | GitHub Actions
Spark | Hadoop | SQL | Scala | Java