I am a Computer Science PhD student working on efficient large language model inference systems. My recent work sits at the intersection of KV Cache optimization, hierarchical memory management, and practical serving-system design.
I am especially interested in building methods that are not only effective on paper, but also honest under real system constraints such as memory fragmentation, bandwidth pressure, and latency-throughput tradeoffs.
- 🧠 KV Cache pruning, compression, and offloading
- 📏 Long-context inference optimization
- 🚀 vLLM-style serving systems and performance tuning
- 📱 Edge-side deployment for resource-constrained devices
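To make the KV-cache interest concrete, here is a minimal toy sketch (my own illustrative example, not any specific paper's or vLLM's method): score each cached token by accumulated attention mass and evict the lowest-scoring entries to fit a fixed cache budget. The function name `evict_kv` and the scoring rule are assumptions for illustration.

```python
# Toy KV-cache eviction sketch: keep the `budget` tokens with the
# highest accumulated attention mass, drop the rest.
# (Illustrative only; real systems also handle paging, batching, heads.)
import numpy as np

def evict_kv(keys, values, attn_scores, budget):
    """Keep the `budget` tokens with the highest accumulated attention."""
    if keys.shape[0] <= budget:
        return keys, values
    keep = np.argsort(attn_scores)[-budget:]  # indices of the top-`budget` tokens
    keep.sort()                               # preserve original token order
    return keys[keep], values[keep]

# Toy cache: 6 tokens, head dim 4
rng = np.random.default_rng(0)
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 4))
scores = np.array([0.30, 0.05, 0.20, 0.02, 0.25, 0.18])

K2, V2 = evict_kv(K, V, scores, budget=4)
print(K2.shape)  # (4, 4): tokens 1 and 3 (lowest scores) were evicted
```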
I like research that feels a bit like system detective work:
- 🔍 Find where the real bottleneck is hiding.
- 🧠 Figure out whether the cost comes from memory, movement, or scheduling.
- ⚙️ Turn that pain point into something measurable and optimizable.
- 📊 Test whether the idea still holds under realistic workloads.
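A toy version of the middle steps, assuming a NumPy workload purely for illustration: separate "movement" cost from "compute" cost by timing each in isolation. Real attribution would use tools like `perf`, `nsys`, or `torch.profiler`; this just shows the measure-before-optimize habit.

```python
# Sketch: attribute cost to data movement vs. compute by timing each alone.
import time
import numpy as np

def timed(fn, *args, repeats=5):
    """Best-of-N wall-clock time for a callable (coarse, illustrative)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

x = np.ones((2000, 2000), dtype=np.float32)

copy_t = timed(np.copy, x)         # pure data movement
matmul_t = timed(np.matmul, x, x)  # compute-heavy

print(f"copy: {copy_t * 1e3:.2f} ms, matmul: {matmul_t * 1e3:.2f} ms")
```

Once the dominant term is known, the optimization target (and the metric to hold it honest) falls out naturally.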
Python for fast prototyping, C++ for systems work, Linux for getting close to the machine, and agent tools for making research workflows a little less manual.
- ✍️ I enjoy explaining system ideas as much as building them
- 🚴 I spend time on badminton, cycling, photography, and sci-fi / mystery reading
- ✨ I like projects that feel rigorous, useful, and a little elegant
Optimize what matters. Keep the system honest.
