Stars
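Sharing AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more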
ClawFeed — AI-powered news digest with structured summaries from Twitter/RSS feeds and web dashboard
Autonomous Polymarket trading agent powered by Claude Code. Expiry convergence strategy, 94.7% win rate over 2 weeks live testing.
My learning notes for ML SYS.
Post-training with Tinker
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]
Qihoo360 / 360-LLaMA-Factory
Forked from hiyouga/LlamaFactory
Adds Sequence Parallelism into LLaMA-Factory
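Interview questions (with answers) for large-model algorithm roles: common questions and concept explanations. Keywords: "large-model interview questions", "algorithm-role interviews", "common interview questions", "large-model algorithm interviews", "large-model application fundamentals"
Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers
A summary of the knowledge a natural language processing (NLP) engineer needs to build up, including interview questions, fundamentals, and engineering skills, to improve core competitiveness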
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Fully open reproduction of DeepSeek-R1
verl: Volcano Engine Reinforcement Learning for LLMs
Minimal reproduction of DeepSeek R1-Zero
LiveBench: A Challenging, Contamination-Free LLM Benchmark
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
Arena-Hard-Auto: An automatic LLM benchmark.
WildEval / ZeroEval
Forked from allenai/WildBench
A simple unified framework for evaluating LLMs
The official evaluation suite and dynamic data release for MixEval.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
