verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. It is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".

Python 1,745 158 Updated Feb 27, 2026

GAIR-NLP / DeepResearcher

Scaling Deep Research via Reinforcement Learning in Real-world Environments.

Python 718 50 Updated Oct 15, 2025

RUC-NLPIR / Search-o1

🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]

Python 1,188 103 Updated Nov 17, 2025

Qihoo360 / 360-LLaMA-Factory

Forked from hiyouga/LlamaFactory

Adds Sequence Parallelism to LLaMA-Factory

Python 605 42 Updated Feb 5, 2026

aceliuchanghong / FAQ_Of_LLM_Interview

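LLM algorithm engineer interview questions (with answers): common questions and concept explanations. "LLM interview questions", "algorithm role interviews", "common interview questions", "LLM algorithm interviews", "LLM application fundamentals"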

Jupyter Notebook 1,803 126 Updated Mar 26, 2026

wdndev / llm_interview_note

Notes covering knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 13,596 1,338 Updated Apr 30, 2025

DA-southampton / NLP_ability

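A summary of the knowledge natural language processing (NLP) engineers need to build up, including interview questions, fundamentals, and engineering skills, to strengthen core competitiveness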

Python 7,477 1,204 Updated Aug 24, 2022

PeterGriffinJin / Search-R1

Search-R1: An efficient, scalable RL training framework, based on veRL, for LLMs that interleave reasoning with search engine calls

Python 4,337 374 Updated Nov 13, 2025

MrGGLS / BlockPruner

A block pruning framework for LLMs.

Python 28 3 Updated May 17, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,965 2,412 Updated Nov 24, 2025

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,333 3,536 Updated Mar 30, 2026

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python 13,004 1,584 Updated Feb 27, 2026

microsoft / AI-System

System for AI Education Resource.

Python 4,248 524 Updated Oct 25, 2024

LiveBench / LiveBench

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Python 1,114 99 Updated Mar 23, 2026

allenai / open-instruct

AllenAI's post-training codebase

Python 3,666 520 Updated Mar 30, 2026

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 9,271 909 Updated Mar 30, 2026

lmarena / arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Python 1,011 149 Updated Jun 21, 2025

yuanzhoulvpi2017 / vscode_debug_transformers

Python 425 35 Updated Feb 10, 2025

WildEval / ZeroEval

Forked from allenai/WildBench

A simple unified framework for evaluating LLMs

HTML 267 31 Updated Apr 14, 2025

JinjieNi / MixEval

The official evaluation suite and dynamic data release for MixEval.

Python 256 41 Updated Nov 10, 2024

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 69,281 8,437 Updated Mar 30, 2026

Stars

CalvinXKY / InfraTech

kevinho / clawfeed

LainNet-42 / polymarket-auto-trading-agent

MoonshotAI / kimi-cli

zhaochenyang20 / Awesome-ML-SYS-Tutorial

thinking-machines-lab / tinker-cookbook

MoonshotAI / checkpoint-engine

MoonshotAI / Kimi-Linear

godweiyang / GrabGPU

langfengQ / verl-agent