Ge Lun ArbiterGe

I am a PhD student (class of 2019) at Beijing University of Posts and Telecommunications (BUPT).

1 follower · 4 following

Beijing University of Posts and Telecommunications (BUPT)
10 Xitucheng Road, Haidian District, Beijing (Beijing University of Posts and Telecommunications)

Stars

27 stars written in Python

huggingface / trl

Train transformer language models with reinforcement learning.

Python 17,470 2,514 Updated Feb 27, 2026

openai / spinningup

An educational resource to help anyone learn deep reinforcement learning.

Python 11,611 2,435 Updated Aug 5, 2024

lyhue1991 / eat_tensorflow2_in_30_days

Tensorflow2.0 🍎🍊 is delicious, just eat it! 😋😋

Python 9,972 2,455 Updated Sep 22, 2022

CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,740 483 Updated Jan 8, 2024

OpenDriveLab / UniAD

[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving

Python 4,499 515 Updated Oct 29, 2025

google-deepmind / dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

Python 4,465 737 Updated Feb 13, 2026

AI4Finance-Foundation / ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

Python 4,295 969 Updated Feb 20, 2026

liuzhao1225 / YouDub-webui

Python 4,016 413 Updated Dec 11, 2025

opendilab / DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,598 429 Updated Dec 7, 2025

eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Python 2,858 233 Updated Aug 11, 2024

allenai / RL4LMs

A modular RL library to fine-tune language models to human preferences

Python 2,379 202 Updated Mar 1, 2024

openai / lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Python 1,377 170 Updated Jul 25, 2023

Replicable-MARL / MARLlib

One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)

Python 1,269 192 Updated Nov 28, 2024

huawei-noah / SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving

Python 1,107 216 Updated Jan 31, 2025

google-research / seed_rl

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

Python 837 145 Updated Nov 29, 2022

kaixindelele / DRLib

DRLib: a Concise Deep Reinforcement Learning Library, Integrating HER, PER and D2SR for Almost Off-Policy RL Algorithms.

Python 564 70 Updated Apr 2, 2024

sjtu-marl / malib

A parallel framework for population-based multi-agent reinforcement learning.

Python 548 65 Updated Dec 14, 2023

aravindr93 / mjrl

Reinforcement learning algorithms for MuJoCo tasks

Python 448 110 Updated Mar 13, 2025

BlackHC / BatchBALD

Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning.

Python 247 54 Updated Jun 5, 2024

younggyoseo / Ape-X

PyTorch Implementation of Distributed Prioritized Experience Replay (Ape-X)

Python 155 17 Updated Apr 28, 2019

laming-chen / fast-map-dpp

Fast Greedy MAP Inference for DPP

Python 130 31 Updated May 11, 2020

angelkim88 / CARLA-Lane_Detection

Python 81 31 Updated Mar 13, 2020

valeoai / LearningByCheating

Forked from dotchen/LearningByCheating

Driving in CARLA using model-free deep reinforcement learning

Python 60 17 Updated Feb 2, 2021

Python 1 3 Updated Apr 26, 2019
