ArbiterGe
  • Beijing University of Posts and Telecommunications (BUPT)
  • 10 West Tucheng Road, Haidian District, Beijing (Beijing University of Posts and Telecommunications)

27 stars written in Python
Train transformer language models with reinforcement learning.

Python 17,470 2,514 Updated Feb 27, 2026

An educational resource to help anyone learn deep reinforcement learning.

Python 11,611 2,435 Updated Aug 5, 2024

TensorFlow 2.0 🍎🍊 is delicious, just eat it! 😋😋

Python 9,972 2,455 Updated Sep 22, 2022

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,740 483 Updated Jan 8, 2024

[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving

Python 4,499 515 Updated Oct 29, 2025

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

Python 4,465 737 Updated Feb 13, 2026

Massively Parallel Deep Reinforcement Learning. 🔥

Python 4,295 969 Updated Feb 20, 2026

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,598 429 Updated Dec 7, 2025

Reference implementation for DPO (Direct Preference Optimization)

Python 2,858 233 Updated Aug 11, 2024

A modular RL library to fine-tune language models to human preferences

Python 2,379 202 Updated Mar 1, 2024

Code for the paper Fine-Tuning Language Models from Human Preferences

Python 1,377 170 Updated Jul 25, 2023

One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)

Python 1,269 192 Updated Nov 28, 2024

Scalable Multi-Agent RL Training School for Autonomous Driving

Python 1,107 216 Updated Jan 31, 2025

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

Python 837 145 Updated Nov 29, 2022

DRLib: a Concise Deep Reinforcement Learning Library, Integrating HER, PER and D2SR for Almost Off-Policy RL Algorithms.

Python 564 70 Updated Apr 2, 2024

A parallel framework for population-based multi-agent reinforcement learning.

Python 548 65 Updated Dec 14, 2023

Reinforcement learning algorithms for MuJoCo tasks

Python 448 110 Updated Mar 13, 2025

Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning.

Python 247 54 Updated Jun 5, 2024

PyTorch Implementation of Distributed Prioritized Experience Replay (Ape-X)

Python 155 17 Updated Apr 28, 2019

Fast Greedy MAP Inference for DPP

Python 130 31 Updated May 11, 2020

Driving in CARLA using model-free deep reinforcement learning

Python 60 17 Updated Feb 2, 2021

A Library for Active Preference-based Reward Learning Algorithms

Python 54 12 Updated Dec 16, 2023

HuaHuoLabel is a multifunctional AI data label tool, which supports data label of five computer vision tasks, including single-category classification, multi-category classification, semantic segme…

Python 10 2 Updated Jan 19, 2024

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python 1 3 Updated Apr 26, 2019