Beijing University of Posts and Telecommunications (BUPT)
- No. 10 West Tucheng Road, Haidian District, Beijing (Beijing University of Posts and Telecommunications)
Stars
Train transformer language models with reinforcement learning.
An educational resource to help anyone learn deep reinforcement learning.
Tensorflow2.0 🍎🍊 is delicious, just eat it! 😋😋
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Massively Parallel Deep Reinforcement Learning. 🔥
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
Reference implementation for DPO (Direct Preference Optimization)
A modular RL library to fine-tune language models to human preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)
Scalable Multi-Agent RL Training School for Autonomous Driving
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.
DRLib: a Concise Deep Reinforcement Learning Library, Integrating HER, PER and D2SR for Almost Off-Policy RL Algorithms.
A parallel framework for population-based multi-agent reinforcement learning.
Reinforcement learning algorithms for MuJoCo tasks
Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning.
PyTorch Implementation of Distributed Prioritized Experience Replay (Ape-X)
Driving in CARLA using model-free deep reinforcement learning
A Library for Active Preference-based Reward Learning Algorithms
HuaHuoLabel is a multifunctional AI data labeling tool that supports labeling for five computer vision tasks, including single-category classification, multi-category classification, semantic segme…
StanfordVL / baselines
Forked from openai/baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms