This repository contains a number of reinforcement learning algorithms that I have implemented over the years out of curiosity. Most of them were not written with other users in mind. Some of the more recent implementations, such as the algorithms in the A2C folder, are more user-friendly.

Advantage Actor Critic

An implementation of the actor-critic algorithm that uses advantage estimation to reduce the variance of the policy gradient. It also includes entropy regularization to improve exploration.

Code can be found in ./A2C/simpleActorCritic.py
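
As a rough illustration of how the advantage estimate and the entropy bonus enter the objective, here is a minimal pure-Python sketch. It is not taken from simpleActorCritic.py; the function names, the coefficients (value_coef, entropy_coef), and the use of Monte Carlo returns as the critic target are all assumptions made for this example.

```python
import math


def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Compute G_t = r_t + gamma * G_{t+1}, starting from a bootstrap value."""
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def entropy(probs):
    """Entropy of a categorical distribution given as a list of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def a2c_loss(action_probs, actions, values, rewards,
             gamma=0.99, value_coef=0.5, entropy_coef=0.01):
    """Scalar loss for one trajectory (hypothetical helper for illustration).

    action_probs: per-step action distributions predicted by the actor
    actions:      index of the action taken at each step
    values:       critic estimates V(s_t) for each step
    """
    returns = discounted_returns(rewards, gamma)
    # Advantage: how much better the observed return was than the critic expected.
    # With an autodiff library this term would be treated as a constant
    # (detached) inside the policy loss.
    advantages = [g - v for g, v in zip(returns, values)]
    n = len(rewards)
    policy_loss = -sum(
        math.log(probs[a]) * adv
        for probs, a, adv in zip(action_probs, actions, advantages)
    ) / n
    value_loss = sum(adv ** 2 for adv in advantages) / n
    mean_entropy = sum(entropy(probs) for probs in action_probs) / n
    # Subtracting the entropy term rewards policies that stay exploratory.
    total = policy_loss + value_coef * value_loss - entropy_coef * mean_entropy
    return total, advantages
```

Subtracting the critic's value from the return leaves the expected gradient unchanged but reduces its variance, while the entropy term keeps the policy from collapsing onto a single action too early.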

LunarLander

PyBullet Hopper

Parallel Advantage Actor Critic (A2C)

A parallelized version of the advantage actor-critic algorithm. Instead of exploring only one environment, we run N environments in parallel. This reduces the computational bottleneck created by complex environments and makes much more efficient use of the GPU.
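
The batching idea can be sketched as follows. This is a simplified illustration and not the code in the A2C folder: ToyEnv, BatchedEnvs, and collect_rollout are hypothetical names, and the environments are stepped one after another in a single process. A real implementation would typically distribute them across worker processes, but the interface the agent sees (one batch of observations in, one batch of actions out) is the same.

```python
class ToyEnv:
    """Hypothetical stand-in environment: an episode ends after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return float(self.t)

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return float(self.t), reward, done


class BatchedEnvs:
    """Steps N environments together and returns batched results."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        observations, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done = env.step(action)
            if done:
                # Restart finished episodes so every slot always holds a live env.
                obs = env.reset()
            observations.append(obs)
            rewards.append(reward)
            dones.append(done)
        return observations, rewards, dones


def collect_rollout(batched, policy, num_steps):
    """Gather `num_steps` batched transitions using a batch-level policy."""
    obs = batched.reset()
    transitions = []
    for _ in range(num_steps):
        actions = policy(obs)  # one forward pass for all N environments
        next_obs, rewards, dones = batched.step(actions)
        transitions.append((obs, actions, rewards, dones))
        obs = next_obs
    return transitions
```

Because the policy is evaluated once per step for all N observations, the network sees a batch instead of a single sample, which is where the better GPU utilization comes from.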

I've tested the algorithm on LunarLander and PyBullet Hopper and saw a significant reduction in computation time per step. I've also tested the algorithm on the NES Mario environment.

Note that the agent still tends to jump into holes and enemies. The reason is difficult to pinpoint: it could be a poorly chosen convolutional architecture, too little training time, or both. The agent also fails to learn the second level after completing the first. The main reason is that the agent overfits to the first level, which makes it difficult to adapt to the second level without degrading its performance on the first.

Folders and files

A2C
CrossEntropy
DeepQLearning
Rainbow
VanillaPolicyGradient
gymutils
.gitignore
LICENSE
README.md

CommanderCero/RL_Algorithms