Proximal Policy Optimization

date

Mar 16th 2018

category

machine-learning

short description

Analyzed and improved the Proximal Policy Optimization (PPO) algorithm.

Goals

  • Compare the algorithm's performance with that of other baseline techniques in an OpenAI Gym game environment
  • Explore performance based on input-data preprocessing, different neural-network architectures, and CPU vs. GPU training
  • Modify different hyperparameters to analyze their impact on the overall performance of the algorithm
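The hyperparameter most characteristic of PPO is the clip range of its surrogate objective. As a reference point for the goals above, here is a minimal NumPy sketch of the standard clipped loss (an illustrative function, not code from the repository):

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate loss (to be minimized).

    ratio:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage: estimated advantage of each action
    eps:       clip range, one of the hyperparameters varied above
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Take the pessimistic (element-wise minimum) bound, negate for a loss
    return -np.mean(np.minimum(unclipped, clipped))
```

Shrinking `eps` constrains how far each update can move the policy, which is exactly the trade-off the hyperparameter experiments probe.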

link to repository

Implementation

  • The model is developed using TensorFlow, and input data is collected from OpenAI Gym's MsPacman environment.

  • Performance of different neural network architectures is explored (figure: "CNN vs LSTM - Reward function").

  • GPU-based training was done using Google Colaboratory.

  • Reference: OpenAI Gym Baselines
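The input-preprocessing step mentioned above typically converts raw Atari RGB frames into small grayscale images before they reach the network. A dependency-free sketch of such a step (a hypothetical helper, not the repository's exact pipeline):

```python
import numpy as np

def preprocess_frame(frame, out_h=84, out_w=84):
    """Illustrative Atari preprocessing: grayscale + naive downsample.

    `frame` is an (H, W, 3) uint8 RGB observation, such as the
    210x160x3 frames MsPacman emits; output is (out_h, out_w)
    float32 scaled to [0, 1].
    """
    # Luminance-weighted grayscale conversion
    gray = frame.astype(np.float32) @ np.array([0.299, 0.587, 0.114],
                                               dtype=np.float32)
    h, w = gray.shape
    # Nearest-neighbour resize via index sampling (no external deps)
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return gray[np.ix_(rows, cols)] / 255.0
```

In practice a library resizer (e.g. OpenCV) would replace the nearest-neighbour sampling, but the shape and scaling conventions are the same.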

Output

Different models based on the modified hyperparameters, trained on CPU and GPU. Performance-comparison plots (rewards & loss function).