Implementation of PPO (Proximal Policy Optimization)

This is a TensorFlow implementation of the proximal policy optimization (PPO) algorithm for continuous action spaces.
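For readers new to PPO, the clipped surrogate objective from the original paper can be sketched in a few lines. The snippet below is a minimal, framework-free illustration and is not taken from this repository's code: the function names `gaussian_log_prob` and `clipped_surrogate`, the scalar Gaussian policy, and the default `clip_eps=0.2` are all assumptions made for the example.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a scalar action under a Normal(mean, std) policy.

    Hypothetical helper: a Gaussian policy is a common choice for
    continuous actions, but it is an assumption here.
    """
    var = std ** 2
    return (-((action - mean) ** 2) / (2.0 * var)
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    For each sample: ratio = pi_new(a|s) / pi_old(a|s), and the objective
    is min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio from log-probs
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Toy batch: the old and new policies differ only in their mean.
    actions = [0.1, -0.4, 0.7]
    advantages = [1.0, -0.5, 2.0]
    old_lp = [gaussian_log_prob(a, 0.0, 1.0) for a in actions]
    new_lp = [gaussian_log_prob(a, 0.2, 1.0) for a in actions]
    print(clipped_surrogate(new_lp, old_lp, advantages))
```

In a gradient-based training loop, the negative of this value would typically be minimized; the clipping removes the incentive to move the probability ratio outside `[1 - clip_eps, 1 + clip_eps]` when doing so would improve the objective.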

Original Paper

here

Demo

Pendulum-v0

Results

Total scores vs. number of iterations (Pendulum-v0)

Scores

Losses (Pendulum-v0)

Losses

Dependencies

  • Python 3.5
  • TensorFlow 1.1.0
  • OpenAI Gym

Usage

To train the agent, run:

$ python3 trainer.py

To run the demo:

$ python3 play.py

Credit

Reference Project

PPO