Hello! purejaxRL's PPO and Stoix's PPO are very similar; the main difference is that, as you said, computation is divided over all of your devices. This means that parameters such as the number of parallel environments are divided by the number of devices, so it's possible your PPO hyperparameters are making it unstable. May I ask what hyperparameters you are using? In my experiments with continuous PPO on the Brax environments, I found the Stoix PPO to be very stable and high performing given suitable hyperparameters.
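For intuition, here is a rough sketch (not Stoix's actual configuration code; `total_num_envs` and `rollout_length` are just illustrative names and values) of how an Anakin-style, device-parallel setup shards environments and why that shrinks the per-device batch:

```python
# Illustrative sketch only -- not Stoix's actual config code.
# `total_num_envs` and `rollout_length` are assumed example values.
import jax

total_num_envs = 1024             # value you would tune for a single-GPU purejaxRL run
rollout_length = 128
num_devices = jax.device_count()  # e.g. 3 GPUs

# In a device-parallel (Anakin-style) run the environments are sharded across
# devices, so each device only steps a fraction of them.
envs_per_device = total_num_envs // num_devices

# The per-device PPO rollout batch is correspondingly smaller.
per_device_batch = envs_per_device * rollout_length
single_device_batch = total_num_envs * rollout_length

print(f"devices={num_devices}, envs per device={envs_per_device}")
print(f"per-device batch={per_device_batch} vs single-device batch={single_device_batch}")

# Even if gradients are averaged across devices (e.g. with jax.lax.pmean), the
# minibatch size, number of epochs, and learning rate that worked for a 1-GPU
# purejaxRL run may need retuning once the environments are split over 3 devices.
```

In other words, a config copied directly from a 1-GPU purejaxRL run may behave differently over 3 devices unless the environment count and minibatch settings are adjusted accordingly.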
-
Hello, I am currently working on a research project in reinforcement learning and am very interested in stoix.
I am running a custom environment based on gymnax with PPO-Continuous on anakin using 3 GPUs, but the performance is unstable compared to purejaxrl (with 1 GPU).
Is there any baseline code available, similar to the examples provided in the repository, that would help compare anakin and purejaxrl?
I'd like to start by testing both on environments such as CartPole to ensure proper comparison.
Thank you in advance.