Forward bonus cheating #3

Open
mryellow opened this issue Jul 26, 2017 · 3 comments

Comments

@mryellow
Owner

I seem to remember some talk (and have observed behaviour) indicating a bug in the forward bonus code: an agent can extract a bonus if only 4 of the eyes are seeing wall. In effect the agent is being taught that cutting corners is rewarding, even though the outcome isn't so good.
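One way a loophole like this could arise, sketched under assumptions: if the bonus is gated on the mean eye reading rather than the worst one, a few eyes seeing wall are not enough to block it. The function names, the 9-eye layout, the 0.75 threshold and the 0.1 scale are all hypothetical and not taken from the repository.

```python
# Hypothetical sketch: names, thresholds and eye count are assumptions,
# not the repository's actual forward bonus code.

def forward_bonus_mean(proximities, moving_forward, threshold=0.75, scale=0.1):
    """Bonus gated on the MEAN eye reading (1.0 = nothing sensed, 0.0 = touching wall)."""
    mean = sum(proximities) / len(proximities)
    if moving_forward and mean > threshold:
        return scale * mean
    return 0.0


def forward_bonus_strict(proximities, moving_forward, threshold=0.75, scale=0.1):
    """Bonus gated on the WORST eye reading, so any eye near a wall blocks it."""
    worst = min(proximities)
    if moving_forward and worst > threshold:
        return scale * worst
    return 0.0


# 9 eyes, 4 of them seeing a wall at half range.
eyes = [1.0] * 5 + [0.5] * 4
mean_bonus = forward_bonus_mean(eyes, moving_forward=True)      # still pays out
strict_bonus = forward_bonus_strict(eyes, moving_forward=True)  # blocked
```

Gating on the minimum is only one possible tightening; it may turn out to be too strict near corridors.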

@mryellow
Owner Author

Adding a terminal state and a negative reward on hitting a wall might do the trick without tweaking too much.

06d4c2a

@mryellow
Owner Author

mryellow commented Jul 27, 2017

Does a negative terminal state prevent the agent from collecting good "escape from facing wall" experiences?

Wallowing around at a wall relies on the robustness of collision detection in cocos MapLayers. Holes in collision detection have been observed at higher velocities. Extra checks are in place for border walls.

Perhaps it's best to trust the engine, expand the border checks into a slide behaviour, remove the terminal flag and turn down the episode length a touch (giving the agent the chance to wallow, but without filling experience memory with the same useless experiences).
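A minimal sketch of what a slide on the border could look like, assuming a rectangular map with its origin at the bottom-left. `slide_on_border` and its parameters are hypothetical names, not the project's API. The blocked axis is clamped while the other axis keeps moving, and the returned flag only reports contact; it is not meant to end the episode.

```python
# Hypothetical sketch: function name and parameters are assumptions.

def slide_on_border(x, y, dx, dy, width, height, radius=0.0):
    """Move by (dx, dy), clamping to the map bounds so motion continues along the wall."""
    target_x = x + dx
    target_y = y + dy
    # Clamp each axis independently: the blocked axis stops, the free axis slides.
    new_x = min(max(target_x, radius), width - radius)
    new_y = min(max(target_y, radius), height - radius)
    touched = (new_x != target_x) or (new_y != target_y)
    return new_x, new_y, touched


# Heading into the right-hand border: x is clamped, y keeps moving.
pos = slide_on_border(95, 50, 10, 5, width=100, height=100)
```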

It's also worth giving the agent a fighting chance with its initial state. Random rotation seemed to produce continued rotation, but that was probably the agent dropping to epsilon 0.05 very quickly and displaying its policy of driving in circles. The rotation was confirmed to be only a direction and not a continued angular velocity. Random is probably best, although facing the middle is another option.

  • Slide on border
  • No wall terminal for mode 0
  • Shorten episode max length
  • Start with random rotation towards centre
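The last item above, a random rotation biased towards the centre, could be sketched as follows. `initial_rotation` and `spread_deg` are hypothetical names. The angle is measured counter-clockwise from the +x axis as returned by `math.atan2`; the engine's own rotation convention may differ in sign or zero direction, so a conversion would likely be needed.

```python
import math
import random

# Hypothetical sketch: names and the default spread are assumptions.

def initial_rotation(x, y, width, height, spread_deg=45.0, rng=random):
    """Heading in degrees pointing at the map centre, plus uniform jitter of +/- spread_deg."""
    towards_centre = math.degrees(math.atan2(height / 2 - y, width / 2 - x))
    jitter = rng.uniform(-spread_deg, spread_deg)
    return (towards_centre + jitter) % 360.0


# Agent spawned in the bottom-left corner of a 100x100 map.
heading = initial_rotation(0, 0, 100, 100, rng=random.Random(0))
```

A spread of 180 degrees recovers a fully random start, and a spread of 0 always faces the middle, so one parameter covers both options discussed above.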

mryellow added a commit that referenced this issue Jul 27, 2017
@mryellow
Owner Author

mryellow commented Jul 27, 2017

> Holes in collision detection have been observed at higher velocities.

Yeah definitely a problem. Skips the first edge and catches the next inside the wall.
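For illustration, skipping an edge like this is what happens when only the end point of a large move is tested. A common workaround, sketched here under assumptions, is to split the move into sub-steps no longer than the thinnest wall. `move_with_substeps` and the `is_wall` callback are hypothetical, and this is not a description of what the engine or the linked commit does.

```python
import math

# Hypothetical sketch: `is_wall` stands in for whatever wall query is available.

def move_with_substeps(x, y, dx, dy, is_wall, max_step=1.0):
    """Advance by (dx, dy) in steps no longer than max_step; stop before the first wall sample."""
    distance = math.hypot(dx, dy)
    steps = max(1, math.ceil(distance / max_step))
    for i in range(1, steps + 1):
        next_x = x + dx * i / steps
        next_y = y + dy * i / steps
        if is_wall(next_x, next_y):
            # Return the last free sample and flag the collision.
            return x + dx * (i - 1) / steps, y + dy * (i - 1) / steps, True
    return x + dx, y + dy, False


def thin_wall(px, py):
    """A wall two units thick, occupying 10 <= x < 12."""
    return 10 <= px < 12


# A single end-point check at x=14 would miss the wall entirely.
end_point_hits = thin_wall(8 + 6, 0)
result = move_with_substeps(8, 0, 6, 0, thin_wall, max_step=1.0)
```

The cost grows with velocity, so `max_step` would probably be tied to the tile size rather than fixed at 1.0.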

Could be... los-cocos/cocos@4ee4903
