Forward bonus cheating #3
Adding a terminal state with a negative reward might do the trick without too much tweaking.
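A minimal sketch of that suggestion, assuming a Q-learning style agent. The names `step_outcome`, `td_target` and the value of `COLLISION_PENALTY` are hypothetical and not taken from this project's code. The point is that a terminal transition carries the penalty and does not bootstrap from the next state.

```python
from typing import Tuple

# Hypothetical penalty value, chosen only for illustration.
COLLISION_PENALTY = -1.0


def step_outcome(base_reward: float, collided: bool) -> Tuple[float, bool]:
    """Return (reward, terminal) for one environment step.

    A collision ends the episode and replaces the step reward
    with a fixed negative value.
    """
    if collided:
        return COLLISION_PENALTY, True
    return base_reward, False


def td_target(reward: float, terminal: bool, next_max_q: float,
              gamma: float = 0.99) -> float:
    """One-step Q-learning target.

    On a terminal transition there is no future value to add,
    so the target is just the (negative) reward.
    """
    if terminal:
        return reward
    return reward + gamma * next_max_q
```

With this shape, any forward bonus collected just before hitting a wall is outweighed only if the penalty is large relative to the bonus, so the two values need tuning together.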
Does a negative terminal state rule out useful "escape from facing a wall" experiences? Wallowing around at a wall relies on the robustness of collision detection in cocos MapLayers, and holes in that collision detection have been observed at higher velocities. Extra checks are in place for border walls. Perhaps best to trust the engine, expand the border checks into a slide behaviour, remove the terminal flag and turn down the episode length a touch (giving the agent the chance to wallow, but not filling experience memory with the same useless experiences). Also worth giving the agent a fighting chance with its initial state. Random rotation seemed to produce continued rotation, but that was probably the agent dropping to epsilon 0.05 very quickly and displaying its policy of driving in circles.
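The slide behaviour for border walls could look roughly like the sketch below. It is an assumption about how such a check might be written, not the project's existing border code: `slide_along_border` and its parameters are hypothetical, and the map is assumed to be an axis-aligned rectangle from (0, 0) to (width, height). Because the position is clamped after the move, a fast agent that steps past the border in a single frame is still pulled back, while the component of motion parallel to the wall is kept.

```python
from typing import Tuple


def slide_along_border(x: float, y: float, dx: float, dy: float,
                       width: float, height: float,
                       radius: float = 0.0) -> Tuple[float, float, bool]:
    """Move by (dx, dy), then clamp the position inside the map border.

    Only the axis that crossed the border is clamped, so the agent
    slides along the wall instead of stopping dead.
    Returns (new_x, new_y, hit_border).
    """
    new_x = x + dx
    new_y = y + dy

    # Keep the whole body (of the given radius) inside the rectangle.
    clamped_x = min(max(new_x, radius), width - radius)
    clamped_y = min(max(new_y, radius), height - radius)

    hit_border = clamped_x != new_x or clamped_y != new_y
    return clamped_x, clamped_y, hit_border
```

The returned `hit_border` flag can feed a reward penalty without ending the episode, which matches the idea of removing the terminal flag and relying on a shorter episode length instead.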
Yeah, definitely a problem. It skips the first edge and catches the next one inside the wall. Could be... los-cocos/cocos@4ee4903
I seem to remember some talk (and observed behaviour) indicating a bug in the forward bonus code, where an agent can extract a bonus if only 4 of the eyes are seeing wall. In effect it is being told that cutting corners is rewarding, even though the outcome isn't so good.
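One possible way to close that loophole is to grant the bonus only when no eye sees a nearby wall. This is a sketch under assumptions, since the actual bonus code is not shown here: `forward_bonus`, `clear_threshold` and `bonus_scale` are hypothetical, and each eye is assumed to report a normalised proximity where 1.0 means nothing in range and values near 0.0 mean a wall is very close.

```python
from typing import Sequence


def forward_bonus(wall_proximities: Sequence[float],
                  forward_speed: float,
                  clear_threshold: float = 0.75,
                  bonus_scale: float = 0.1) -> float:
    """Bonus for driving forward, paid only when every eye is clear.

    Using min() over all eyes means a single eye seeing a close wall
    cancels the bonus, so clipping a corner is never rewarded.
    """
    if forward_speed <= 0.0 or not wall_proximities:
        return 0.0
    if min(wall_proximities) < clear_threshold:
        return 0.0
    return bonus_scale * forward_speed
```

For example, with nine eyes of which four report a close wall, `forward_bonus([1.0] * 5 + [0.2] * 4, 1.0)` yields no bonus, whereas nine clear readings do.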