Potentially wrong reward #237

Open
Max-Fu opened this issue Nov 21, 2020 · 1 comment

Max-Fu commented Nov 21, 2020

Inside the gym environment, there are two robot speeds: self.speed and self.robot_speed. self.robot_speed is set to a constant, while self.speed is the true speed. Yet the reward function uses self.robot_speed instead of self.speed (check this). I think this creates a reward mis-specification problem (i.e., DDPG learns a trivial policy). Can one of the repo creators check whether this is indeed an error? Thanks! (I just restarted my run and will check whether this solves the issue.)
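
To illustrate the concern, here is a minimal sketch of why passing a constant speed into a reward can hide the difference between moving and standing still. This is not the gym-duckietown code: `ToyEnv`, its `reward` method, the `dot_dir` and `dist` parameters, the value 1.2, and the weight 10.0 are all hypothetical and chosen only for illustration. Only the attribute names `speed` and `robot_speed` come from the description above.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical stand-in for the environment; not the gym-duckietown code."""

    robot_speed: float = 1.2  # nominal constant (value is an assumption)
    speed: float = 0.0        # speed actually achieved in the last step

    def reward(self, speed: float, dot_dir: float, dist: float) -> float:
        # Toy reward: favor progress along the lane direction,
        # penalize lateral distance from the lane center.
        return speed * dot_dir - 10.0 * abs(dist)


env = ToyEnv()
env.speed = 0.0  # the robot is standing still, aligned with the lane

# Reward computed from the constant: the speed term never reaches zero.
r_constant = env.reward(env.robot_speed, dot_dir=1.0, dist=0.0)

# Reward computed from the true speed: standing still earns no speed bonus.
r_true = env.reward(env.speed, dot_dir=1.0, dist=0.0)

print(r_constant, r_true)
```

In this toy setup a stationary robot collects the full speed bonus when the constant is used, so an agent would have little incentive to move. Whether the real reward function behaves this way would need to be confirmed against the repository.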

CourchesneA (Collaborator) commented

@Max-Fu I think there has not been a lot of testing and tuning of that reward function. Please submit a PR if you can improve the current version.

MasWag added a commit to MasWag/gym-duckietown that referenced this issue Aug 15, 2024