
[Question] hyperparameter optimization: objective of optuna study #469

Open
Labels
duplicate This issue or pull request already exists question Further information is requested

Comments

@bias-ster

❓ Question

Hi,

I’ve been adapting your code for PPO hyperparameter optimization to my custom environment, and I have a question about the evaluation metric used.

In exp_manager.py, on line 810, I noticed that the optimization objective is defined using:
reward = eval_callback.last_mean_reward

This means that only the last evaluation is used to decide whether the current trial is the best one. Is there a specific reason for this approach? Would you consider using:
reward = eval_callback.best_mean_reward
instead?
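To make the distinction concrete, here is a minimal sketch. The `EvalTracker` class is hypothetical and only mimics the two attributes that Stable-Baselines3's `EvalCallback` exposes; it is not the rl-baselines3-zoo code:

```python
class EvalTracker:
    """Mimics how EvalCallback exposes last vs. best mean reward."""

    def __init__(self):
        self.last_mean_reward = float("-inf")
        self.best_mean_reward = float("-inf")

    def record(self, mean_reward):
        # last_mean_reward is overwritten at every evaluation;
        # best_mean_reward only ever increases.
        self.last_mean_reward = mean_reward
        self.best_mean_reward = max(self.best_mean_reward, mean_reward)


tracker = EvalTracker()
for r in [50.0, 120.0, 80.0]:  # evaluation results over one trial
    tracker.record(r)

print(tracker.last_mean_reward)  # 80.0  -> what exp_manager.py reports to Optuna
print(tracker.best_mean_reward)  # 120.0 -> the proposed alternative
```

With a noisy or non-monotonic learning curve, the two values can differ substantially, which is why the choice of objective affects which trial Optuna ranks as best.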


@bias-ster bias-ster added the question Further information is requested label Aug 23, 2024
@araffin araffin added the duplicate This issue or pull request already exists label Aug 23, 2024