How do i know when ive reached an optimum while training #153

Quetzalcohuatl · 2021-03-07T16:18:19Z

For example, when training Tic-Tac-Toe, is the optimum reached when win rate == 0.50? My win rate has so far always been above 0.50. I haven't used the evaluate function yet because I feel like the win_rate printed after every epoch is already an evaluation?

YuriCat · 2021-03-07T16:47:13Z

By default, the opponent in the evaluation phase is a random player.

I think the win rate of a perfect player against a random player is about 98% in Tic-Tac-Toe rather than 100%, because random players sometimes happen to choose correct actions.
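As a rough way to check an estimate like this, here is a minimal sketch that computes the exact result of a perfect player against a uniformly random opponent by enumerating the game tree. It is not part of HandyRL; every function name is made up for illustration. It also assumes one particular definition of "perfect": the player only picks minimax-optimal moves and, among those, prefers the move with the highest chance of winning against a random opponent. A different tie-breaking rule would give a different number.

```python
from fractions import Fraction
from functools import lru_cache

# All eight winning lines on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that mark has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]


def place(board, i, mark):
    return board[:i] + mark + board[i + 1:]


def other(mark):
    return "O" if mark == "X" else "X"


@lru_cache(maxsize=None)
def minimax(board, to_move):
    """Game-theoretic value for X: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    moves = legal_moves(board)
    if not moves:
        return 0
    values = [minimax(place(board, i, to_move), other(to_move)) for i in moves]
    return max(values) if to_move == "X" else min(values)


@lru_cache(maxsize=None)
def outcome(board, to_move, perfect):
    """Exact (P(win), P(draw), P(loss)) for the perfect player's mark."""
    w = winner(board)
    if w is not None:
        won = Fraction(int(w == perfect))
        return (won, Fraction(0), 1 - won)
    moves = legal_moves(board)
    if not moves:
        return (Fraction(0), Fraction(1), Fraction(0))
    nxt = other(to_move)
    if to_move == perfect:
        # Keep only minimax-optimal moves, then break ties by win probability.
        sign = 1 if perfect == "X" else -1
        scored = [(sign * minimax(place(board, i, to_move), nxt), i) for i in moves]
        best = max(score for score, _ in scored)
        candidates = [i for score, i in scored if score == best]
        return max((outcome(place(board, i, to_move), nxt, perfect)
                    for i in candidates), key=lambda probs: probs[0])
    # Random opponent: average over all legal moves with equal weight.
    results = [outcome(place(board, i, to_move), nxt, perfect) for i in moves]
    return tuple(sum(r[k] for r in results) / len(results) for k in range(3))


if __name__ == "__main__":
    for perfect in ("X", "O"):  # X always moves first
        win, draw, loss = outcome("." * 9, "X", perfect)
        print(f"perfect player as {perfect}: "
              f"win={float(win):.4f} draw={float(draw):.4f} loss={float(loss):.4f}")
```

The loss probability comes out as exactly zero by construction, so whatever is missing from a 100% win rate is made up of draws. A win_rate from training can be compared against these numbers only if draws are counted the same way.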

Generally speaking, an "optimal" policy cannot be defined in multi-player games, although the maximum-entropy Nash equilibrium is commonly treated as the representative policy.

Quetzalcohuatl · 2021-03-07T17:19:26Z

YuriCat, can you add an arg to the train function that makes it evaluate against a different agent? For example, I want to evaluate against my model from 20 epochs ago to see whether it is improving. How can I do this? I only see that it's supported in evaluate.py, not in train.py.

YuriCat · 2021-03-08T06:36:39Z

Thanks for your suggestion.
Opponent selection is something we are considering right now.
Do you have a good idea for how to specify the old model in the configuration?

By the way, comparing against the model just before the current one may give an interesting result, since policies trained by RL sometimes get caught in a loop like Rock-Scissors-Paper.
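To make the loop concrete, here is a small toy sketch, independent of HandyRL, in which every "checkpoint" beats its predecessor 100% of the time even though no progress is being made. The helpers `win_rate`, `always`, and `uniform` are hypothetical names for this illustration, and the checkpoints are hand-written fixed policies rather than trained models.

```python
import random

# BEATS[a] == b means move a beats move b.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}


def always(move):
    """A policy that plays the same move every game."""
    return lambda rng: move


def uniform(rng):
    """A policy that picks each move with equal probability."""
    return rng.choice(sorted(BEATS))


def win_rate(policy_a, policy_b, n_games=1000, seed=0):
    """Average score of policy_a vs policy_b (win=1, tie=0.5, loss=0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_games):
        a, b = policy_a(rng), policy_b(rng)
        if a == b:
            total += 0.5
        elif BEATS[a] == b:
            total += 1.0
    return total / n_games


# Four successive "checkpoints"; the last one is the same policy as the first.
checkpoints = [always("rock"), always("paper"), always("scissors"), always("rock")]

if __name__ == "__main__":
    for i in range(1, len(checkpoints)):
        prev = win_rate(checkpoints[i], checkpoints[i - 1])
        ref = win_rate(checkpoints[i], uniform)
        print(f"checkpoint {i}: vs previous={prev:.2f}, vs fixed reference={ref:.2f}")
```

Each checkpoint scores 1.0 against the one before it, yet the last checkpoint only ties the first, and all of them hover around 0.5 against the fixed uniform reference. Evaluating against both an older model and a fixed reference opponent is one way to tell real improvement from this kind of cycling.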

Quetzalcohuatl changed the title ~~How do i know when ive reached an optimum~~ How do i know when ive reached an optimum while training Mar 7, 2021
