Detect starting ratings with an initial pass through data #67

Draft
wants to merge 4 commits into
base: master
from

Conversation

dexonsmith
Contributor

Add option `--detect-starting-ratings`, which does an initial pass through the games and sets starting ratings in many cases.

Everything is just in the one-game-at-a-time data, and it's not at all configurable, but it at least allows some hardcoded experiments.
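
As a rough illustration of the two-pass idea, here is a minimal sketch. It is not the code in this PR: the `Game` record, `detect_starting_ratings`, the `window` parameter, and the toy heuristic with its rating values are all hypothetical stand-ins, chosen only to show a first pass that assigns a starting rating to some players and leaves the rest on the default.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Game:
    """Hypothetical one-game-at-a-time record; not the repository's type."""
    black_id: int
    white_id: int
    black_won: bool


def detect_starting_ratings(
    games: Iterable[Game],
    estimate: Callable[[List[bool]], Optional[float]],
    window: int = 5,
) -> Dict[int, float]:
    """First pass: collect each player's first `window` results, then ask
    `estimate` for a starting rating. Players for whom `estimate` returns
    None are omitted, so a second pass would fall back to the default."""
    early_results: Dict[int, List[bool]] = defaultdict(list)
    for game in games:
        for player_id, won in ((game.black_id, game.black_won),
                               (game.white_id, not game.black_won)):
            if len(early_results[player_id]) < window:
                early_results[player_id].append(won)

    detected: Dict[int, float] = {}
    for player_id, results in early_results.items():
        rating = estimate(results)
        if rating is not None:
            detected[player_id] = rating
    return detected


def toy_estimate(results: List[bool]) -> Optional[float]:
    """Toy heuristic (an assumption, purely illustrative): only act on
    players who won or lost all of their first five games. The rating
    values are made up."""
    if len(results) < 5:
        return None
    if all(results):
        return 1800.0
    if not any(results):
        return 1200.0
    return None
```

The second pass would then run the normal rating loop, seeding each player from the returned dictionary when an entry exists.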

@dexonsmith dexonsmith requested a review from anoek March 26, 2024 20:33
@dexonsmith
Contributor Author

Initial results are not ideal: all ratings are depressed. The auto-detection needs at least some fine-tuning.

@dexonsmith
Contributor Author

The numbers look roughly okay on the full 30M-game data set after 23248a0 (I tuned on the first 1-2M games, then did a long run with all 30M). There are some anomalies, but I'm not sure how to interpret the results anyway.

I'm not sure this auto-detection experiment is going to tell us what we want to know. These two scenarios seem quite different:

  1. auto-detection (or self-selection of level) starting from now (after 30M games have been played)
  2. auto-detection starting from the beginning of time

For (1), you have the advantage that the existing ratings strongly influence new ratings, so new players have very little effect on the rating pool.

For (2), you can/should recalibrate ratings and the constants in rank_to_rating.

This experiment is (2), but without recalibrating rank_to_rating. I'm not sure if/how we can use this to judge the correct levels for turning on self-selected starting ranks.
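
To make "the constants in rank_to_rating" concrete, the sketch below assumes an exponential mapping of the form rating = a * exp(rank / c), with its logarithmic inverse. Both the functional form and the default values of `a` and `c` are assumptions for illustration and should be checked against the repository; the `_sketch` suffix marks these as stand-ins rather than the project's functions.

```python
import math


def rank_to_rating_sketch(rank: float, a: float = 525.0, c: float = 23.15) -> float:
    """Map a rank number to a rating under the assumed exponential form.
    `a` and `c` are the constants a recalibration would refit."""
    return a * math.exp(rank / c)


def rating_to_rank_sketch(rating: float, a: float = 525.0, c: float = 23.15) -> float:
    """Inverse of rank_to_rating_sketch for the same `a` and `c`."""
    return math.log(rating / a) * c
```

Under this assumed form, if detected starting ratings depress the whole pool while `a` and `c` stay fixed, the same player maps to a lower rank, which is one way to read the need for recalibration in scenario (2).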

(There could be other flaws in the experiment as well.)

@dexonsmith
Contributor Author

> I'm not sure this auto-detection experiment is going to tell us what we want to know.

@anoek, let me know if you have specific ideas about the data you're hoping for here, and I can play around more. I'm hesitant to clean this up and make it ready for review until it's clearer what we're trying to learn.

  • If we want some confirmation that the choices for starting ranks are reasonable for ratings v5, one idea is to only set auto-detected starting ranks for players whose first game is in the last 1-2 years, and see what happens then.
  • If we want to backfill auto-detected starting ranks for ratings v6, then maybe we can/should be recalibrating rating_to_rank somehow in conjunction.
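
The first idea above (auto-detect only for players whose first game is recent) could be sketched as a filter computed in the initial pass. This is a minimal sketch under assumptions: the `(timestamp, black_id, white_id)` tuple layout, the `recent_newcomers` name, and measuring "the last 1-2 years" back from the latest game in the data are all hypothetical choices, not something the PR implements.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def recent_newcomers(
    games: Iterable[Tuple[int, int, int]],
    years: float = 2.0,
) -> Set[int]:
    """Return ids of players whose first game falls within `years` of the
    latest game in the data.

    `games` yields hypothetical (timestamp, black_id, white_id) tuples,
    with Unix timestamps in seconds.
    """
    first_seen: Dict[int, int] = {}
    latest: Optional[int] = None
    for ended, black_id, white_id in games:
        latest = ended if latest is None else max(latest, ended)
        for player_id in (black_id, white_id):
            # Keep the earliest timestamp seen for each player.
            if player_id not in first_seen or ended < first_seen[player_id]:
                first_seen[player_id] = ended
    if latest is None:
        return set()
    cutoff = latest - years * SECONDS_PER_YEAR
    return {pid for pid, ts in first_seen.items() if ts >= cutoff}
```

The resulting set could gate the detection: apply a detected starting rating only to players in the set, and leave everyone else on the default so that the existing rating pool is untouched.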

@GreenAsJade

@dexonsmith @anoek I guess the other "thing we need to learn" is "are the chosen self-select starting ranks going to be OK?"

Is that something this is shooting for?

@dexonsmith
Contributor Author

> @dexonsmith @anoek I guess the other "thing we need to learn" is "are the chosen self-select starting ranks going to be OK?"
>
> Is that something this is shooting for?

Right; I think @anoek is hoping to learn that, but I don't think this experiment as-is will tell us. Right now, it's better set up to answer what would happen if we back-filled starting ratings as part of v6 (though I think that requires recalibrating rank_to_rating to get right).

I'm not quite sure how to nudge it toward answering the "self-select ranks in the context of v5" question. One idea is to auto-detect only for new players from the last 1-2 years of data.
