Track per-client metrics over time #91
Tracking each user's metrics over the course of training the global model can be very useful for examining the distribution of metrics, monitoring outlier users, and debugging.
Of course, a user's metrics should be recorded at a central iteration only if that user was actually sampled in that iteration.
We can add a postprocessor (https://apple.github.io/pfl-research/reference/postprocessor.html#pfl.postprocessor.base.Postprocessor) that dumps the metrics to disk for offline analysis (a postprocessor has access to each individual user's metrics).
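A minimal sketch of the "dump to disk" part, to make the intended output concrete. It is deliberately independent of pfl: the class name `PerClientMetricsCsvWriter`, the `record` method, and the `central_iteration` / `user_id` column names are all hypothetical and not part of the library. The assumption is that a postprocessor would call `record` once for each sampled user, from whatever per-user hook it has available.

```python
import csv
import os
from typing import Dict, List, Optional


class PerClientMetricsCsvWriter:
    """Appends one CSV row per (central iteration, sampled user) pair.

    Hypothetical helper, not part of pfl.
    """

    def __init__(self, path: str) -> None:
        self._path = path
        # Column order is fixed by the first recorded row.
        self._fieldnames: Optional[List[str]] = None

    def record(self, central_iteration: int, user_id: str,
               metrics: Dict[str, float]) -> None:
        row = {
            "central_iteration": central_iteration,
            "user_id": user_id,
            **metrics,
        }
        if self._fieldnames is None:
            self._fieldnames = list(row)
        # Write the header only when the file does not exist yet or is empty.
        is_new = (not os.path.exists(self._path)
                  or os.path.getsize(self._path) == 0)
        with open(self._path, "a", newline="") as f:
            # DictWriter raises ValueError if a later row introduces a metric
            # name that was not present in the first row.
            writer = csv.DictWriter(f, fieldnames=self._fieldnames, restval="")
            if is_new:
                writer.writeheader()
            writer.writerow(row)
```

Because a row is written only when `record` is called, users that were not sampled in an iteration simply have no row for it.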
The offline part, analyzing and visualizing the per-client metrics over time, is outside the scope of this GH issue.
This solution must be compatible with distributed simulations. This may require an all-gather for multi-node simulations, but restricting this feature to single-node multi-GPU simulations is OK.
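For the single-node multi-GPU case, one possible alternative to an all-gather is to let every worker process write its own file and merge the files afterwards. The sketch below illustrates that alternative; it is not an all-gather and not an existing pfl feature. The helper names `worker_csv_path` and `merge_worker_csvs`, the file naming scheme, and the `central_iteration` / `user_id` columns are assumptions made for illustration, and all workers are assumed to write to the same local directory.

```python
import csv
import glob
import os
from typing import Dict, List


def worker_csv_path(output_dir: str, rank: int) -> str:
    """File a worker with the given rank would write to (hypothetical naming)."""
    return os.path.join(output_dir, f"per_client_metrics.rank{rank}.csv")


def merge_worker_csvs(output_dir: str,
                      merged_name: str = "per_client_metrics.csv") -> int:
    """Merge all per-worker CSV files into one file and return the row count."""
    pattern = os.path.join(output_dir, "per_client_metrics.rank*.csv")
    fieldnames: List[str] = []
    rows: List[Dict[str, str]] = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.DictReader(f)
            # Take the union of columns, keeping first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            rows.extend(reader)
    # Assumes every row has central_iteration and user_id columns.
    rows.sort(key=lambda r: (int(r["central_iteration"]), r["user_id"]))
    with open(os.path.join(output_dir, merged_name), "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Since each user is trained by exactly one worker in a given central iteration, the per-worker files should not contain duplicate rows, so the merge is a plain concatenation followed by a sort.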
The result should be one or more files (CSV?) containing the per-client metrics.