Discrete mean differs from continuous mean significantly #36
Comments
Me too. Has the original poster resolved the issue?
This typically happens when the chosen parameters result in very large epsilons. If I run
@wulu473 No, I was just trying different values of sampling_probability to see the resulting epsilons; the other parameters stay the same as in the README. Is there a way to resolve this, i.e., to return the epsilons instead of throwing an error?
Hi!
Thanks for the feedback. I agree that the error message may be hard to understand without knowing the implementation. I can look into making the discretization step more robust for unrealistically high epsilons if this is something people are running into repeatedly. At the very least I'll update the error message.
Hi @wulu473, I posted this issue on opacus (pytorch/opacus#604), which elaborates on my previous comment. I wasn't sure whether to post it here or there; I chose opacus because that is where I first ran into it. I hope you don't mind!
@wulu473 We are currently looking into this a bit here in Helsinki because we face the error when computing I used the below implementation with parameters that match the Code for reproducing:
prv_broken_values.csv:
I hope this information is sufficient. We will see whether we can contribute a fix. Best
Hi @wulu473, (disclaimer: I just debugged the issue and am an expert in neither your implementation nor privacy accounting).
Observation: I debugged the code and arrived at some point at the It seems that the We suspect that the integration breaks down when the grid spacing between
Proposed solution: Determine the points grid based on
If I run this, I no longer get the error, and the epsilon for the README example for DP-SGD is identical. Question
Thanks for looking into this. The proposed solution seems very sensible. Initially I had chosen points that give a robust solution for most realistic cases but it seems that there are some cases where it's insufficient. In general, adding any additional points is safe and won't affect the robustness negatively. If you have this fix ready and it's not too much trouble, any contribution via a PR would be appreciated.
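To illustrate the point about integration grids, here is a toy sketch (not the library's code, and not the actual fix from this thread). It assumes a random variable supported on [t_min, t_max] whose mean is computed as t_max minus the integral of its CDF, uses a plain trapezoid rule instead of adaptive quadrature, and uses a Gaussian with made-up parameters as a stand-in distribution. All names (mean_from_cdf, fixed_points, range_points) are hypothetical. A fixed set of breakpoints clustered near zero misses the region where the CDF changes, while breakpoints derived from t_min and t_max recover the mean; merging both sets does no harm.

```python
import math


def normal_cdf(t, mu, sigma):
    """CDF of a Gaussian, used here only as a stand-in distribution."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))


def mean_from_cdf(cdf, t_min, t_max, points):
    """Trapezoid estimate of E[X] = t_max - integral of CDF over [t_min, t_max].

    `points` are extra breakpoints; those outside (t_min, t_max) are ignored.
    """
    inner = [p for p in points if t_min < p < t_max]
    grid = sorted({t_min, t_max, *inner})
    integral = 0.0
    for a, b in zip(grid, grid[1:]):
        integral += 0.5 * (cdf(a) + cdf(b)) * (b - a)
    return t_max - integral


# Hypothetical parameters: a narrow distribution centred away from zero.
mu, sigma = 0.3, 0.01
t_min, t_max = -50.0, 50.0


def cdf(t):
    return normal_cdf(t, mu, sigma)


# Fixed breakpoints clustered around zero (assumed, for illustration only).
fixed_points = [s * 10.0 ** (-k) for k in range(1, 6) for s in (-1.0, 1.0)]

# Breakpoints derived from the integration range itself.
n = 100_000
range_points = [t_min + i * (t_max - t_min) / n for i in range(n + 1)]

coarse_mean = mean_from_cdf(cdf, t_min, t_max, fixed_points)
fine_mean = mean_from_cdf(cdf, t_min, t_max, fixed_points + range_points)

print(f"true mean          : {mu}")
print(f"fixed points only  : {coarse_mean:.4f}")
print(f"with range points  : {fine_mean:.4f}")
```

With the fixed breakpoints alone, the whole transition of the CDF falls inside one wide segment, so the estimate is far from the true mean. Adding the range-based breakpoints keeps the original ones and only refines the grid, which matches the remark that extra points are safe.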
I ran the DP-SGD example with sampling_probability=0.125 and got an error saying “Discrete mean differs from continuous mean significantly”. Could you please explain why that is?