Discrete mean differs from continuous mean significantly #36

Closed
xuefeng-xu opened this issue Mar 6, 2023 · 10 comments · Fixed by #38

Comments

@xuefeng-xu

I ran the DPSGD example with sampling_probability=0.125 and got an error saying “Discrete mean differs from continuous mean significantly”. Could you please explain why that is?

from prv_accountant.dpsgd import DPSGDAccountant

accountant = DPSGDAccountant(
    noise_multiplier=0.8,
    sampling_probability=0.125,
    eps_error=0.1,
    delta_error=1e-10,
    max_steps=1000
)

eps_low, eps_estimate, eps_upper = accountant.compute_epsilon(delta=1e-6, num_steps=1000)
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    accountant = DPSGDAccountant(
  File "/Users/***/miniconda3/envs/torch/lib/python3.8/site-packages/prv_accountant/dpsgd.py", line 23, in __init__
    super().__init__(prvs=PoissonSubsampledGaussianMechanism(noise_multiplier=noise_multiplier,
  File "/Users/***/miniconda3/envs/torch/lib/python3.8/site-packages/prv_accountant/accountant.py", line 87, in __init__
    dprvs = [discretisers.CellCentred().discretise(tprv, domain) for tprv in tprvs]
  File "/Users/***/miniconda3/envs/torch/lib/python3.8/site-packages/prv_accountant/accountant.py", line 87, in <listcomp>
    dprvs = [discretisers.CellCentred().discretise(tprv, domain) for tprv in tprvs]
  File "/Users/***/miniconda3/envs/torch/lib/python3.8/site-packages/prv_accountant/discretisers.py", line 50, in discretise
    raise RuntimeError("Discrete mean differs from continuous mean significantly.")
RuntimeError: Discrete mean differs from continuous mean significantly.
@JohntyZhou

Me too. Has the original poster resolved this issue?

@wulu473
Contributor

wulu473 commented May 18, 2023

This typically happens when parameters are used that result in very large epsilons. If I run compute-dp-epsilon -s 0.8 -p 0.1 -d 1e-6 -i 1000, I already get epsilons larger than 40, which does not provide very meaningful privacy protection. Were these the parameters you were interested in specifically?
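For reference, a rough Python equivalent of that CLI call (my own sketch; the eps_error and delta_error values are copied from the snippet at the top of this issue and may not match the CLI defaults):

from prv_accountant.dpsgd import DPSGDAccountant

# Same parameters as the CLI call above: sigma = 0.8, sampling probability = 0.1,
# delta = 1e-6, 1000 compositions. The error tolerances are taken from the
# snippet at the top of this issue, not necessarily the CLI defaults.
accountant = DPSGDAccountant(
    noise_multiplier=0.8,
    sampling_probability=0.1,
    eps_error=0.1,
    delta_error=1e-10,
    max_steps=1000
)
eps_low, eps_estimate, eps_upper = accountant.compute_epsilon(delta=1e-6, num_steps=1000)
# Reported above to come out well over 40; depending on the tolerances this may
# instead hit the same RuntimeError discussed in this issue.
print(eps_estimate)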

@xuefeng-xu
Author

xuefeng-xu commented May 18, 2023

This typically happens when parameters are used that result in very large epsilons. If I run compute-dp-epsilon -s 0.8 -p 0.1 -d 1e-6 -i 1000 I already get epsilons larger than 40 which is not a very meaningful privacy protection. Were these the parameter you were interested in specifically?

@wulu473 No, I just randomly tested different values of sampling_probability to see the epsilons; the other params stay the same as in the readme. I wonder if there is a way to resolve this, i.e., to return the epsilons instead of throwing an error?

@jeandut

jeandut commented Sep 4, 2023

Hi!
I am running into the same issue, and the error message is not really helpful. If, as @wulu473 says, it's because epsilon is considered too loose, the message should state that precisely. Also, even if an epsilon of, say, 50 is not up to industry standards, Opacus should allow experimentation with such a large value.

@wulu473
Contributor

wulu473 commented Sep 4, 2023

Thanks for the feedback. I agree that the error message may be hard to understand without knowing the implementation. I can look into making the discretization step more robust for unrealistically high epsilons if this is something people are running into repeatedly. At the very least I'll update the error message.

@jeandut

jeandut commented Sep 14, 2023

Hi @wulu473, I posted this issue on Opacus (pytorch/opacus#604), which elaborates on my previous comment. I didn't know whether to post it here or there; I posted it on Opacus because that is how I was exposed to the problem. I hope you don't mind!

@Solosneros
Contributor

Solosneros commented Oct 3, 2023

@wulu473 We are currently looking into this a bit here in Helsinki because we face the error when computing the noise_multiplier with Opacus for few-shot models. The $(\epsilon, \delta)$ values that seem to cause the error are not unrealistically high. E.g., prv_accountant.dpsgd.DPSGDAccountant fails for a value that would correspond to $\epsilon \approx 6.8$ and $\delta = 10^{-5}$ with RDP (first row of prv_broken_values.csv). The prv_accountant version is 0.2.0.

I used the implementation below with eps_error and delta_error parameters that match those used in Opacus. (Not sure if they are reasonable, but I would think so.)

Code for reproducing:

from prv_accountant.dpsgd import DPSGDAccountant
import pandas as pd

# Each row of the CSV below is a (sigma, sample_rate, steps, delta) combination
# that triggers the error.
df = pd.read_csv("results/prv_broken_values.csv")
for i, row in df.iterrows():
    try:
        accountant = DPSGDAccountant(
            noise_multiplier=row["sigma"],
            sampling_probability=row["sample_rate"],
            eps_error=0.01,
            delta_error=row["delta"] / 1000,
            max_steps=int(row["steps"])
        )

        eps_low, eps_estimate, eps_upper = accountant.compute_epsilon(num_steps=int(row["steps"]), delta=row["delta"])
        print(eps_upper)
    except RuntimeError:
        # "Discrete mean differs from continuous mean significantly."
        print("Broken prv")

prv_broken_values.csv:

sigma,sample_rate,steps,delta,corresponding_rdp_epsilon
1.1190338134765625,0.111111,90,1e-05,6.799326796587654
1.462371826171875,0.125,320,1e-05,9.531460587058456
1.3625717163085938,0.125,320,1e-05,10.637274866862626
1.429534912109375,0.125,320,1e-05,9.869719941118808
1.3596649169921875,0.125,320,0.0001,9.475935768346865
1.4189605712890625,0.125,320,0.0001,8.856518572214307
1.348785400390625,0.125,320,0.0001,9.597854678608819
1.5253524780273438,0.125,320,1e-07,10.738462714495805
1.57373046875,0.125,320,1e-07,10.255346101802672
0.997283935546875,0.125,80,1e-05,8.94039379409982
0.9454460144042969,0.125,80,1e-05,9.884252821197466
0.9023284912109375,0.125,80,1e-05,10.815695963263234
0.8504867553710938,0.125,40,1e-05,9.015349608271817
0.8737449645996094,0.125,40,1e-05,8.548087958662755
0.8320808410644531,0.125,40,1e-05,9.418355574895706
0.7974414825439453,0.125,40,1e-05,10.253677328523128
0.7291364669799805,0.125,16,1e-05,8.788822485767234
0.6870651245117188,0.125,16,1e-05,9.90809677856695
0.7284049987792969,0.125,16,1e-05,8.80783162412548
0.6993370056152344,0.125,16,1e-05,9.571710266550044
1.462371826171875,0.125,320,1e-05,9.531460587058456
1.3625717163085938,0.125,320,1e-05,10.637274866862626
1.429534912109375,0.125,320,1e-05,9.869719941118808
0.997283935546875,0.125,80,1e-05,8.94039379409982
0.9454460144042969,0.125,80,1e-05,9.884252821197466
0.9023284912109375,0.125,80,1e-05,10.815695963263234
0.8504867553710938,0.125,40,1e-05,9.015349608271817
0.8737449645996094,0.125,40,1e-05,8.548087958662755
0.8320808410644531,0.125,40,1e-05,9.418355574895706
0.7974414825439453,0.125,40,1e-05,10.253677328523128
1.0302276611328125,0.111111,90,1e-05,7.905717689020909
1.0704879760742188,0.111111,90,1e-05,7.366555785990796
1.0301132202148438,0.111111,90,1e-05,7.90732222804079

Hope the information is sufficient. We will try to have a look and see if we can contribute a fix.

Best

@Solosneros
Contributor

Hi @wulu473,

(Disclaimer: I just debugged the issue and am an expert neither on your implementation nor on privacy accounting.)

Observation

I debugged the code and at some point arrived at the mean() function of the PrivacyRandomVariableTruncated class. The grid (the points variable) used to compute the mean is constant apart from the lowest point (self.t_min) and the highest point (self.t_max). It looks like this: [self.t_min, -0.1, -0.01, -0.001, -0.0001, -1e-05, 1e-05, 0.0001, 0.001, 0.01, 0.1, self.t_max].

It seems that t_min and t_max are of the order of [-12, 12] for the examples I posted above, and even up to [-48, 48] for the example that @jeandut posted in the Opacus issue, whereas they are more like [-7, 7] for the DP-SGD readme example.

We suspect that the integration breaks down when the grid spacing between t_min and t_max gets too large.
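
To make the gap concrete, here is a quick standalone sketch of that fixed grid (my paraphrase of the behaviour described above, not the library's exact code) for a domain like the failing examples:

import numpy as np

# Fixed grid as described above, for t_min = -12, t_max = 12 (roughly the
# domain of the failing examples). Only the endpoints depend on the domain.
t_min, t_max = -12.0, 12.0
points = np.concatenate([
    [t_min],
    -np.logspace(start=-1, stop=-5, num=5),   # -0.1 ... -1e-05
    np.logspace(start=-5, stop=-1, num=5),    #  1e-05 ... 0.1
    [t_max],
])
print(points)
print(np.diff(points))  # the first and last cells span ~11.9, far wider than all others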

Proposed solution

Determine the points grid based on t_min and t_max, e.g., using the implementation below, which is inspired by the Opacus implementation but determines the start and end of the logspace based on t_min and t_max.

# Place the logspace endpoints at the magnitude of t_min / t_max instead of at a
# fixed 0.1, so the grid reaches out towards the domain boundaries.
lower_exponent = int(np.log10(np.abs(self.t_min)))
upper_exponent = int(np.log10(self.t_max))
points = np.concatenate([[self.t_min], -np.logspace(start=lower_exponent, stop=-5, num=10), [0],
                         np.logspace(start=upper_exponent, stop=-5, num=10)[::-1], [self.t_max]])

If I run this, I no longer get the error, and the epsilon for the DP-SGD readme example is identical.
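
For illustration (my own quick check, not part of the proposed patch), the same [-12, 12] domain with the proposed construction gives a grid whose outer cells are much narrower:

import numpy as np

# Proposed construction from above, evaluated outside the class for t_min = -12, t_max = 12.
t_min, t_max = -12.0, 12.0
lower_exponent = int(np.log10(np.abs(t_min)))   # 1
upper_exponent = int(np.log10(t_max))           # 1
points = np.concatenate([[t_min], -np.logspace(start=lower_exponent, stop=-5, num=10), [0],
                         np.logspace(start=upper_exponent, stop=-5, num=10)[::-1], [t_max]])
print(points)                 # [-12, -10, -2.15, ..., -1e-05, 0, 1e-05, ..., 2.15, 10, 12]
print(np.diff(points).max())  # widest cell is now about 7.8, down from ~11.9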

Question
Is this a harmless fix, or is there some theory behind the PRV accountant that speaks against extending the grid here?

@wulu473
Contributor

wulu473 commented Oct 5, 2023

Thanks for looking into this. The proposed solution seems very sensible. Initially, I had chosen points that give a robust solution for most realistic cases, but it seems there are some cases where they are insufficient. In general, adding any additional points is safe and won't affect the robustness negatively.

If you have this fix ready and it's not too much trouble, any contribution via a PR would be appreciated.

@Solosneros
Contributor

Great!

Please have a look at #38.

Btw, I just saw that #35 is still open. It would be great if you could close or merge it.

wulu473 linked a pull request Oct 9, 2023 that will close this issue