fix: make prv accountant robust to larger epsilons #606

Solosneros · 2023-10-10T05:06:28Z

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

Hi,

this PR fixes #601 and #604.

It introduces the same fix as microsoft/prv_accountant#38. Lukas (author of prv accountant, @wulu473) said that, in general, adding any additional points is safe and won't affect the robustness negatively.

The cause of these errors seems to be the grid for computing the mean() function of the PrivacyRandomVariableTruncated class. The grid (points variable) used to compute the mean is constant apart from the lowest (self.t_min) and highest point (self.t_max).

This PR determines the grid (points variable) based on the lowest and highest point. More information is below.

Best

Observation

While debugging, I ended up at the mean() function of the PrivacyRandomVariableTruncated class. The grid (points variable) used to compute the mean is constant apart from the lowest (self.t_min) and highest point (self.t_max). The relevant lines are quoted under "Before" below. The grid looks like this: [self.t_min, -0.1, -0.01, -0.001, -0.0001, -1e-05, 1e-05, 0.0001, 0.001, 0.01, 0.1, self.t_max].

It seems that t_min and t_max are of the order of [-12, 12] for the examples that I posted above and even up to [-48, 48] for the example that @jeandut posted in issue #604, whereas they are closer to [-7, 7] for the README example for DP-SGD.

We suspect that the integration breaks down when the grid spacing between t_min / t_max and the neighbouring grid points gets too large.
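To make the suspicion concrete, here is a small sketch that rebuilds the old grid outside of the class and measures its largest gap. The helper name old_grid and the symmetric endpoints are assumptions for illustration only; the t values are the rough magnitudes mentioned above.

```python
import numpy as np


def old_grid(t_min, t_max):
    """Rebuild the pre-fix grid: fixed inner points, only the endpoints vary."""
    return np.concatenate(
        [
            [t_min],
            -np.logspace(-5, -1, 5)[::-1],
            np.logspace(-5, -1, 5),
            [t_max],
        ]
    )


# Rough endpoint magnitudes from the README example, #601 and #604.
for t in (7.0, 12.0, 48.0):
    points = old_grid(-t, t)
    largest_gap = np.diff(points).max()
    print(f"t_max={t}: {len(points)} points, largest gap {largest_gap:.1f}")
```

Because the inner points stop at ±0.1, the outermost interval is always t_max - 0.1 wide, so it grows linearly with the endpoints while the number of points stays at 12.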

Proposed solution

Build the points grid from t_min and t_max: the start and end exponents of the logspace are derived from t_min and t_max instead of being fixed.

Before (opacus/opacus/accountants/analysis/prv/prvs.py, lines 99 to 106 in 95df090):

points = np.concatenate(
    [
        [self.t_min],
        -np.logspace(-5, -1, 5)[::-1],
        np.logspace(-5, -1, 5),
        [self.t_max],
    ]
)

After:

# determine points based on t_min and t_max
lower_exponent = int(np.log10(np.abs(self.t_min)))
upper_exponent = int(np.log10(self.t_max))
points = np.concatenate(
    [
        [self.t_min],
        -np.logspace(start=lower_exponent, stop=-5, num=10),
        [0],
        np.logspace(start=-5, stop=upper_exponent, num=10),
        [self.t_max],
    ]
)
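
As a quick sanity check of the "After" snippet, the sketch below wraps the same expressions in a standalone function and inspects the resulting grid. The function name new_grid and the test endpoints are assumptions for illustration; the sketch only covers endpoints with magnitude above 1 that are not exact powers of ten, which matches the values reported in this PR.

```python
import numpy as np


def new_grid(t_min, t_max):
    """Standalone copy of the proposed grid construction (hypothetical helper)."""
    # int() truncates toward zero, so e.g. log10(12) ~ 1.08 becomes 1.
    lower_exponent = int(np.log10(np.abs(t_min)))
    upper_exponent = int(np.log10(t_max))
    return np.concatenate(
        [
            [t_min],
            -np.logspace(start=lower_exponent, stop=-5, num=10),
            [0],
            np.logspace(start=-5, stop=upper_exponent, num=10),
            [t_max],
        ]
    )


for t in (7.0, 12.0, 48.0):
    points = new_grid(-t, t)
    # The grid must be strictly increasing to be usable as integration nodes.
    assert np.all(np.diff(points) > 0)
    print(f"t_max={t}: {len(points)} points, outermost log point {points[-2]:g}")
```

The grid now has 23 points, includes 0, and its log-spaced part reaches the order of magnitude of the endpoints instead of stopping at ±0.1.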

How Has This Been Tested (if it applies)

I ran the examples from issues #601 and #604, and they no longer fail.

import opacus
target_delta = 0.001
target_epsilon = 20
steps = 5000
sample_rate=0.19120458891013384

for target_epsilon in [20, 50]:
    noise_multiplier = opacus.privacy_engine.get_noise_multiplier(target_delta=target_delta, target_epsilon=target_epsilon, steps=steps, sample_rate=sample_rate, accountant="prv")
    prv_accountant = opacus.accountants.utils.create_accountant("prv")
    prv_accountant.history = [(noise_multiplier, sample_rate, steps)]
    obtained_epsilon = prv_accountant.get_epsilon(delta=target_delta)
    print(f"target epsilon {target_epsilon}, obtained epsilon {obtained_epsilon}")

target epsilon 20, obtained epsilon 19.999332284974717
target epsilon 50, obtained epsilon 49.99460075990896

target_epsilon = 4
batch_size = 50
epochs = 5
delta = 1e-05
expected_len_dataloader = 500 // batch_size
sample_rate = 1/expected_len_dataloader


noise_multiplier = opacus.privacy_engine.get_noise_multiplier(target_delta=target_delta, target_epsilon=target_epsilon, epochs=epochs, sample_rate=sample_rate, accountant="prv")
prv_accountant = opacus.accountants.utils.create_accountant("prv")
prv_accountant.history = [(noise_multiplier, sample_rate, int(epochs / sample_rate))]
obtained_epsilon = prv_accountant.get_epsilon(delta=target_delta)
print(f"target epsilon {target_epsilon}, obtained epsilon {obtained_epsilon}")

target epsilon 4, obtained epsilon 3.9968389923130356
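
For reference, the relative gap between target and obtained epsilon can be computed from the printed values. This is only arithmetic on the numbers above; the helper name relative_gap is an assumption for illustration.

```python
# (target epsilon, obtained epsilon) pairs copied from the outputs above.
results = [
    (20, 19.999332284974717),
    (50, 49.99460075990896),
    (4, 3.9968389923130356),
]


def relative_gap(target, obtained):
    """Fraction by which the obtained epsilon falls short of the target."""
    return (target - obtained) / target


for target, obtained in results:
    print(f"target {target}: relative gap {relative_gap(target, obtained):.2e}")
```

In all three cases the obtained epsilon is below the target by less than 0.1%.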

Checklist

The documentation is up-to-date with the changes I made.
I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
All tests passed, and additional code has been covered with new tests.

I was not able to run all tests locally and am unsure whether new tests should be added.

facebook-github-bot · 2023-10-10T05:06:45Z

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-11-27T18:51:46Z

@Solosneros has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2023-11-27T18:51:56Z

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-11-28T16:19:50Z

This pull request has been merged in ad084da.

fix: determine points for prv mean computation based on inputs

e26140a

facebook-github-bot added the CLA Signed label Oct 10, 2023

This was referenced Oct 10, 2023

Allowing users to use large target_epsilon for debugging and research #604

Closed

PRV Accountant fails for specific input values for its args #601

Closed

Merge branch 'main' into fix_prv_mean_computation

b442f9c

facebook-github-bot closed this in ad084da Nov 28, 2023

facebook-github-bot added the Merged label Nov 28, 2023

Solosneros deleted the fix_prv_mean_computation branch November 28, 2023 16:36
