add section on Event Labeling Queries #70
A few quick notes:
cc @badih
Yeah, that would be a nice option. I actually started out trying to do it that way, but ran into some issues. Once you pass the cap you still have to return something for that source row. For rows that exceed the cap you can't give back a "not labeled" indicator, because that can leak information. I think you could try to return just a random label, maybe by setting it to 0 and then flipping with the same flip probability as other rows, but it wasn't immediately clear to me that this wouldn't also increase the DP bound on how much information is revealed about that user. It also wasn't clear to me whether random labels on some events would be worse for utility than scaling all the noise. I think we can probably make it work, but like I said, this was just a quick first draft to get started.
I think this should be fine for a DP bound, and the argument will look really similar to sensitivity capping + Laplace with per-event queries. Consider two databases D and D' that are adjacent. After rate-limiting to sensitivity That is, we can analyze each of the
Updated the doc to include the version that caps the number of events labeled per matchkey.
There has been considerable interest in supporting event-level outputs in IPA (IPA issue 60, PATCG issue 41). This was discussed at the May 2023 PATCG meeting, where the consensus was that we could support this so long as we can enforce a per-user bound on the information released.
Here we outline a first draft of how an Event Labeling Query, which labels events with noisy labels, can be done in a way that lets us maintain a per-matchkey DP bound. We also consider how these new queries can be compatible with an IPA system that flexibly supports either aggregation queries or event labeling queries.
w/ @benjaminsavage
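To make the idea discussed in this thread concrete, here is a minimal sketch of capped noisy labeling in plain Python. It is not the protocol from the doc and not an MPC implementation. It assumes binary labels, randomized response as the noise mechanism, and that an over-cap event is set to 0 and then flipped with the same probability as every other row (the option floated in the conversation). The names `flip_probability`, `label_events`, `cap`, and `epsilon_per_matchkey` are hypothetical. The budget split `epsilon_per_matchkey / cap` relies on basic composition and assumes adjacent databases differ only in the labels of one matchkey's events.

```python
import math
import random
from collections import defaultdict


def flip_probability(epsilon_per_event):
    """Randomized-response flip probability p satisfying ln((1 - p) / p) = epsilon_per_event."""
    return 1.0 / (1.0 + math.exp(epsilon_per_event))


def label_events(events, cap, epsilon_per_matchkey, rng=None):
    """Return one noisy binary label per event, in input order.

    events: list of (matchkey, true_label) pairs with true_label in {0, 1}.
    cap: maximum number of events per matchkey whose true label is used (assumption).
    epsilon_per_matchkey: total budget per matchkey, split evenly across the cap.
    """
    if rng is None:
        rng = random.Random()
    p = flip_probability(epsilon_per_matchkey / cap)
    seen = defaultdict(int)
    noisy = []
    for matchkey, true_label in events:
        seen[matchkey] += 1
        # Over-cap events still get an output row, but the true label is
        # replaced by 0 so the row carries no real signal before flipping.
        label = true_label if seen[matchkey] <= cap else 0
        if rng.random() < p:
            label = 1 - label
        noisy.append(label)
    return noisy
```

Every input event yields exactly one output label, so the output never reveals which rows exceeded the cap. Whether this gives better utility than scaling the noise on all rows is the open question raised above; the sketch does not settle it.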