-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
300 remember all previously verified configurations #665
base: main
Are you sure you want to change the base?
300 remember all previously verified configurations #665
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pull request with tests, @steinybot.
Some questions:
- What does using Git's blob storage ultimately get you?
- Will these blobs always remain local? (i.e. can they be pushed to the remote? If they could, that would be bad as an attacker could push fake verified blobs that are automatically accepted by others)
- When are unused blobs garbage collected?
- How would one "unverify" a configuration with this scheme?
In short, it would be great to understand why we're using blob storage when a file under the .git
directory would suffice and not raise the above questions. This justification should be included in the PR description.
See #505 for the previous attempt at this feature. Excited for us to proceed if we can have satisfactory answers to the above questions.
# | ||
# @param arg [Hash] | ||
# @return [String] | ||
def sign_signatures(arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto on comment about arg name.
end | ||
|
||
# Returns the object signature for some blob. | ||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing doc for the argument type.
# | ||
# @param arg [String] | ||
# @return [String] | ||
def sign(arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could there be a more descriptive name than the generic term arg
?
The reason was to solve the "quick lookup" requirement since the time taken to lookup a git object is same as retrieving a file. That depends on the file system but in the worst case it is no worse than how well git itself is performing. Having it as git objects also makes it possible to audit signatures, although in its current state this doesn't work very well as there needs to be a tree object to group them all together. If we do it right (i.e. using a tree) then there would be a trivial way of seeing all the overcommit hooks and configuration without any other files and also do diffs between them without including changes from the rest of the repository. Git is going to move to SHA-256 at some point so this would simplify things. I also wasn't really a big fan of putting non-git stuff in the
From a normal users perspective yes they would be local since there wouldn't be a refspec that points to them so they wouldn't be pushed BUT now I realise that a malicious attacker could intentionally make these signature objects reachable from a refspec (i.e. branch or tag) and then a fetch would retrieve them. This is a serious problem.
That's something I hadn't considered. Having unreachable objects is not a good idea.
With the current implementation then it would be to run
Clearly there are some things that I regrettably overlooked. The current problems are:
To solve 1 I think we should be using proper digital signatures rather than just hashes, something like Ed25519. With this we would generate a private key and store that in Having public signatures means that it wouldn't be a big step to start using something like PGP to create a web of trust. In my opinion having everyone on a team being responsible for reviewing every single config change is a bit of a nightmare and usually just results in everyone signing it without checking. We could add a new ref like how git-notes work, something like This doesn't quite solve 2 but that could also just be a new ref under The downside of using the refs in this way is that the tree grows as the number of signatures grow. I'm not sure if git notes has this problem but it would seem like it would. We could always limit the branching factor on the tree if it becomes a problem. Thoughts? |
Thanks for your reply. Apologies for the delay in response, as I am traveling. If I simplify what you're proposing down to to its core elements, it appears you are asking:
These still seem more complicated than a simple file of SHA256 signatures, for benefits that don't seem to have strong justification. With respect to 1: a web of trust model adds complexity and is still considered broken. With respect to 2: I've never used My personal experience running this project is that additional complexity needs to have strong reasoning in order to be included. Revoking a signed configuration should be simple (delete a line). Revoking all signed configurations should be simple (delete a file). If I've mischaracterized these suggestions, please correct me. At this point in time, I am still strongly in favor of someone taking #505 to completion. |
Initial attempt at solving #300.
This works by creating git blobs for each part of the signature and then taking each object id (SHA1) and pairing that with the hash (SHA256) of the contents of that blob. Those pairs are then written to another blob, delimited by a space, and one pair per line.
When checking if a signature has been verified then we compare the contents of the signature blob and not just the hash.
We can't just rely on the object id (SHA1) as collision attacks are feasible. A malicious repo owner could use such an attack to create an object with the same SHA1 as the configuration and hook contents.
I have kept the
overcommit.configuration.signature
in the git config for now but I don't think it is necessary. Removing it would loose the ability to show theNo previously recorded signature for configuration file
error but that doesn't seem like a huge loss. It may still be useful for change detection which is what the previoussignature_changed?
method did but nothing actually used it for that purpose, it was for signature verification instead.I haven't tried using this locally yet so it could do with some manual testing before merging.
There are quite a few tests that fail for me but they also fail on master. Will see what the tests do on this PR.
Let me know what you think.