-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correctly compute UncompressedSize on zstd:chunked pull, don’t set it on estargz #2130
Conversation
Cc: @giuseppe . This should fix containers/skopeo#2437 (comment) . |
I picked this into my pet pr, and it works slightly better, but now it breaks in the podman additional-store test:
Basically, $ hack/bats -T --root 010
...
same failure, down to the same hashes |
Skopeo is involved, though (for the copy to the store dir), and I haven't rebuilt skopeo, so it's possible that the bad thing is happening in that step. |
Thanks! I can‘t see how the two-ID output can happen. I’ll set up a VM and test (a better version of) this PR, as well as those tests, tomorrow. |
PR updated to compute the size correctly for zstd:chunked with tar-split, manually tested per containers/skopeo#2437 (comment) (and inspecting the created layer metadata). Ready for review. |
Testing that on a slightly unclean VM, Podman commit 2aacd4e212525db4ee06be8e44e4405400d4df9d + this c/storage fix passes 010 successfully in And, on the command line, |
Scratch that last part, that’s not invoking the tested version. |
Ugh. The podman-pause thing is my fault, I couldn't find a good solution to a nasty problem. Please try this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about adding the uncompressed size to the ToC? Seems like it'd possibly simplify things?
This looks sane to me but I only gave it a superficial review.
pkg/chunked/compression_linux.go
Outdated
@@ -288,6 +288,36 @@ func ensureTOCMatchesTarSplit(toc *internal.TOC, tarSplit []byte) error { | |||
return nil | |||
} | |||
|
|||
// tarSizeFromTarSplit computes the total tarball size, computing only tarSplit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// tarSizeFromTarSplit computes the total tarball size, computing only tarSplit | |
// tarSizeFromTarSplit computes the total tarball size using only the tarSplit metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I don’t know what I was thinking.
case storage.FileType: | ||
// entry.Size is the “logical size”, which might not be the physical size for sparse entries; | ||
// but the way tar-split/tar/asm.WriteOutputTarStream combines FileType entries and returned files contents, | ||
// sparse files are not supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we don't error out today?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we should be erroring out, that’s something that should happen at tar-split construction.
The physical size outright isn’t available in the tar-split format, so here this consumer is only documenting the impact of that assumption.
(As is, sparse files are barely supported by archive/tar or the tar-split fork: they are indicated either by a special file type, or as a regular file a special PAX record; and there is no API to expose the data / hole segments. Also, c/storage/pkg/archive
does not special-case them at all.)
Anyway, I’m open to adding a comment to the |
The current value obtained by summing the sizes of regular file contents does not match the size of the uncompressed layer tarball. We don't have a convenient source to compute the correct size for estargz without pulling the full layer and defeating the point; so we must allow for the size being unknown. For recent zstd:chunked images, we have the full tar-split, so we can compute the correct size; that will happen in the following commits. Signed-off-by: Miloslav Trmač <[email protected]>
Empty tar-split shouldn't ever happen, but being precise here doesn't hurt. Signed-off-by: Miloslav Trmač <[email protected]>
Signed-off-by: Miloslav Trmač <[email protected]>
On a VM manually building things: All of:
pass Similarly, containers/podman@9d8e3b0 passes Either way, I haven’t been able to reproduce this:
I have now filed containers/podman#24265 to exercise that in the usual environment. Is there anything else I should be looking at? |
> go mod edit -replace github.com/containers/storage=github.com/mtrmac/storage@chunked-size > go mod tidy && go mod vendor + a HACK to override the bloat check Signed-off-by: Miloslav Trmač <[email protected]>
@mtrmac please forgive me, I've switched gears for today and have commitments I must attend to. The crucial element I don't see in your comment is running the full CI suite using |
Thanks, I was looking for references to |
Thank you. That's really pretty huge. |
/lgtm |
// - If UncompressedDigest is set, this must be set to a valid value. | ||
// - Otherwise, if TOCDigest is set, this is either valid or -1. | ||
// - If neither of this digests is set, this should be treated as if it were | ||
// an uninitialized value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need indentation for list items (same as for code blocks, i.e. an extra space).
// - If UncompressedDigest is set, this must be set to a valid value. | |
// - Otherwise, if TOCDigest is set, this is either valid or -1. | |
// - If neither of this digests is set, this should be treated as if it were | |
// an uninitialized value. | |
// - If UncompressedDigest is set, this must be set to a valid value. | |
// - Otherwise, if TOCDigest is set, this is either valid or -1. | |
// - If neither of this digests is set, this should be treated as if it were | |
// an uninitialized value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, gofmt
didn’t want to help with this (and struct field comments are formatted in HTML just as an uninterpreted code block), so I punted.
You’re right, this is the right thing to do. Fixed in #2136 .
if tarSplit != nil { | ||
uncompressedTarSize, err = tarSizeFromTarSplit(tarSplit) | ||
if err != nil { | ||
return nil, fmt.Errorf("computing size from tar-split") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: errors.New
, or fmt.Errorf("computing site from tar-split: %w", err)
, or just err
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Fixed in #2136 .
Note to self (and in case it is useful to others): Commit “DO NOT MERGE: Test with a zstd:chunked testimage” in containers/podman#24287 . |
Is there more info on that? Sounds worth adding to |
@cgwalters That’s not directly a c/storage property; it’s a c/image choice of a a deterministic image ID, in https://github.com/containers/image/blob/cba49408c0ea237a6aa6dba5e81b74f4a8f23480/storage/storage_dest.go#L671-L684 . Yes, we might eventually need a user-facing explanation; I’m not sure where is a good place for it, the internal c/* projects are not too likely to be known to users. Maybe a Podman blog post. |
Signed-off-by: Miloslav Trmač <[email protected]>
The current value obtained by summing the sizes of regular file contents does not match the size of the uncompressed layer tarball.
We don't have a convenient source to compute the correct size for estargz without pulling the full layer and defeating the point.
For recent zstd:chunked images, we have the full tar-split, so we can compute the correct size;
for now, this doesn't do that. That might slow down image size computation.Absolutely untested, and we probably do want the tar-split-based computation to happen.