Large image size #107

Open
reavessm opened this issue Jul 20, 2021 · 7 comments

Comments

@reavessm

I don't know if this is the right place because this isn't really an issue, but more of a question.

Why do the Gentoo containers have such a large image size? Currently, latest on amd64 is 287.76 MB, while the Fedora image is 58.39 MB and Ubuntu is down to 22.95 MB. I'm seeing that /usr/libexec/gcc takes 111 MB, and I understand that it wouldn't be Gentoo without GCC, but is there anywhere else to trim some fat?
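
For anyone who wants to reproduce this kind of size breakdown, here is a minimal sketch using du and sort. It assumes a du that supports -d (GNU coreutils and BusyBox both do), and the helper name largest_dirs is made up for illustration; it is not part of any Gentoo tooling.

```shell
#!/bin/sh
# largest_dirs DIR [COUNT]: print the COUNT largest directories (default 10)
# found up to two levels below DIR, sizes in KiB, biggest first.
# Hypothetical helper, for illustration only.
largest_dirs() {
    dir=$1
    count=${2:-10}
    # -x stays on one filesystem, -k reports KiB, -d 2 limits the depth
    du -x -k -d 2 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Inside the container you might run: largest_dirs /usr 15
```

Without a shell in the container, something like `podman run --rm gentoo/stage3 du -x -k -d 2 /usr | sort -rn | head` should give a similar picture, with the sorting done on the host.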

@ultrabug
Collaborator

As you can see, we stick with the official tarballs, which are meant to provide an environment from which you can build and install Gentoo Linux. I guess that's why.

@KSmanis
Contributor

KSmanis commented Jul 28, 2021

AFAIK other distros slim down their Docker images by removing cruft such as unnecessary packages, man pages, etc. We could definitely apply something similar here, but it would probably require guidance from a Gentoo Developer, i.e., someone with a good understanding of the stage3 tarball structure and what is required for a working container.

@nnzv

nnzv commented Nov 21, 2023

You can delete unnecessary packages, like cmake, that the software doesn't need to run. Force unmerge with:

emerge -W --rage-clean <foo>

Deleting directories like /var/db is okay as users aren't expected to enter there.

rm -rf /var/db # also /var/cache/distfiles /var/tmp/portage /usr/share/{doc,man} /var/cache/binpkgs 

For more details, check https://wiki.gentoo.org/wiki/Knowledge_Base:Freeing_disk_space.

@ajakk
Member

ajakk commented Nov 21, 2023

> You can delete unnecessary packages, like cmake, that the software doesn't need to run. Force unmerge with:

We shouldn't remove build dependencies, because it's just going to make it harder to install things downstream.

> Deleting directories like /var/db is okay as users aren't expected to enter there.

No, it isn't, because that breaks your Gentoo installation by wiping out the Portage state stored in /var/db/pkg. I think it might make sense to remove the documentation and manpages, since together they approach a third of the total image size. As for the other directories, they're empty:

$ podman run -it gentoo/stage3 find /var/cache/distfiles /var/tmp/portage /var/cache/binpkgs
/var/cache/distfiles
find: '/var/tmp/portage': No such file or directory
/var/cache/binpkgs
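
To check how much of an image the documentation and manpages actually account for, a rough sketch along these lines could be run inside the container. The function names (size_kib, pct, report_doc_share) are invented for this example, and the result depends on the image tag and architecture.

```shell
#!/bin/sh
# size_kib PATH...: total size in KiB of the given paths (missing ones are skipped).
size_kib() {
    du -x -s -k "$@" 2>/dev/null | awk '{ sum += $1 } END { print sum + 0 }'
}

# pct PART WHOLE: integer percentage of PART in WHOLE, or 0 if WHOLE is 0.
pct() {
    awk -v p="$1" -v w="$2" 'BEGIN { if (w > 0) printf "%d\n", p * 100 / w; else print 0 }'
}

# report_doc_share ROOT: print what share of ROOT is taken by usr/share/{doc,man}.
report_doc_share() {
    root=${1%/}   # strip a trailing slash, so "/" becomes ""
    docs=$(size_kib "$root/usr/share/doc" "$root/usr/share/man")
    total=$(size_kib "${root:-/}")
    echo "doc+man: ${docs} KiB of ${total} KiB ($(pct "$docs" "$total")%)"
}

# Inside the container you might run: report_doc_share /
```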

@nnzv

nnzv commented Nov 21, 2023

> We shouldn't remove build dependencies, because it's just going to make it harder to install things downstream.

In a regular Gentoo system, we shouldn't. In containers, it doesn't make much sense to keep build dependencies (excluding run dependencies like Ruby) if you've already compiled the software. Obviously, this is to achieve a tiny Docker image.

RUN emerge foo && emerge -W --rage-clean foo-dependency

> No, it isn't, because that breaks your Gentoo installation by wiping out the Portage state stored in /var/db/pkg

Yeah, my bad, the fat thing is /var/db/repos/gentoo. In the rare case where you open a shell session in the container and want to restore the ebuild repository, you can simply do:

emaint sync -r gentoo

At least, that's what works for me. Regarding the "empty" directories, it depends on the thing you emerge to the system; a fresh container doesn't have many things to delete.

@berney

berney commented Jun 14, 2024

The gentoo/portage and gentoo/stage3 images are effectively just Docker versions of the tarballs you can download from the normal Gentoo distribution channels: a way to distribute the same thing via Docker registries.
They serve as the same building blocks for making a Gentoo distro in a Docker container as the tarballs do for a virtual machine or physical host.

For the end image that you want to run, say a web server like nginx, ideally it would be a single nginx binary: distroless, no gcc, no emerge, no bash, etc.
For that you can either use a multistage build similar to Arzano's page linked in #107 (comment).

Or use Kubler:

Kubler is a build tool that uses Gentoo to build packages, and creates a docker image with just the packages - the final image does not have portage, emerge, the rest of the file system - it just has the packages and whatever you explicitly created.

If you want an official slimmed Gentoo Docker image that still has emerge etc. but doesn't have the manpages etc., that should be a separate image like gentoo/gentoo (or something like that).
I think the gentoo/portage and gentoo/stage3 images should map to the upstream tarballs. If the tarballs have the manpages, the Docker images should too.
So if the manpages are to go, the tarballs should drop them, and then the Docker images won't have them either.
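
As a rough illustration of the multistage approach mentioned above, the script below writes out a Dockerfile sketch. Treat it as an untested outline under stated assumptions rather than a recommended recipe: the file name Dockerfile.slim, the /out directory, the nginx-slim tag and the choice of www-servers/nginx are all placeholders, and it assumes that installing into an alternate root with emerge --root brings in the runtime dependencies the package needs. No entrypoint is set, since that depends on the package.

```shell
#!/bin/sh
# Write a multistage Dockerfile sketch to Dockerfile.slim in the current directory.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > Dockerfile.slim <<'EOF'
# Stage 1: the ebuild repository, taken from the gentoo/portage image
FROM gentoo/portage:latest AS portage

# Stage 2: the full stage3 toolchain, used only for building
FROM gentoo/stage3:latest AS builder
COPY --from=portage /var/db/repos/gentoo /var/db/repos/gentoo
# Assumption: installing into the alternate root /out pulls in the package
# together with its runtime dependencies; the package is only an example
RUN emerge --quiet --root=/out www-servers/nginx

# Stage 3: the final image holds only what was installed under /out
FROM scratch
COPY --from=builder /out /
EOF

# Build with, for example: docker build -f Dockerfile.slim -t nginx-slim .
```

The build toolchain, Portage and the ebuild repository stay in the earlier stages, so the gentoo/stage3 image itself can keep mirroring the upstream tarball.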
