Figure a better builder_cache architecture. #28

Open
ashpect opened this issue Aug 19, 2024 · 5 comments

Comments

Contributor

ashpect commented Aug 19, 2024

Normally, BuildKit persists its cache in /var/lib/buildkit inside the container or system where it runs.
If you run it as a privileged Docker container, for example, it mounts the containerd layer storage and stores the cache there.
However, our use case requires BuildKit to run as an ephemeral container that is started whenever we want to build an image.
To get a persistent cache, we need to store the cache somewhere outside that container.

  1. We could use the --export-cache and --import-cache flags to point the build at a registry. This works, but cache pushes may rewrite what is already stored in the registry. View the discussion here.
    Even so, this drastically reduces build time, since the base layers are the same in every build. The problem shows up in a sequence like this: I build runtime py node, then runtime node rb; if we then build runtime py go, we won't fetch py's cache.
  2. MetaCall supports a known, limited set of languages, so after initialisation we could first run one build with every language (runtime lang1 lang2 ...) and export it to the registry. All subsequent builds would only import from the registry and never export to it, which prevents rewrites.
  3. If the longer initial build in option 2 is not acceptable, we could upload the layers once to Docker Hub or another hosted registry and not worry about the first build at all; builds in your k8s setup would simply import from there. However, this won't work in an air-gapped environment, so keeping option 2 as the fallback seems logical.
  4. We could mount a volume into the registry or another container, store the layers there as artifacts, and give BuildKit the cache path. We would need to check whether rewrites happen in this setup too and, if not, whether it is feasible. This needs experimentation.
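
The rewrite problem from option 1, and why option 2 avoids it, can be illustrated with a toy model. This is only a sketch: it assumes that each export replaces whatever was stored under the cache ref, which is the behaviour the issue is worried about, and `CacheRef` and `build` are hypothetical names, not part of BuildKit or builder.

```python
# Toy model: a registry cache ref holds only what the most recent export wrote.
# Assumed simplification for illustration, not BuildKit's actual storage logic.


class CacheRef:
    def __init__(self):
        self.layers = set()

    def export(self, langs):
        # Each export replaces whatever was stored under the ref before.
        self.layers = {"base", *langs}

    def hits(self, langs):
        # Languages in this build whose layers can be imported from the ref.
        return sorted(set(langs) & self.layers)


def build(ref, langs, export=True):
    """Return the languages served from cache, then optionally export."""
    reused = ref.hits(langs)
    if export:
        ref.export(langs)
    return reused


# Option 1: every build exports, so later builds overwrite earlier caches.
ref = CacheRef()
build(ref, ["py", "node"])
build(ref, ["node", "rb"])
lost = build(ref, ["py", "go"])  # py was overwritten by the second build

# Option 2: export once with all languages, then only import.
seeded = CacheRef()
build(seeded, ["py", "node", "rb", "go"])
build(seeded, ["node", "rb"], export=False)
kept = build(seeded, ["py", "go"], export=False)

print(lost)  # []
print(kept)  # ['go', 'py']
```

In this model the third build under option 1 gets no language cache at all, while the import-only builds under option 2 keep hitting the seeded cache.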
Contributor Author

ashpect commented Aug 19, 2024

The best option seems to be a combination of the 1st, 2nd and 3rd.
We could do this:

  1. On each push to master in metacall/core, we set up a pipeline that runs builder for all images and pushes them to a hosted registry.
  2. Whenever you need an image for a set of languages, simply install and run builder:
    By default, we pull from the registry.
  3. We provide a flag, --air-gapped (or --local-registry, which is more user friendly), that builds the image as in the 1st option, with a normal cache export and import on every build.
    Why not the 2nd? Because building all languages is a heavy overhead unless we are doing recurring builds. It is only justified if builds are repeated and tested for many languages.
    That is only needed from a dev point of view, so we can run
    builder runtime --all-langs, which builds all languages and exports them, and we also provide a flag, as in builder runtime py node --no-export, which skips the export so it does not overwrite our cache, and we can continue building with other languages.
    This gives us the best of all 3 possibilities.
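
As a rough sketch of how the export/import switches could translate into BuildKit arguments: `--import-cache` and `--export-cache` with `type=registry` are existing buildctl options, while the helper name `cache_flags` and the repository name `builder-cache` are assumptions made for this example.

```python
def cache_flags(ref, import_cache=True, export_cache=True):
    """Build the buildctl cache arguments for a registry cache ref.

    `ref` is a registry reference such as "localhost:5000/builder-cache"
    (the repository name is a placeholder, not one used by the project).
    """
    flags = []
    if import_cache:
        flags += ["--import-cache", f"type=registry,ref={ref}"]
    if export_cache:
        # mode=max also exports intermediate layers, not only the final ones.
        flags += ["--export-cache", f"type=registry,ref={ref},mode=max"]
    return flags


# Normal local-registry build: import and export on every build.
print(cache_flags("localhost:5000/builder-cache"))

# Equivalent of --no-export: read the cache but never overwrite it.
print(cache_flags("localhost:5000/builder-cache", export_cache=False))
```

Keeping the import and export decisions as two independent switches makes the --no-export case a one-argument change rather than a separate code path.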

Member

viferga commented Aug 22, 2024

I would implement something like:
./builder --startup localhost:5000

To build all images and push them to the repo you desire.

@ashpect ashpect mentioned this issue Aug 23, 2024
@ashpect ashpect closed this as completed Aug 25, 2024
@ashpect ashpect reopened this Aug 25, 2024
Contributor Author

ashpect commented Aug 25, 2024

So, considering the comment above and combining all 3 options for maximum user benefit and fast builds, this is how builder would be used in an ephemeral container.

How to use the arguments, taking runtime images as an example (they can be used with dev and deps as well):

  1. ./builder runtime py node rb --import-cache "dockerhub link to premade registry with caches" - the fastest build, if you just want a selective composition of images with the latest stable cache.

  For dev purposes:

  2. ./builder runtime --startup "local registry" - builds the runtime images and pushes the caches to the given registry.
    It can be followed by ./builder runtime py node rb --import-cache "local registry" for the same effect as the previous one, but for dev purposes.

  3. ./builder runtime py node rb - selective, real-time image building as usual.
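
The three invocations above could map onto cache settings roughly as follows. This is a sketch in Python rather than builder's real argument handling: the language list in `ALL_LANGS`, the `plan` function, and the choice that --startup only exports are all assumptions for illustration.

```python
import argparse

# Assumed language set, taken from the examples in this thread.
ALL_LANGS = ["py", "node", "rb", "go"]


def parse(argv):
    parser = argparse.ArgumentParser(prog="builder")
    parser.add_argument("target", choices=["deps", "dev", "runtime"])
    parser.add_argument("langs", nargs="*")
    parser.add_argument("--import-cache", metavar="REF")
    parser.add_argument("--startup", metavar="REF")
    return parser.parse_args(argv)


def plan(argv):
    """Translate a command line into what to build and which cache to use."""
    args = parse(argv)
    if args.startup:
        # Startup: build everything (unless languages are given) and seed the cache.
        return {
            "target": args.target,
            "langs": args.langs or ALL_LANGS,
            "import_ref": None,
            "export_ref": args.startup,
        }
    # Regular build: optionally import, never export.
    return {
        "target": args.target,
        "langs": args.langs,
        "import_ref": args.import_cache,
        "export_ref": None,
    }


print(plan(["runtime", "--startup", "localhost:5000"]))
print(plan(["runtime", "py", "node", "rb", "--import-cache", "localhost:5000"]))
print(plan(["runtime", "py", "node", "rb"]))
```

Only the startup path ever sets `export_ref`, so selective builds cannot overwrite the seeded cache.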

Contributor Author

ashpect commented Aug 26, 2024

[Screenshot: 2024-08-26, 11:40:45 PM]
Subsequent builds after startup are successfully using the cache. Check the whole run here.

@ashpect ashpect closed this as completed Aug 26, 2024
@ashpect ashpect reopened this Aug 26, 2024
Contributor Author

ashpect commented Aug 26, 2024

All of the above is implemented. One last thing we can do: once we have a stable build for all languages, we can push the cache to a common public registry and point the import-cache registry at Docker Hub by default unless another one is specified. That would be extremely user friendly. In light of this, I am leaving this issue open. It could be merged into #31, or we could simply set the default env in docker compose.
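
One possible shape for that default, sketched under assumptions: the environment variable name `BUILDER_IMPORT_CACHE` and the reference in `DEFAULT_IMPORT_REF` are placeholders invented for this example, since no public cache registry is named in this thread.

```python
import os

# Placeholder reference; the real public cache location is not decided here.
DEFAULT_IMPORT_REF = "docker.io/example/builder-cache"


def resolve_import_ref(cli_value=None, env=os.environ):
    """Pick the import-cache ref: CLI flag first, then env, then the default."""
    return cli_value or env.get("BUILDER_IMPORT_CACHE") or DEFAULT_IMPORT_REF


print(resolve_import_ref())                       # env value or the default
print(resolve_import_ref("localhost:5000/cache"))  # an explicit flag wins
```

With this ordering, a docker compose file could set the environment variable once, and users in an air-gapped setup could still override it per build.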
