Requires running under rootful podman #98
Comments
We're also investigating if we can do at least (some) of the filesystem work with |
libguestfs is just a way to run VMs, so the nested virt concerns above apply. |
Right, so I was just reading about the internals and yeah libguestfs uses qemu to boot a kernel and sets up an "appliance" to talk to it. :| |
The 3rd option (beyond host kernel and virt) is https://github.com/lkl/linux which is relatively new and specifically cptofs is about this problem but...I really don't think it's worth trying to scope this in right now. |
libguestfs doesn't require KVM: https://libguestfs.org/guestfs-faq.1.html I guess it just falls back to emulation if there's no KVM. The question is how fast it is. |
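Whether a given host would take the slower emulation path can be checked up front. This is a minimal sketch, assuming that usable KVM shows up as a readable and writable /dev/kvm character device; the helper name kvm_usable and its optional path argument are made up for illustration and are not part of libguestfs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a KVM device node looks usable.
# The optional path argument exists only to make the check easy to exercise.
kvm_usable() {
    local dev="${1:-/dev/kvm}"
    # Assumption: hardware acceleration needs read and write access
    # to the KVM character device.
    [ -c "${dev}" ] && [ -r "${dev}" ] && [ -w "${dev}" ]
}

if kvm_usable; then
    echo "KVM available: hardware-accelerated VM expected"
else
    echo "KVM unavailable: expect fallback to (slower) emulation"
fi
```

The check says nothing about how slow emulation actually is; that still needs the benchmarks mentioned in this thread.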
Mounting directly uses FUSE and is pretty poor, but supposedly using the shell can be quite good. We can benchmark of course. FTR, this works on rootless podman machine on macOS:
#!/usr/bin/env bash
set -euo pipefail
fname="${1}"
truncate -s 100M "${fname}"
mkfs.ext4 "${fname}"
guestfish --rw -a "${fname}" << EOF
run
list-filesystems
mount /dev/sda /
copy-in test.sh /
cat /test.sh
quit
EOF
echo "DONE"
rm "${fname}"
Containerfile:
FROM fedora:39
RUN dnf -y install libguestfs
ENV LIBGUESTFS_BACKEND=direct
COPY test.sh /test.sh
ENTRYPOINT ["/test.sh"] |
Note that https://github.com/cgwalters/osbuildbootc/ doesn't use libguestfs, but it does use the underlying tool (supermin) to construct a VM root filesystem out of the container rootfs and works unprivileged today. Honestly I think the code and approach there are much simpler than the "higher level" libguestfs approach because we have the ability to drive things at a low level. So if we go down this path I think it'd make sense to look at merging that code. (The other thing osbuildbootc does is defer all the heavy lifting to |
That said, what would make much more sense in modern times is to use virtiofs as the root filesystem instead; it probably wouldn't be too hard. I just haven't dug into it. |
For example, forcing indirection through libguestfs's high level APIs reintroduces the same problems that osbuild creates today and that motivate ostreedev/ostree#3094 - what we're doing often wants to do quite low level filesystem and block device things. libguestfs is just high level sugar for executing arbitrary code in a transient VM, and we can construct a transient VM without it. |
I'm worried that doing the whole build under supermin might be extremely slow if KVM is not there. Whereas if we just offload the final copying part, it might be fine. I know that @achilleas-k is working on some benchmarks. |
Also, full QEMU emulation isn't supported on RHEL. I wonder if |
libguestfs doesn't have an exception; its main use case is just targeted at being used from Linux hosts. |
I am currently catching up on podman-desktop/extension-bootc#93. What's the current status of this issue? The root requirement can be documented (as pointed out in podman-desktop/extension-bootc#93) but I want to have a better understanding. |
I doubt we're going to do anything major here soon, I think we should just document switching or initializing with |
I don't think we have any way to fix it. EDIT: Just to clarify, the issue is that we need to mount the disk file so we can write the files into it. That can be done only by root in the top-level user namespace. Root in a rootless container simply cannot do it. |
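The distinction between real root and root in a rootless container can be detected from inside the process. This is a minimal sketch, assuming a Linux /proc and relying on the fact that the initial user namespace has the single identity mapping "0 0 4294967295" in /proc/self/uid_map; the helper name in_initial_userns and its optional path argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given uid_map describes the
# initial (top-level) user namespace, i.e. the identity mapping of all UIDs.
in_initial_userns() {
    local map_file="${1:-/proc/self/uid_map}"
    local inside outside count
    [ -r "${map_file}" ] || return 1
    # The initial namespace has exactly one line: "0 0 4294967295".
    [ "$(wc -l < "${map_file}")" -eq 1 ] || return 1
    read -r inside outside count < "${map_file}"
    [ "${inside}" = "0" ] && [ "${outside}" = "0" ] && [ "${count}" = "4294967295" ]
}

if [ "$(id -u)" -eq 0 ] && in_initial_userns; then
    echo "root in the top-level user namespace: mounting the disk image should be possible"
else
    echo "not root in the top-level user namespace: expect the mount to fail"
fi
```

A check like this could only produce an early, clearer error message; it does not work around the restriction described above.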
Right, to elaborate on that slightly it would create wildly distinct mechanisms for "day 1" versus "day 2". It's not impossible...but would be extremely hard to maintain over time. |
I think this should actually mostly land on the bootc side; making partitions unprivileged is easy. So moving to containers/bootc#859 |
A paper cut we hit today is that podman desktop defaults to rootless, and bib doesn't work with that because we need loopback. The core problem is we need to write Linux filesystems. The important Linux filesystems like XFS/ext4 in general really want to be only written by code from the Linux kernel.
Running the Linux kernel is done either by reusing the host kernel (privileged) or by running a VM. But in the podman machine case we're already in a VM, which gets us into nested virt, and on Mac at least that's going to involve full emulation, which usually mostly works but isn't considered a production scenario and definitely hits weird random bugs.
My inclination, because we're already running this container with --privileged, is just to behind the scenes reuse the fact that podman machine uses FCOS today and the core user has passwordless sudo enabled, and basically reuse that to re-execute ourselves with real root privileges. Yes, this would not really be "rootless" but I personally don't care about that and I don't think users would really in general either.