Partial proposal for #4234 (cgroups inheritance when using k0s in docker) #5059

Open: wants to merge 1 commit into base: main

@turdusmerula turdusmerula commented Sep 30, 2024

Description

In issue #4234 we discussed the lack of cgroup inheritance when k0s is launched inside a container, and the fact that the kubelet does not use the limits set for the container.

Here is what I've tested so far that does not work:

  • It is not possible to run the kubelet inside the container's cgroup slice.
  • When the kubelet runs inside a cgroup with limits and pods are created inside that cgroup, the limits are enforced correctly. However, the kubelet's eviction mechanism does not work, because the kubelet takes the total system resources as its default limits.

Here is what I propose as a partial solution to this problem. To make the kubelet aware of its cgroup limits, k0s sets the system-reserved resources in the kubelet-config.yaml file based on its own current limits.
In short, k0s reads the limits of the cgroup it belongs to, then, from those limits and the system capacity, calculates the amount of resources the kubelet should leave untouched for the system.
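
As an illustration of the "reads the limits of the cgroup it belongs to" step, here is a minimal Python sketch. It is not the code from this PR (which lives in k0s and is written in Go); it assumes a cgroup v2 host and only parses the text content of the standard interface files `memory.max`, `cpu.max` and `cpuset.cpus.effective`. The function names are hypothetical.

```python
from typing import List, Optional


def parse_memory_max(text: str) -> Optional[int]:
    """Parse the content of cgroup v2 `memory.max`.

    The file holds a byte count, or the literal `max` when no limit is set.
    Returns None for "no limit".
    """
    value = text.strip()
    return None if value == "max" else int(value)


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse the content of cgroup v2 `cpu.max` ("<quota> <period>").

    Returns the number of CPUs' worth of time allowed, or None when the
    quota is `max` (no limit).
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_cpuset(text: str) -> List[int]:
    """Parse a cpuset list such as "0-2,5" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no explicit cpuset
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# Values one would expect for `--memory 1G --cpus="0.2" --cpuset-cpus="3,4"`
print(parse_memory_max("1073741824\n"))
print(parse_cpu_max("20000 100000\n"))
print(parse_cpuset("3-4\n"))
```

Where these files are found depends on the cgroup namespace mode and on which cgroup the process belongs to, so locating them is deliberately left out of the sketch.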

Note: this is not a Docker-specific solution; it may also be suitable for #4319 and #4255.

For example, if the container is launched with --cpuset-cpus="3,4" --memory 1G --cpus="0.2", then the following configuration is written to the kubelet-config.yaml file:

systemReserved:
  cpu: 800m
  memory: "66037497856"
reservedSystemCPUs: 0-2,5-31
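
The memory and CPU-set values above are consistent with a simple derivation: reserved memory is the node capacity minus the container memory limit, and reservedSystemCPUs is the complement of the container cpuset among the node's CPUs. The Python sketch below reproduces those two values under that assumption, using the 32 CPUs and 65538320Ki of memory reported for the test node. The helper names are hypothetical, and the sketch does not attempt to derive the `cpu: 800m` value.

```python
from typing import Iterable, List


def format_cpuset(cpus: Iterable[int]) -> str:
    """Format CPU ids as a compact cpuset list, e.g. [0, 1, 2, 5] -> "0-2,5"."""
    ordered = sorted(set(cpus))
    parts: List[str] = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next id is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        parts.append(str(ordered[i]) if i == j else f"{ordered[i]}-{ordered[j]}")
        i = j + 1
    return ",".join(parts)


def reserved_system_cpus(total_cpus: int, allowed: Iterable[int]) -> str:
    """CPUs the kubelet must not use: every node CPU outside the allowed set."""
    return format_cpuset(set(range(total_cpus)) - set(allowed))


def reserved_memory_bytes(capacity_bytes: int, limit_bytes: int) -> int:
    """Memory to reserve so that only `limit_bytes` remain for the kubelet."""
    return max(capacity_bytes - limit_bytes, 0)


capacity = 65538320 * 1024  # node memory capacity (65538320Ki) in bytes
limit = 1 << 30             # --memory 1G, i.e. 1 GiB

print(reserved_memory_bytes(capacity, limit))  # 66037497856
print(reserved_system_cpus(32, [3, 4]))        # 0-2,5-31
```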

The node will then have the following configuration:

Capacity:
  cpu:                32
  ephemeral-storage:  958798960Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             65538320Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  883629120073
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             924Mi
  pods:               110
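
The Allocatable figures can be cross-checked against the kubelet's documented formula (allocatable = capacity - kube-reserved - system-reserved - hard eviction threshold). The Python sketch below assumes no kube-reserved memory, the kubelet's default hard eviction threshold of `memory.available<100Mi`, and that the CPU list in reservedSystemCPUs takes precedence over the `cpu` value in systemReserved. The helper names are hypothetical.

```python
MI = 1024 * 1024  # bytes in one mebibyte


def count_cpus(cpuset: str) -> int:
    """Count the CPUs in a cpuset list such as "0-2,5-31"."""
    total = 0
    for part in cpuset.split(","):
        if "-" in part:
            first, last = part.split("-")
            total += int(last) - int(first) + 1
        elif part:
            total += 1
    return total


def allocatable_cpus(capacity_cpus: int, reserved_cpuset: str) -> int:
    """CPUs left for pods once the reserved CPU list is excluded."""
    return capacity_cpus - count_cpus(reserved_cpuset)


def allocatable_memory_bytes(capacity: int, system_reserved: int,
                             kube_reserved: int = 0,
                             eviction_hard: int = 100 * MI) -> int:
    """Memory left for pods after reservations and the eviction threshold."""
    return max(capacity - system_reserved - kube_reserved - eviction_hard, 0)


capacity = 65538320 * 1024  # 65538320Ki, from the Capacity section
memory = allocatable_memory_bytes(capacity, 66037497856)

print(allocatable_cpus(32, "0-2,5-31"))  # 2
print(f"{memory // MI}Mi")               # 924Mi
```

Both results match the node output: 2 allocatable CPUs and 924Mi of allocatable memory, that is, the 1 GiB container limit minus the 100Mi eviction threshold.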

This is a partial solution: the kubelet and the pods it creates are not inside a cgroup with limits, so enforcement relies on the kubelet's eviction mechanism instead of the kernel's.

This proposal needs to be discussed, as it requires removing the following two settings from kubelet-config.yaml, which are incompatible with it (see https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/):

kubeReservedCgroup: system.slice
kubeletCgroups: /system.slice/containerd.service

Fixes #4234

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

  • Manual test
  • Auto test added

Here is the script I ran to test the code, with a locally built version of the k0s container:

# start first controller
docker run  --name k0s-controller01 -it -d --privileged \
    --cgroupns=host --hostname k0s-controller01 \
    --cpuset-cpus="0" --memory 1G \
    -v /var/lib/k0s \
    -v ~/.ssh/id_rsa.pub:/keys/id_rsa.pub \
    k0s:v1.30.4-k0s.0 k0s controller
sleep 5

# create ingress worker
token=$(docker exec k0s-controller01 k0s token create --role=worker)
docker run  --name k0s-worker01 -it -d --privileged \
    --cgroupns=host --hostname k0s-worker01 \
    --cpuset-cpus="3,4" --memory 1G --cpus="0.2" \
    -v /var/lib/k0s \
    -p 80:80 \
    -p 443:443 \
    k0s:v1.30.4-k0s.0 k0s worker "$token"

# add more workers
token=$(docker exec k0s-controller01 k0s token create --role=worker)
docker run  --name k0s-worker02 -it -d --privileged \
    --cgroupns=host --hostname k0s-worker02 \
    --cpuset-cpus="5,6" --memory 4G \
    -v /var/lib/k0s \
    k0s:v1.30.4-k0s.0 k0s worker "$token"

token=$(docker exec k0s-controller01 k0s token create --role=worker)
docker run  --name k0s-worker03 -it -d --privileged \
    --cgroupns=host --hostname k0s-worker03 \
    --cpuset-cpus="7,8" --memory 4G \
    -v /var/lib/k0s \
    k0s:v1.30.4-k0s.0 k0s worker "$token"

Checklist:

  • My code follows the style guidelines of this project
  • My commit messages are signed-off
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

@turdusmerula turdusmerula requested a review from a team as a code owner September 30, 2024 22:18
@turdusmerula turdusmerula changed the title Partial proposal for #4234 Partial proposal for #4234 (cgroups inheritance when using k0s in docker) Sep 30, 2024

The PR is marked as stale since no activity has been recorded in 30 days

@github-actions github-actions bot added Stale and removed Stale labels Oct 31, 2024