
use flatcar as base OS on capo clusters #778

Closed · 1 task done · Tracked by #426 ...

cornelius-keller opened this issue Feb 1, 2022 · 21 comments

Labels
area/kaas Mission: Cloud Native Platform - Self-driving Kubernetes as a Service · epic/capo · needs/refinement Needs refinement in order to be actionable · provider/openstack Related to provider OpenStack · team/rocket Team Rocket

Comments

@cornelius-keller
Contributor

cornelius-keller commented Feb 1, 2022

Towards #426

User stories

  • As a Giant Swarm customer, I want a secure, container-optimized operating system in my clusters so that I don't need to worry about the performance and security of my Kubernetes clusters.

TODO

The upstream QEMU build boots on OpenStack but doesn't contain the kernel parameters required to make Ignition work. We have to either:

  • use the upstream image-builder repository to build a Flatcar image for OpenStack (image source: https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.bz2), or
  • patch the existing qemu-flatcar image to make Ignition work on OpenStack.
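
For reference, Ignition's OpenStack support is keyed off the OEM id on the kernel command line; here is a quick way to check a booted image (a minimal sketch, with the expected flag name taken from the coreos-metadata error below):

```sh
# Check whether the OEM id made it onto the kernel command line; without it,
# coreos-metadata bails out and no Ignition config is processed.
grep -o 'flatcar\.oem\.id=[^ ]*' /proc/cmdline || echo "flatcar.oem.id not set"
```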

Tasks

Additional details:

  • If flatcar.oem.id isn't set as a kernel argument, the coreos-metadata service stops and therefore no Ignition configuration is processed:
    coreos-metadata[682]: Caused by: Couldn't find flag 'flatcar.oem.id' or 'coreos.oem.id' in cmdline file (/proc/cmdline)
    
  • the KubeadmConfigTemplate listed below (generated by CAPI/CAPO) works fine on a local machine
`kubeadmconfigtemplate` flatcar linux
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachineTemplate
metadata:
  annotations:
    meta.helm.sh/release-name: galaxy-cluster
    meta.helm.sh/release-namespace: org-giantswarm
  labels:
    app: cluster-openstack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: 0.10.1
    application.giantswarm.io/team: rocket
    cluster.x-k8s.io/cluster-name: galaxy
    giantswarm.io/cluster: galaxy
    giantswarm.io/organization: giantswarm
    helm.sh/chart: cluster-openstack-0.10.1
  name: galaxy-galaxy-flatcar
  namespace: org-giantswarm
  ownerReferences:
    - apiVersion: cluster.x-k8s.io/v1beta1
      kind: Cluster
      name: galaxy
      uid: dd9f6431-14dd-43f6-8687-257d756b0919
spec:
  template:
    spec:
      cloudName: openstack
      flavor: n1.medium
      identityRef:
        kind: Secret
        name: openstack-cloud-config
      image: "Flatcar Container Linux stable-3139.2.0-kube-v1.21.10-mario"
      rootVolume:
        diskSize: 50
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  annotations:
    meta.helm.sh/release-name: galaxy-cluster
    meta.helm.sh/release-namespace: org-giantswarm
  labels:
    app: cluster-openstack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: 0.10.1
    application.giantswarm.io/team: rocket
    cluster.x-k8s.io/cluster-name: galaxy
    giantswarm.io/cluster: galaxy
    giantswarm.io/organization: giantswarm
    helm.sh/chart: cluster-openstack-0.10.1
  name: galaxy-galaxy-flatcar-lon-1
  namespace: org-giantswarm
  ownerReferences:
    - apiVersion: cluster.x-k8s.io/v1beta1
      kind: Cluster
      name: galaxy
      uid: dd9f6431-14dd-43f6-8687-257d756b0919
spec:
  clusterName: galaxy
  replicas: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: galaxy
      cluster.x-k8s.io/deployment-name: galaxy-galaxy-flatcar-lon-1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: cluster-openstack
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/version: 0.10.1
        application.giantswarm.io/team: rocket
        cluster.x-k8s.io/cluster-name: galaxy
        cluster.x-k8s.io/deployment-name: galaxy-galaxy-flatcar-lon-1
        giantswarm.io/cluster: galaxy
        giantswarm.io/organization: giantswarm
        helm.sh/chart: cluster-openstack-0.10.1
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: galaxy-galaxy-flatcar-lon-1
      clusterName: galaxy
      failureDomain: gb-lon-1
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: OpenStackMachineTemplate
        name: galaxy-galaxy-flatcar
      version: v1.22.8
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  annotations:
    meta.helm.sh/release-name: galaxy-cluster
    meta.helm.sh/release-namespace: org-giantswarm
  labels:
    app: cluster-openstack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: 0.10.1
    application.giantswarm.io/team: rocket
    cluster.x-k8s.io/cluster-name: galaxy
    giantswarm.io/cluster: galaxy
    giantswarm.io/organization: giantswarm
    helm.sh/chart: cluster-openstack-0.10.1
  name: galaxy-galaxy-flatcar-lon-1
  namespace: org-giantswarm
  ownerReferences:
    - apiVersion: cluster.x-k8s.io/v1beta1
      kind: Cluster
      name: galaxy
      uid: dd9f6431-14dd-43f6-8687-257d756b0919
spec:
  template:
    spec:
      ignition:
        containerLinuxConfig:
          strict: false
          additionalConfig: |
            storage:
              links:
              # For some reason enabling services via systemd.units doesn't work on Flatcar CAPI AMIs.
              - path: /etc/systemd/system/multi-user.target.wants/coreos-metadata.service
                target: /usr/lib/systemd/system/coreos-metadata.service
              - path: /etc/systemd/system/multi-user.target.wants/kubeadm.service
                target: /etc/systemd/system/kubeadm.service
              files:
                - contents:
                    source: |
                      ssh-ed25519 <content from KubeadmConfigTemplate>
                  path: /etc/ssh/trusted-user-ca-keys.pem
                  permissions: "0600"
                - contents:
                    source: |
                      # Use most defaults for sshd configuration.
                      Subsystem sftp internal-sftp
                      ClientAliveInterval 180
                      UseDNS no
                      UsePAM yes
                      PrintLastLog no # handled by PAM
                      PrintMotd no # handled by PAM
                      # Non defaults (#100)
                      ClientAliveCountMax 2
                      PasswordAuthentication no
                      TrustedUserCAKeys /etc/ssh/trusted-user-ca-keys.pem
                      MaxAuthTries 5
                      LoginGraceTime 60
                      AllowTcpForwarding no
                      AllowAgentForwarding no
                  path: /etc/ssh/sshd_config
                  permission: "0600"
            systemd:
              units:
              - name: kubeadm.service
                dropins:
                - name: 10-flatcar.conf
                  contents: |
                    [Unit]
                    # kubeadm must run after coreos-metadata populated /run/metadata directory.
                    Requires=coreos-metadata.service
                    After=coreos-metadata.service
                    [Service]
                    # Ensure kubeadm service has access to kubeadm binary in /opt/bin on Flatcar.
                    Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/bin
                    # To make metadata environment variables available for pre-kubeadm commands.
                    EnvironmentFile=/run/metadata/*
      preKubeadmCommands:
      - envsubst < /etc/kubeadm.yml > /etc/kubeadm.yml.tmp
      - mv /etc/kubeadm.yml.tmp /etc/kubeadm.yml
      - 'files="/etc/ssh/trusted-user-ca-keys.pem /etc/ssh/sshd_config"; for f in $files; do tmpFile=$(mktemp); cat "${f}" | base64 -d > ${tmpFile}; if [ "$?" -eq 0 ]; then mv ${tmpFile} ${f};fi;  done;'
      - systemctl restart sshd
      format: ignition
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cloud-provider: external
            node-labels: giantswarm.io/node-pool=galaxy-flatcar-lon-1
          name: ${HOSTNAME}
      postKubeadmCommands:
        - systemctl restart sshd
      users:
        - name: giantswarm
          groups: sudo
          sudo: ALL=(ALL) NOPASSWD:ALL
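
For context on the envsubst step in preKubeadmCommands above: coreos-metadata writes KEY=VALUE files under /run/metadata, the kubeadm.service drop-in loads them via EnvironmentFile, and envsubst then fills placeholders such as ${HOSTNAME} in /etc/kubeadm.yml. A minimal sketch of that mechanism (file and values illustrative):

```sh
# Stand-in for a coreos-metadata output file under /run/metadata.
metadata=$(mktemp)
echo 'HOSTNAME=node-1' > "$metadata"
# systemd loads this via EnvironmentFile=/run/metadata/*; emulate it here.
export $(cat "$metadata")
# preKubeadmCommands pipes /etc/kubeadm.yml through envsubst the same way.
echo 'name: ${HOSTNAME}' | envsubst   # prints: name: node-1
```
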
@cornelius-keller cornelius-keller added area/kaas Mission: Cloud Native Platform - Self-driving Kubernetes as a Service epic/capo team/rocket Team Rocket labels Feb 1, 2022
@cornelius-keller cornelius-keller added this to the CAPI openstack GA milestone Feb 1, 2022
@cornelius-keller cornelius-keller added the needs/refinement Needs refinement in order to be actionable label Feb 1, 2022
@cornelius-keller
Contributor Author

@ericgraf please add some links to track upstream progress on flatcar on openstack

@ericgraf

ericgraf commented Feb 1, 2022

Response from Kinvolk:

> We're not currently working on CAPI OpenStack support (and afaik are not planning to; please keep me honest).

Slack thread: https://gigantic.slack.com/archives/C9EUYTKM5/p1643724191649329

@erkanerol

We need to check the collectors in node-exporter. We have some problems with the Ubuntu image now: giantswarm/node-exporter-app#136

@bavarianbidi

bavarianbidi commented Apr 13, 2022

IMO we have to fix the image build issue on our own.
Once that is done, we have to check the state of the known blocking issues and then plan a desired shipping date (incl. some refinement of what else has to be done on our side).

rocket issues

  • the QEMU image build for Flatcar isn't working; needs some investigation:

    2022/04/12 13:00:49 packer-builder-qemu plugin: stderr:
    ==> flatcar: Error finding "./packer/qemu/linux/flatcar/http/": stat ./packer/qemu/linux/flatcar/http/: no such file or directory
    ==> flatcar: Deleting output directory...
    2022/04/12 13:00:49 [INFO] (telemetry) ending qemu
    ==> Wait completed after 22 seconds 435 milliseconds
    2022/04/12 13:00:49 machine readable: error-count []string{"1"}
    ==> Some builds didn't complete successfully and had errors:
    2022/04/12 13:00:49 machine readable: flatcar,error []string{"Error finding \"./packer/qemu/linux/flatcar/http/\": stat ./packer/qemu/linux/flatcar/http/: no such file or directory"}
    Build 'flatcar' errored after 22 seconds 435 milliseconds: Error finding "./packer/qemu/linux/flatcar/http/": stat ./packer/qemu/linux/flatcar/http/: no such file or directory
    

blocking issues

provider independent implementation

provider specific implementation (phoenix)

general notable issues/PRs

@invidian

QEMU builds should be fixed with kubernetes-sigs/image-builder#829, I think.

@bavarianbidi

> QEMU builds should be fixed with kubernetes-sigs/image-builder#829, I think.

Image build was successful. I will try a CAPO deployment based on Flatcar later on.

@bavarianbidi

Using a Flatcar image (built from image-builder PR 829), the instance gets provisioned but orchestration doesn't continue.
After setting the log level on CAPO to debug, the generated user_data seems to be invalid (checked by running ignition-validate against the base64-decoded content).

Not sure why this happens, as the Ignition version in the user_data is set to 2.3.0.
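
A sketch of that validation step (the secret name is hypothetical; CAPI stores the rendered bootstrap data under the `value` key of the secret referenced by the Machine's .spec.bootstrap.dataSecretName):

```sh
# Pull the CAPI-generated user data and check it with ignition-validate
# (https://github.com/coreos/ignition).
kubectl -n org-giantswarm get secret <bootstrap-data-secret> \
  -o jsonpath='{.data.value}' | base64 -d > user_data.ign
ignition-validate user_data.ign
```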

@kopiczko

Proposed upstream change for OCCM: kubernetes/cloud-provider-openstack#1928

@glitchcrab
Member

Pawel needs to tidy up his mess. Ubuntu testing is moving on; plenty of testing to do yet.

@kopiczko

kopiczko commented Jul 8, 2022

I have my cluster-openstack branch here: giantswarm/cluster-openstack#85

It works OK for creating a fresh cluster with both Ubuntu and Flatcar images. For Flatcar you have to set ignition.enabled: true.
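
For illustration, a hypothetical way to flip that flag on an existing release (release and namespace names taken from the manifests above; the exact invocation depends on how the chart is deployed):

```sh
# Hypothetical: enable Ignition-formatted bootstrap data for Flatcar nodes.
helm upgrade galaxy-cluster ./cluster-openstack \
  --namespace org-giantswarm \
  --reuse-values --set ignition.enabled=true
```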

The problem is when upgrading CP nodes from Ubuntu to Flatcar:

      Reason: Upgrade "pawe1-cluster" failed: cannot patch "pawe1" with kind KubeadmControlPlane: admission webhook "validation.kubeadmcontrolplane.controlplane.cluster.x-k8s.io" denied the request: KubeadmControlPlane.controlplane.cluster.x-k8s.io "pawe1" is invalid: [spec.kubeadmConfigSpec.useExperimentalRetryJoin: Forbidden: cannot be modified, spec.kubeadmConfigSpec.format: Forbidden: cannot be modified]

I tried to edit the CR directly and:

# kubeadmcontrolplanes.controlplane.cluster.x-k8s.io "pawe1" was not valid:
# * spec.kubeadmConfigSpec.useExperimentalRetryJoin: Forbidden: cannot be modified

I don't have a solution to that yet.

BTW useExperimentalRetryJoin needs to be disabled here because it is not supported with format: ignition.

@invidian

invidian commented Jul 8, 2022

If you need it, I guess useExperimentalRetryJoin could be implemented for Ignition. The only reason it hasn't been implemented is that we wanted to have something shippable, so we did not target experimental features.

@kopiczko

kopiczko commented Jul 8, 2022

@invidian that would be ideal.

But I guess it would take too long for us to wait for the next CAPI release with that feature included.

I worked around that by disabling the webhook temporarily.
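
A sketch of that temporary workaround (the webhook configuration name is from a stock upstream CAPI install; confirm it first and restore it afterwards):

```sh
# Find and temporarily remove the KubeadmControlPlane validating webhook.
kubectl get validatingwebhookconfigurations | grep kubeadm-control-plane
kubectl delete validatingwebhookconfiguration \
  capi-kubeadm-control-plane-validating-webhook-configuration
# ...apply the KubeadmControlPlane change, then restore the webhook,
# e.g. by re-applying the CAPI provider manifests.
```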

@kopiczko kopiczko added provider/openstack Related to provider OpenStack and removed needs/refinement Needs refinement in order to be actionable labels Jul 11, 2022
@kopiczko

Dropping this note because I already forgot it once and had to rediscover it: we need to set the proper OEM id in grub.cfg after the image is built. This is how I do it:

# Mount the OEM partition (partition 6 on Flatcar images) from the built image.
guestmount -m /dev/sda6 -a ./image /mnt

# Set the OEM id so the instance boots with flatcar.oem.id=openstack.
echo 'set oem_id="openstack"' > /mnt/grub.cfg
chmod 644 /mnt/grub.cfg

guestunmount /mnt

We have to do that so the image pulls Ignition from user_data on OpenStack.

The OEM partition is completely empty when processed by image-builder.

@invidian do you have an idea if we can somehow add this to image-builder? Maybe by adding an env var like FLATCAR_OEM=openstack?

@invidian

@kopiczko I think it should be fine to follow https://flatcar-linux.org/docs/latest/setup/customization/other-settings/#adding-custom-kernel-boot-options in image-builder for Flatcar, as we boot the Flatcar instance while building the image, then reset it to the fresh state.
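
Following that doc, a sketch of what the build step could do from inside the running build instance (the oem_id value mirrors the guestmount workaround above):

```sh
# Write the OEM grub.cfg while the build VM is up, instead of patching the
# image afterwards; Flatcar mounts the OEM partition at /usr/share/oem.
echo 'set oem_id="openstack"' | sudo tee /usr/share/oem/grub.cfg
```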

@kopiczko

@invidian I guess there are technical possibilities to achieve this, but what would be the interface? Because it's still a QEMU image, not an OpenStack image, from image-builder's POV.

So right now you build a QEMU image like:

make FLATCAR_CHANNEL=stable FLATCAR_VERSION=3139.2.3 build-qemu-flatcar

I'm thinking about adding an extra FLATCAR_OEM env var like:

make FLATCAR_CHANNEL=stable FLATCAR_OEM=openstack FLATCAR_VERSION=3139.2.3 build-qemu-flatcar

Does that make sense to you?

@invidian

How about adding a build-openstack-flatcar target, which could be based on build-qemu-flatcar with the extra config required for the OEM setup?
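
Hypothetical usage if such a target existed (it is not in image-builder yet):

```sh
make FLATCAR_CHANNEL=stable FLATCAR_VERSION=3139.2.3 build-openstack-flatcar
```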

@kopiczko

I created an upstream issue for this: kubernetes-sigs/image-builder#937

@kopiczko

I opened another issue for the race between kubeadm and containerd: kubernetes-sigs/image-builder#939

@kopiczko

We also need to add ignition support to mc-bootstrap and cluster-api-app:

@kopiczko

This is an internal umbrella issue for the upstream issues: #1264

@JosephSalisbury JosephSalisbury added the needs/refinement Needs refinement in order to be actionable label Aug 8, 2022
@cornelius-keller
Contributor Author

Done; image building and testing are tracked in follow-up issues.
