PR #4920: ADR 2: Config v2 (open, targeting `main`)

New file: `docs/architecture/adr-002-configv2.md` (+109 lines)

# ADR 2: Configuration v2

## Overview

This ADR is a very high-level proposal covering the overall goals and aspirations for configuration v2. As such, the purpose is not to go overly deep into where and how different fields are defined.

The examples provided in this ADR are not exhaustive, so do NOT pay too much attention to each individual field; the examples are meant to illustrate the _types_ of configuration for each object.

## Context

The configuration of k0s has organically grown from something very small into something quite complex today. While that is natural, we're starting to see a growing number of issues stemming from the complexity:

- Lack of separation between per-node configuration and cluster configuration
- Lack of feedback on the dynamic config reconciliation status (see the status sketch after this list)
- Lack of feedback on Helm and stack applier status (should these be in the ClusterConfig to begin with?)
- No separation between user-defined configuration and the configuration that is used internally
- Lack of versioning
- Dynamic config has several problems when it comes to reconciliation of certain fields
- Some fields have side effects that aren't very predictable
- The k0s configuration for Calico doesn't really match Calico's own configuration
- Config doesn't allow users to disable components; doing so requires modifying the command line. Especially with dynamic config, it would make sense to allow this from the configuration.
- With dynamic config it's not very straightforward what can and cannot be modified

All of this results in confusion for both users and maintainers.
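
To make the reconciliation-feedback gap concrete, here is a minimal sketch of the kind of status a v2 config object could report, using standard Kubernetes-style conditions. All field names here are assumptions for illustration, not part of the proposal:

```yaml
# Hypothetical status block on a v2 config object, reporting the
# outcome of the last dynamic config reconciliation attempt
status:
  observedGeneration: 7
  conditions:
    - type: Reconciled
      status: "False"
      reason: ImmutableField
      message: "spec.network.podCIDR cannot be changed at runtime"
```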

## Config v2 goals

The maintainers have agreed on a set of high-level goals for config v2:

### Config as Kubernetes objects

Configs shall be formatted as Kubernetes objects, even if we do not actually store them in the API. K8s objects are the natural language for k0s users, and this keeps the option open to actually store everything in the API.

### Per node and cluster wide config separation

Separate the per-node configuration and the cluster-wide configs into their own CRDs. This makes it clear for both users and maintainers where to look for which config data. It also makes a clear separation of which _things_ can be changed at runtime via dynamic config.
> **Reviewer comment:** "It also makes clear separation on which things can be changed at runtime via dynamic config." Does it actually make it clear which things can be changed and which cannot? Obviously changing Helm chart extensions is allowed at runtime, and just as obviously changing the service CIDR range is not, but both would be part of the ClusterConfig, no?

As an example, we would have `ControllerConfig` and `ClusterConfig` for the control planes:

```yaml
# ControllerConfig contains only the node-specific bits for a controller node,
# so basically only the bits that we need to boot up etcd/kine and the API server
apiVersion: k0sproject.io/...
kind: ControllerConfig
metadata:
  name: k0s
spec:
  etcd:
    privateAddress: 1.2.3.4
  api:
    listenAddress: 1.2.3.4:6443,[::0]:6443
    address: foobar.com:6443
    sans:
      - foobar.com
  enableWorker: true
  disableComponents: ["foo", "bar"]
status:
  lastReconciledConfig: '<ClusterConfig version: "Full object">'
  history: # some kind of history of when which config was applied, etc.
---
# ClusterConfig contains all the bits that are always cluster-wide
apiVersion: k0sproject.io/...
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    provider: calico
    calico:
      foo: bar
```

Similarly, we need to have a config object for the workers. An example:
> **Reviewer comment:** Is it proposed here that we will have one ControllerConfig for all controllers and one WorkerConfig for all workers? Wouldn't it be useful to have some kind of "label" matching, so one could apply certain configuration just to some of the workers? One use case I can imagine: some of my worker nodes have GPUs, and on those I want to set up containerd differently, because reasons.
>
> **Author reply:** No, the idea is that we have one ControllerConfig for each controller, one WorkerConfig for each worker, and one ClusterConfig for the whole cluster. We also discussed the need for some way of having WorkerConfig templates; it could be a field in the ClusterConfig spec, or a new object called WorkerGroup or something similar. My understanding is that in the ADR we shouldn't be very explicit on whether it should be a new object or not, but now that you mention it, we should probably document having worker templates/groups as a goal.

```yaml
apiVersion: k0sproject.io/...
kind: WorkerConfig
metadata:
  name: k0s
spec:
  kubelet:
    args:
      foo: bar
  containerd:
    args:
      foo: bar
  profile: foobar
```

Let's not break the configs into many separate CRDs; instead, use a more monolithic approach. One of the deciding factors is whether there's value in highly separated status information. If so, it might warrant a dedicated CRD.
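
The review discussion above raised worker templates/groups, either as a field in the ClusterConfig spec or as a separate object. A purely hypothetical sketch of the ClusterConfig-field variant with label matching (the `workerConfigTemplates` field and its layout are illustrative only, not part of this proposal):

```yaml
apiVersion: k0sproject.io/...
kind: ClusterConfig
metadata:
  name: k0s
spec:
  # Hypothetical: per-group WorkerConfig overrides, applied to
  # workers whose labels match the selector
  workerConfigTemplates:
    - name: gpu-workers
      nodeSelector:
        gpu: "true"
      spec:
        containerd:
          args:
            foo: bar
```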

## Versioning

It's clear we need to support both the v1 (current) and v2 ways of configuration. Looking forward, we also need to ensure that we can move beyond v2 in a controlled fashion. Essentially this means that we need to have transformers in place that allow reading in a v1 config but internally translating it into v2. And, in the future, from v2 to v3 and so on.

We should utilize the well-known pattern for CRD versioning as outlined in https://book.kubebuilder.io/multiversion-tutorial/conversion-concepts.
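
As a rough sketch of what such a transformer would do, the v1 fragment below uses fields that exist in today's `k0s.k0sproject.io/v1beta1` ClusterConfig, while the v2 output mirrors the hypothetical layout from the examples above:

```yaml
# Input: a current (v1) config fragment ...
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  api:
    sans:
      - foobar.com
  network:
    provider: calico
---
# ... internally translated into the v2 per-node/cluster-wide split (hypothetical):
apiVersion: k0sproject.io/...
kind: ControllerConfig
metadata:
  name: k0s
spec:
  api:
    sans:
      - foobar.com
---
apiVersion: k0sproject.io/...
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    provider: calico
```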

## Default values

For config v2, we can change some of the defaults too. Currently we're "stuck" with non-optimal defaults in some cases, but since starting to use config v2 is a conscious decision for the users, it is a natural point to change them. As an example, we could change the pod/container log paths to be within the k0s data directory (`/var/lib/k0s`).
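
A minimal sketch of where such changed defaults would live, reusing the hypothetical WorkerConfig layout from above (`--root-dir` and `--root` are real kubelet/containerd flags, but using them as the mechanism here, and the exact paths, are assumptions):

```yaml
apiVersion: k0sproject.io/...
kind: WorkerConfig
metadata:
  name: k0s
spec:
  kubelet:
    args:
      # hypothetical new default: keep kubelet state under the k0s data dir
      root-dir: /var/lib/k0s/kubelet
  containerd:
    args:
      # hypothetical new default: keep containerd state under the k0s data dir
      root: /var/lib/k0s/containerd
```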

## Status

Proposed.

This has been quite extensively discussed among the core maintainers, and thus these high-level goals and aspirations are already well aligned.

The plan is to enhance this proposal during the implementation phases. We need a phased approach for the implementation anyway, as this is a very big topic that cuts through pretty much every piece of k0s.

## Consequences if not implemented

The confusion for both users and maintainers will continue, causing pain and bugs.