api: Adding support for 'api extensions' #121

gabrielmougard · 2024-02-16T15:22:14Z

Pre-requisite PR: canonical/microcluster#86

We'll need a system to centralize the extensions (understand optional features that other MicroCluster-backed services might use) of different MicroCluster-based services.

We propose to centralize the extensions in the MicroCluster database, where a service like MicroOVN in this case, writes its OVN related extensions when being bootstrapped.

This will be needed by #113 and canonical/microcloud#245 to check that the deployed MicroOVN used by MicroCloud, supports custom ip encapsulation for an OVN Geneve tunnel.

mkalcok · 2024-02-20T13:50:26Z

/canonical/self-hosted-runners/run-workflows a54d5c9

mkalcok · 2024-02-20T13:54:39Z

Overall LGTM, I feel much better about this feature being part of the microcluster rather than standalone implementation in microovn.
However, I assume that for this PR to pass we'll need to wait for microcluster to merge canonical/microcluster#86 and then upgrade our dependency, correct?

gabrielmougard · 2024-02-20T13:55:35Z

@mkalcok exactly. The dependency upgrade shouldn't be too problematic, there will be only one thing to change in microovn/cmd/microovnd/main.go

h.OnBootstrap = ovn.Bootstrap

would become

h.PreBootstrap = ovn.Bootstrap

or

h.PostBootstrap = ovn.Bootstrap

Not entirely sure, but I'm more confident about the PreBootstrap option.

gabrielmougard · 2024-05-02T07:46:51Z

@mkalcok updated

mkalcok · 2024-05-02T07:47:26Z

/canonical/self-hosted-runners/run-workflows 899b41d

gabrielmougard · 2024-05-03T07:08:33Z

@mkalcok the TLS system test is failing here. I don't quite understand why.. Do you have an idea (I must admit I running out of ideas)?

mkalcok · 2024-05-03T08:09:34Z

@gabrielmougard Originally I thought it's pretty simple case of bad bash syntax that's missing a space between expression and closing ] bracket in the if statement.

expression : [ -n SHA1 Fingerprint=CB:DC:0A:6F:99:21:70:DB:92:27:1A:D6:BD:2E:AC:1C:D0:5C:1E:7C]

then i realized that you didn't change any tests and the error is coming from the BATS library 🤔. I'll take a look. We are sourcing BATS directly from it's main branch. It's possible that some error was introduced there.

mkalcok · 2024-05-03T09:03:32Z

microovn/api/certificates/regenerate_ca.go

@@ -29,7 +29,7 @@ func regenerateCaPut(s *state.State, r *http.Request) response.Response {
 	responseData := types.NewRegenerateCaResponse()

 	// Check that this is the initial node that received the request and recreate new CA certificate
-	if !client.IsForwardedRequest(r) {
+	if !client.IsNotification(r) {


@gabrielmougard So I checked and the problem is not with BATS. Command microovn certificates regenerate-ca indeed does not work. It fails with

Error: command failed: failed to generate new CA: Put "http://control.socket/1.0/ca": context deadline exceeded

as the CI shows.

I also noticed that it absolutely obliterates the CPU 😆. So perhaps this change from IsForwardedRequest to IsNotification is not working as expected.

The way this command is supposed to work is that the original node that receives the request from client, forwards the request to the rest of the nodes in the cluster. This condition is then used to distinguish whether the node received direct request from client (and should forward requests to all other cluster members), or it's the forwarded message and the node should just do its own thing.

The high CPU consumption suggests to me that even the servers that received the forwarded message then try to forward it again to everyone else, creating kind of a death spiral.

But that's just a guess. I didn't dive too deep into it.

gabrielmougard · 2024-05-03T10:04:12Z

@mkalcok can we re run the tests please?

mkalcok · 2024-05-03T12:00:24Z

/canonical/self-hosted-runners/run-workflows eaf8922

mkalcok · 2024-05-03T13:04:33Z

@gabrielmougard I wonder about the upgrade test failure. Is it possible that changes to schema.go, even though they look just cosmetic, trigger some kind of schema change that require all cluster members to be upgraded before the dqlite cluster is accessible again?

gabrielmougard · 2024-05-03T13:10:55Z

@mkalcok that is very possible. We introduced a change in the SQL extension mechanism in MicroCluster canonical/microcluster#94 @masnax do you know how this would work with MicroOVN custom SQL updates?

mkalcok · 2024-05-03T13:52:27Z

Hmm..I tried slapping this piece of code after the snap upgrade because it was enough previously when the upgrade required microovn db schema upgrade. However it doesn't seem that the cluster recovers withing 30 seconds. You probably know more about the microcluster schema upgrades than I do. Could you try manually

setup microovn cluster with snap from store
upgrade microovn with snap from this branch (on every node)
figure out why is the cluster not coming up (microovn cluster list)

In any case, even after it gets resolved, since this change will introduce schema upgrade, we won't be able to backport it to branch-22.03 branch. Though I don't supposed that's an issue for you @gabrielmougard. (just fyi @fnordahl)

Tangentially (also cc @fnordahl) If this becomes part of the branch-24.03, we'll have a bit of a chicken and egg problem. Microovn will require whole cluster to be updated before it starts to work properly again, but underlying OVN will require gradual upgrade (chassis nodes -> regular central nodes -> central node that performs OVN db schema conversion). I wonder how we should approach this situation. Perhaps we could create 24.03-pre-upgrade track in the snap store and the upgrade process to 24.03 could look like this:

22.03/stable (Starting point)
24.03-pre-upgrade (Upgrade to this version will do the Microovn schema upgrade, but the underlying OVN will still be 22.03)
24.03/stable (Upgrade to this version will upgrade OVN to 24.03 and do the OVN db schema conversion)

masnax · 2024-05-03T17:02:37Z

Hmm..I tried slapping this piece of code after the snap upgrade because it was enough previously when the upgrade required microovn db schema upgrade. However it doesn't seem that the cluster recovers withing 30 seconds. You probably know more about the microcluster schema upgrades than I do. Could you try manually
* setup microovn cluster with snap from store

* upgrade microovn with snap from this branch (on every node)

* figure out why is the cluster not coming up (`microovn cluster list`)

This is a consequence of the fact that the introduction of API extensions is itself a schema extension. To give an example, imagine 3 nodes.

node01 runs snap refresh and detects that its expected schema version is ahead of the other cluster members, so it waits.
node02 runs snap refresh and detects the same thing, but is only blocked on node03
node03 runs snap refresh, and since the previous two nodes have already updated their expected schema versions, this node can proceed with committing the changes to the schema.
node03 now progresses to comparing API versions since its updated schema supports this. It detects that it has the highest expected API version because node01 and node02 did not have the necessary schema updates in the earlier steps to record their expected API version, so node03 waits for those nodes to detect that node03 has committed the schema, so they can record their expected API updates.

So you get into a situation where all 3 nodes are waiting for each other. After 30s the loop repeats so node01 will detect that its schema version matches the other nodes, and then waits on only node02 to update its API version. Node02 then finally does this, and the database opens for access.

So because the schema update that introduces API updates is part of the same update that increments the number of API updates, the update process takes at least 30s in this case.

mkalcok · 2024-05-03T18:06:53Z

Thanks for the detailed explanation @masnax. I believe we've gotten ourselves into a sticky situation. microovn.daemon is failing to start after upgrade with:

May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=warning msg="Local API extensions: [internal:runtime_extension_v1 custom_encapsulation_ip], cluster members API extensions: [[internal:runtime_extension_v1 custom_encapsulation_ip] [internal:runtime_extension_v1 custom_encapsulation_ip] [internal:runtime_extension_v1 custom_encapsulation_ip] [internal:runtime_extension_v1 custom_encapsulation_ip]]"
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=error msg="Failed to send database upgrade request" error="Patch \"https://10.75.224.171:6443/cluster/internal/database\": Unable to connect to \"10.75.224.171:6443\": dial tcp 10.75.224.171:6443: connect: connection refused"
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=error msg="Failed to send database upgrade request" error="Patch \"https://10.75.224.189:6443/cluster/internal/database\": Unable to connect to \"10.75.224.189:6443\": dial tcp 10.75.224.189:6443: connect: connection refused"
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=error msg="Failed to send database upgrade request" error="Patch \"https://10.75.224.222:6443/cluster/internal/database\": Unable to connect to \"10.75.224.222:6443\": dial tcp 10.75.224.222:6443: connect: connection refused"
May 03 17:49:37 microovn-upgrade-1 ovsdb-client[19737]: ovs|00001|reconnect|INFO|unix:/var/snap/microovn/common/run/switch/db.sock: connecting...
May 03 17:49:37 microovn-upgrade-1 ovsdb-client[19737]: ovs|00002|reconnect|INFO|unix:/var/snap/microovn/common/run/switch/db.sock: connected
May 03 17:49:37 microovn-upgrade-1 ovs-vsctl[19738]: ovs|00001|vsctl|INFO|Called as ovs-vsctl set open_vswitch . external_ids:ovn-remote=
May 03 17:49:37 microovn-upgrade-1 ovs-vsctl[19738]: ovs|00002|db_ctl_base|ERR|external_ids:ovn-remote=: argument does not end in "=" followed by a value.
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: Error: Daemon stopped with error: Failed to run post-start hook: Failed to update OVS's 'ovn-remote' configuration

I believe this comes from

microovn/microovn/ovn/start.go

Lines 47 to 51 in 2524d48

    
           _, err = VSCtl( 
        
           	s, 
        
           	"set", "open_vswitch", ".", 
        
           	fmt.Sprintf("external_ids:ovn-remote=%s", sbConnect), 
        
           )

and the sbConnect is just an empty string. What I find interesting though is that the function that's supposed to fetch the sbConnect

microovn/microovn/ovn/environment.go

Line 71 in 2524d48

func environmentString(s *state.State, port int) (string, string, error) {

is connecting to the database, but it's not failing. It's just returning empty strings.

masnax · 2024-05-03T20:17:20Z

Thanks for the detailed explanation @masnax. I believe we've gotten ourselves into a sticky situation. microovn.daemon is failing to start after upgrade with:

May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=warning msg="Local API extensions: [internal:runtime_extension_v1 custom_encapsulation_ip], cluster members API extensions: [[internal:runtime_extension_v1 custom_encapsulation_ip] [internal:runtime_extension_v1 custom_encapsulation_ip] [internal:runtime_extension_v1 custom_encapsulation_ip] [internal:runtime_extension_v1 custom_encapsulation_ip]]"
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=error msg="Failed to send database upgrade request" error="Patch \"https://10.75.224.171:6443/cluster/internal/database\": Unable to connect to \"10.75.224.171:6443\": dial tcp 10.75.224.171:6443: connect: connection refused"
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=error msg="Failed to send database upgrade request" error="Patch \"https://10.75.224.189:6443/cluster/internal/database\": Unable to connect to \"10.75.224.189:6443\": dial tcp 10.75.224.189:6443: connect: connection refused"
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: time="2024-05-03T17:49:37Z" level=error msg="Failed to send database upgrade request" error="Patch \"https://10.75.224.222:6443/cluster/internal/database\": Unable to connect to \"10.75.224.222:6443\": dial tcp 10.75.224.222:6443: connect: connection refused"
May 03 17:49:37 microovn-upgrade-1 ovsdb-client[19737]: ovs|00001|reconnect|INFO|unix:/var/snap/microovn/common/run/switch/db.sock: connecting...
May 03 17:49:37 microovn-upgrade-1 ovsdb-client[19737]: ovs|00002|reconnect|INFO|unix:/var/snap/microovn/common/run/switch/db.sock: connected
May 03 17:49:37 microovn-upgrade-1 ovs-vsctl[19738]: ovs|00001|vsctl|INFO|Called as ovs-vsctl set open_vswitch . external_ids:ovn-remote=
May 03 17:49:37 microovn-upgrade-1 ovs-vsctl[19738]: ovs|00002|db_ctl_base|ERR|external_ids:ovn-remote=: argument does not end in "=" followed by a value.
May 03 17:49:37 microovn-upgrade-1 microovn.daemon[19686]: Error: Daemon stopped with error: Failed to run post-start hook: Failed to update OVS's 'ovn-remote' configuration

I believe this comes from

microovn/microovn/ovn/start.go

Lines 47 to 51 in 2524d48

    
           _, err = VSCtl( 
        
           	s, 
        
           	"set", "open_vswitch", ".", 
        
           	fmt.Sprintf("external_ids:ovn-remote=%s", sbConnect), 
        
           )

and the sbConnect is just an empty string. What I find interesting though is that the function that's supposed to fetch the sbConnect

microovn/microovn/ovn/environment.go

Line 71 in 2524d48

func environmentString(s *state.State, port int) (string, string, error) {

is connecting to the database, but it's not failing. It's just returning empty strings.

Crap, this was my bad. I forgot to set PRAGMA foreign_keys = OFF before altering the internal_cluster_members table. So when the table was replaced, all tables with foreign keys pointing to that table were getting wiped thanks to ON DELETE CASCADE.

Actually microcluster should probably set that for each internal schema update since any external table we don't know about could be referencing any of our's.

masnax · 2024-05-03T21:17:11Z

@gabrielmougard I have 2 PRs up in microcluster, canonical/microcluster#123 and canonical/microcluster#122, which should fix the issues detected here.

gabrielmougard · 2024-05-03T21:20:19Z

@masnax thanks! I'll have a look

gabrielmougard · 2024-06-10T09:58:09Z

@mkalcok I updated the MicroCluster deps to use @masnax recent work, is it possible to re-run the CI?

Signed-off-by: Gabriel Mougard <[email protected]>

Signed-off-by: Gabriel Mougard <[email protected]> microovn/cmd/microovnd: Pass the MicroOVN extensions map to the MicroCluster initialization process. Signed-off-by: Gabriel Mougard <[email protected]>

gabrielmougard · 2024-06-10T13:26:19Z

@mkalcok I just ran the full check-system test suite on my machine and its all green. I let you run the CI when you think we won't be bothered by the rate limits :)

gabrielmougard · 2024-06-10T13:26:54Z

Thanks @masnax for your MicroCluster PRs !

mkalcok · 2024-06-10T14:57:13Z

/canonical/self-hosted-runners/run-workflows fdfc65c

mkalcok · 2024-06-11T11:50:09Z

@gabrielmougard reason for the failing upgrade tests seems to be that the cluster needs a bit more time to converge after internal schema upgrade. Adding

wait_microovn_online "$container" 60

here

microovn/tests/test_helper/bats/upgrade.bats

Lines 31 to 33 in 3f0c609

    
           done 
        
           perform_manual_upgrade_steps $TEST_CONTAINERS

should solve the issue.

Signed-off-by: Gabriel Mougard <[email protected]>

gabrielmougard · 2024-06-12T08:19:41Z

@mkalcok can you re-run the CI? I added the wait_microovn_online condition for each containers

mkalcok · 2024-06-12T08:20:46Z

/canonical/self-hosted-runners/run-workflows e7c67e4

fnordahl · 2024-06-12T08:41:36Z

So because the schema update that introduces API updates is part of the same update that increments the number of API updates, the update process takes at least 30s in this case.

@masnax seamless and painless upgrades are very important to our users, and as @mkalcok already pointed out we rely on microcluster to manage upgrade of the payload.

Our highest priority for this release is to make the upgrade process bullet proof, ensuring minimal data path downtime and keeping the end user informed (ref #130).

As far as I understand this PR effectively makes the microcluster unavailable for an extended period of time, without any means of informing the user of why or what to do.

This does not come across as great UX and conflicts with our main goal for this release, can the schema/extension migration process be improved in microcluster?

masnax · 2024-06-12T15:30:03Z

@fnordahl Sorry about that, I'm not sure what's happening with the upgrade process here. There was an issue earlier with the upgrade process taking a long time, but this was fixed (by this pr)

Running an upgrade locally appears instantaneous on my end, though I haven't tried it with MicroOVN's testsuite personally. I'll give it a run and let you know my findings.

fnordahl · 2024-06-12T16:06:44Z

Running an upgrade locally appears instantaneous on my end, though I haven't tried it with MicroOVN's testsuite personally. I'll give it a run and let you know my findings.

That's excellent, thank you for looking into it!

FWIW; I did a manual test just before posting the previous comment, and the cluster appeared unresponsive until all nodes were upgraded. I think that is our main issue because we really need the cluster and its CLI to be responsive throughout the upgrade process to guide our users.

masnax · 2024-06-12T22:54:48Z

FWIW; I did a manual test just before posting the previous comment, and the cluster appeared unresponsive until all systems were upgraded. I think that is our main issue because we really need the cluster and its CLI to be responsive throughout the upgrade process to guide our users.

In this case it is absolutely necessary for a cluster member to enter a very restricted state if it encounters an upgrade that changes its database schema. This is because we can't risk applying a change to the schema until we are sure that all cluster members expect the same upgrade. If any one system applies the upgrade blindly, then very abruptly all running systems will become unable to properly read from or write to the database. An additional risk is the possibility of conflicting upgrades occurring on different cluster members.

The upgrade process works as follows: after encountering a new schema upgrade, that cluster member enters a waiting state which restricts access to the database and API until it receives a notification from the final cluster member to receive the upgrade. Only at this point is the upgrade actually applied, and the database and API become open for regular access. Until this point, any non-upgraded cluster member will continue to function. One added benefit here is that since schema upgrades are non-backwards compatible, we maintain the freedom to revert the upgrade until it is applied on the final system, since nothing will have been committed until that point.

I can think of these ways to keep a user informed of the upgrade process and overall cluster status going forward:

Right now, the status errors returned by microcluster are not that descriptive. For the most part, if the cluster is not in a ready state, then the user will just see Daemon not yet initialized. The messages pertaining to the upgrade are included in the logs, but are not returned as API error messages, so we can extend these messages to more precisely report the current state of the cluster.
microovn cluster list always reports either ONLINE or UNREACHABLE so we can extend this to be a bit more nuanced about what the actual issue is. In many cases a node is reachable, but not "ready" yet we still report it as UNREACHABLE.
Additionally, we can add a core microcluster status API that can return some information to be incorporated into MicroOVN's status output.

As for the slowness, I've narrowed it down to two components:

There has been a long-standing bug in microcluster which has since been fixed, but is present in MicroOVN today, which results in systems that join a cluster too close together not receiving a local record for each other until the next heartbeat. This is occurring in the test suite because the cluster is formed in quick succession and then the snaps are refreshed. When the daemons restart, some cluster members report UNREACHABLE until the next heartbeat. You may notice if you place wait_microovn_online before the snap refresh, it will still block since some cluster members are reporting UNREACHABLE right after they have joined the cluster, but before the snap refresh.
ovn.Start takes approximately 10-20 seconds to run since the whole cluster is restarted, and microcluster will not consider a system to have "started" until its OnStart hook has completed.

mkalcok · 2024-06-13T08:01:51Z

Thank you for the detailed insights @masnax.
One question that I have (@gabrielmougard) is whether there's a possibility of avoiding schema changes in this PR at all. The way I understand it, is that you don't actually need to change the schema, you just bumped the microcluster dependency and because of that, we need a new way of representing schema extensions, which counts as a schema change.

What would be a risks/downsides of keeping the older version of microcluster and therefore avoiding the schema change completely?

gabrielmougard · 2024-06-13T09:41:18Z

@mkalcok Well, if we want to go in that direction, I don't see a clean way for a node in the cluster to know what features are enabled in a MicroOVN node. On the MicroCloud side (client querying the MicroOVN service), we could try our luck and call an API endpoint that might or might not be present in MicroOVN to expose MicroOVN's API extensions... If the endpoint exposing the extensions is there, that's great and we can read the extensions and proceed (or not) with our logic in MicroCloud. If it turns out that de deployed MicroOVN is too old and doesn't have such an endpoint (which will return in an error not found), then I guess we could consider this as a 'lack of extension' and not proceed. This is a very brittle and unclean approach IMHO. Also, this approach assumes that all the MicroOVN nodes are always perfectly aligned on the same version (since we can not rely on the strong consistency of the MicroCluster database).. what if this is not the case and that the leader with version A has the endpoint but one node is still version A-1 and has not the endpoint? Then we might proceed on MicroCloud and execute the API call to set up an underlay on the entire MicroOVN cluster which will result in a error or worse, in an inconsistent networking setup for the Geneve tunnel.

I agree with @masnax, there are a couple of improvements we can work on to make the UX better during the upgrade. But ultimately, I don't think there is a way around upgrading the schema (and bumping the version of MicroCluster) without introducing these very risky consistency issues that could even corrupt the MicroOVN networking setup..

fnordahl · 2024-06-13T10:31:32Z

We would welcome an improved schema conversion process.

If it is possible to get that done before our release, great, if not we need to be creative and think tactical if you want this included in the release.

mkalcok · 2024-06-13T11:19:44Z

@gabrielmougard my suggestion/question was more towards whether it's possible to implement the API feature without the schema change. Looking at this PR, there's no real change to the schema, it just changed because you bumped the microcluster version and now there's a new way to represent "schema extensions"

Would it be possible to add this API without bumping the microcluster library, therefore preventing the need for schema change?

gabrielmougard · 2024-06-13T12:35:29Z

Naively, I would say yes it is of course possible. Adding a new API endpoint in MicroOVN for exposing its API extensions will suffice. We'll then have to find a creative way in the client side in Microcloud to ensure everything is correct on all the MicroOVN nodes. I let @masnax correct me if I say something wrong.

mkalcok · 2024-06-13T13:06:18Z

What role does the upgraded microcluster library play in this change? Does it add any feature that makes it easier to implement "extensions" API?

mkalcok · 2024-06-13T13:12:27Z

RE merge conflicts:
To resolve the conflict that got created in the upgrade.bats, you can just move your change (the added wait_microovn_online function) here:

microovn/tests/test_helper/setup_teardown/upgrade.bash

Lines 58 to 60 in ce15d25

    
           done 
        
           perform_manual_upgrade_steps $TEST_CONTAINERS

(btw, it's sufficient to run it in single container. No need to run it in every container )

masnax · 2024-06-13T14:27:30Z

@mkalcok Just like the schema upgrade, the set of API extension must be something all cluster members need to agree on. It is effectively a record of the behaviour of the API. If any one cluster member unilaterally changes its API, then cluster-wide behaviour will be inconsistent and possibly broken.

For that to work, we need to record the API extensions of all cluster members in the global database so any one member can compare what set it expects to what all other cluster members (even currently offline ones) expect, and be instantly aware if a change happens on another system. This means a change to the schema.

So I don't think there is a way to add such a feature without reimplementing a mechanism that works in a similar way to the schema upgrade mechanism, because the API would be (and currently is) unstable when a single system runs snap refresh without coordinating with all other cluster members.

So maybe it's best to say that the real feature here is the addition and utilisation of a way to coordinate non-backwards compatible upgrades across the cluster.

It would be great to know what your requirements are for such a feature. Would the steps I outlined above be sufficient for moving this forward? That is:

A clearer error on blocked endpoints other than Daemon not yet initialized
Allow microovn status to work in a limited capacity during an upgrade and include microcluster status about the upgrade.

masnax · 2024-06-13T14:55:23Z

(btw, it's sufficient to run it in single container. No need to run it in every container )

This is actually dangerous because of the bug I mentioned earlier:

There has been a long-standing bug in microcluster which has since been fixed, but is present in MicroOVN today, which results in systems that join a cluster too close together not receiving a local record for each other until the next heartbeat. This is occurring in the test suite because the cluster is formed in quick succession and then the snaps are refreshed. When the daemons restart, some cluster members report UNREACHABLE until the next heartbeat. You may notice if you place wait_microovn_online before the snap refresh, it will still block since some cluster members are reporting UNREACHABLE right after they have joined the cluster, but before the snap refresh.

While some members might report all cluster members are ONLINE, others (particularly node 2 and node 3) may not yet trust each other, and report each other as UNREACHABLE.

Ideally, there should always be a check after cluster formation that reports all nodes are no longer in a PENDING state, and are reachable from every other system.

FWIW, if the upgrade occurs after all systems are fully reachable and set up, the total down time is only about a second more than the total down time for a restart of all systems.

mkalcok · 2024-06-13T15:39:07Z

This is actually dangerous because of the bug I mentioned earlier:

I see, thank you. I didn't fully realized how it affects the cluster list.

It would be great to know what your requirements are for such a feature.

We are trying to facilitate hassle-free upgrade of the underlying OVN cluster which, coincidentally, also involves schema upgrade of a clustered database 😆. We hold back the schema upgrade until every node in the cluster expects same version of the database. We don't want to leave the user blind during this process. For example, when 2/3 nodes in the cluster are upgraded, running cluster status outputs:

OVN Southbound: Upgrade or attention required!
Currently active schema: 20.21.0
Cluster report (expected schema versions):
	movn1: 20.33.0
	movn3: Missing API. MicroOVN needs upgrade
	movn2: 20.33.0

This lets user know that schema upgrade from 20.21.0 to 20.33.0 is pending and it's the host movn3 that still needs to be upgraded.

You touched on the issue of inconsistent APIs, in this case we deal with it here by assuming that anything that returns 404 on the endpoint that should report "expected schema version" is a node that needs to be upgraded. In the future upgrades, old hosts will simply report older schema version.

I think that if the current error message Daemon not yet initialized could be replaced with something that:

clearly states that cluster operation is limited due to not every member running same version
list of the cluster nodes that need to be upgraded

That would be something we'd be happy with

masnax · 2024-06-13T16:15:40Z

@mkalcok That all sounds reasonable to me, thank you :)

It shouldn't be hard to incorporate some cluster database elements into that cluster status endpoint, and allow it to work while a database upgrade is in progress.

One concern I have is about the cluster error message. There may be very many cluster members so reporting a list would require including some data structure in the error metadata.

Our principle has been to keep the error messages as simple strings for the most part without additional metadata that needs to be checked for and parsed. This is to prevent every error returned to the CLI needing to be checked for various types of metadata, and then transformed into a human-readable representation.

This isn't a hard rule or anything, but as an alternative do you think it would suffice to simply report a summary in the error message rather than include all cluster member statuses, of which there may be very many? As in simply report something like this:

  Cluster database upgrade to version 2 is in progress. Waiting for 5 cluster members to receive the update.

And then more detailed information can be available through the aforementioned cluster status command. Let me know what you think.

Connected to all of this, one thing we arrived at internally is that microcluster doesn't understand the difference between a restart or a reload of the daemon, so that the underlying OVN service can continue to run while the daemon is reloaded for a snap refresh. LXD follows this approach for snap refreshes to ensure instances remain online during a database upgrade, since the actual instances shouldn't care about that.

I'm unsure if MicroOVN has some internal reloading mechanism for snap refreshes? Perhaps we can formalize this in microcluster so that the daemon can detect if the intention is to simply reload, and if so we won't necessarily run the OnStop hook to shutdown ovn services, and then try to start them after the database has settled from an upgrade with the OnStart hook.

This would mean that during a microcluster-level schema upgrade, the underlying OVN service can continue to run. I'm not sure how relevant this will be to the next MicroOVN version since it appears to me that you are also performing some upgrades of OVN itself, so I'm not sure if that entails downtime anyway.

mkalcok · 2024-06-14T13:39:22Z

And then more detailed information can be available through the aforementioned cluster status command. Let me know what you think.

This sounds very reasonable to me. General error message can be terse and instruct user to run microovn status. Then if we'll have tools/API endpoints to determine which hosts still need upgrade, we can generate report about it there.

Regarding the refresh vs restart:

MicroOVN does not implement any onStop hook and I think that realistically, we want OVN services to be restarted on snap refresh, as new MicroOVN can include new version of OVN/OVS packages.
We tested regular (non-schema-changing) upgrades and dataplane downtime is minimal/acceptable (though we still aim to improve it). I don't think we tested dataplane outage in scenario with MicroOVN schema change on upgrade. I'll give it a try. Thank you.

Pending resolution of discussion in canonical#121. Signed-off-by: Frode Nordahl <[email protected]>

This was referenced Feb 16, 2024

microovn/ovn: support OVS ovn-encap-ip configuration from external initConfig #113

Merged

Add a new 'runtime extension' system to MicroCluster canonical/microcluster#86

Merged

gabrielmougard force-pushed the feat/microovn-extensions branch from 8eaf092 to a54d5c9 Compare February 20, 2024 13:31

gabrielmougard marked this pull request as ready for review February 20, 2024 13:32

gabrielmougard requested a review from a team as a code owner February 20, 2024 13:32

gabrielmougard force-pushed the feat/microovn-extensions branch from a54d5c9 to b30bb71 Compare April 16, 2024 09:02

gabrielmougard force-pushed the feat/microovn-extensions branch from b30bb71 to 4a8e57d Compare April 29, 2024 08:44

gabrielmougard changed the title ~~Api: Adding support for 'api extensions'~~ api: Adding support for 'api extensions' Apr 29, 2024

gabrielmougard force-pushed the feat/microovn-extensions branch from 4a8e57d to 899b41d Compare May 2, 2024 07:45

mkalcok reviewed May 3, 2024

View reviewed changes

gabrielmougard force-pushed the feat/microovn-extensions branch from 899b41d to eaf8922 Compare May 3, 2024 10:03

gabrielmougard force-pushed the feat/microovn-extensions branch from eaf8922 to 226bb44 Compare June 10, 2024 09:55

gabrielmougard added 2 commits June 10, 2024 15:24

refactor: Update MicroCluster methods

715009c

Signed-off-by: Gabriel Mougard <[email protected]>

microovn/api: Add MicroOVN extensions system for MicroCluster

fdfc65c

Signed-off-by: Gabriel Mougard <[email protected]> microovn/cmd/microovnd: Pass the MicroOVN extensions map to the MicroCluster initialization process. Signed-off-by: Gabriel Mougard <[email protected]>

gabrielmougard force-pushed the feat/microovn-extensions branch from c910da2 to fdfc65c Compare June 10, 2024 13:24

test: wait for the container to be in a ready state after the upgrade

e7c67e4

Signed-off-by: Gabriel Mougard <[email protected]>

fnordahl mentioned this pull request Jun 14, 2024

Mix of reef snap versions results in microceph.daemon service failures canonical/microceph#367

Closed

fnordahl added a commit to fnordahl/microovn that referenced this pull request Jun 17, 2024

go.mod: Run update-gomod with pinned microcluster.

835ef4b

Pending resolution of discussion in canonical#121. Signed-off-by: Frode Nordahl <[email protected]>

masnax mentioned this pull request Jun 25, 2024

Update microcluster dependency #143

Merged

gabrielmougard closed this Jun 26, 2024

api: Adding support for 'api extensions' #121

api: Adding support for 'api extensions' #121

Conversation

gabrielmougard commented Feb 16, 2024 • edited Loading

mkalcok commented Feb 20, 2024

mkalcok commented Feb 20, 2024

gabrielmougard commented Feb 20, 2024 • edited Loading

gabrielmougard commented May 2, 2024

mkalcok commented May 2, 2024

gabrielmougard commented May 3, 2024 • edited Loading

mkalcok commented May 3, 2024

mkalcok May 3, 2024 • edited Loading

Choose a reason for hiding this comment

gabrielmougard commented May 3, 2024

mkalcok commented May 3, 2024

mkalcok commented May 3, 2024

gabrielmougard commented May 3, 2024 • edited Loading

mkalcok commented May 3, 2024 • edited Loading

masnax commented May 3, 2024 • edited Loading

mkalcok commented May 3, 2024

masnax commented May 3, 2024

masnax commented May 3, 2024

gabrielmougard commented May 3, 2024

gabrielmougard commented Jun 10, 2024

gabrielmougard commented Jun 10, 2024

gabrielmougard commented Jun 10, 2024

mkalcok commented Jun 10, 2024

mkalcok commented Jun 11, 2024

gabrielmougard commented Jun 12, 2024

mkalcok commented Jun 12, 2024

fnordahl commented Jun 12, 2024

masnax commented Jun 12, 2024

fnordahl commented Jun 12, 2024

masnax commented Jun 12, 2024 • edited Loading

mkalcok commented Jun 13, 2024

gabrielmougard commented Jun 13, 2024 • edited Loading

fnordahl commented Jun 13, 2024

mkalcok commented Jun 13, 2024 • edited Loading

gabrielmougard commented Jun 13, 2024

mkalcok commented Jun 13, 2024

mkalcok commented Jun 13, 2024

masnax commented Jun 13, 2024 • edited Loading

masnax commented Jun 13, 2024 • edited Loading

mkalcok commented Jun 13, 2024

masnax commented Jun 13, 2024

mkalcok commented Jun 14, 2024

gabrielmougard commented Feb 16, 2024 •

edited

Loading

gabrielmougard commented Feb 20, 2024 •

edited

Loading

gabrielmougard commented May 3, 2024 •

edited

Loading

mkalcok May 3, 2024 •

edited

Loading

gabrielmougard commented May 3, 2024 •

edited

Loading

mkalcok commented May 3, 2024 •

edited

Loading

masnax commented May 3, 2024 •

edited

Loading

masnax commented Jun 12, 2024 •

edited

Loading

gabrielmougard commented Jun 13, 2024 •

edited

Loading

mkalcok commented Jun 13, 2024 •

edited

Loading

masnax commented Jun 13, 2024 •

edited

Loading

masnax commented Jun 13, 2024 •

edited

Loading