Bump to kind 0.20.0 #1297

Merged
merged 1 commit into from
Nov 20, 2023


Member

@skitt skitt commented Jul 3, 2023

This adds support for Kubernetes 1.28.

Depends on ovn-org/ovn-kubernetes#4015

@skitt skitt added the e2e-projects Run E2E tests in each consuming project label Jul 3, 2023
@submariner-bot

🤖 Created branch: z_pr1297/skitt/kind-0.20

@dfarrell07
Member

Deployment (ovn) failed, re-running.

@dfarrell07
Member

Deployment (ovn) failed again.

@dfarrell07
Member

Deployment (ovn) failed again, seems to be consistent.

@tpantelis tpantelis enabled auto-merge (rebase) July 13, 2023 17:50
@skitt skitt mentioned this pull request Jul 24, 2023
@sridhargaddam
Member

@aswinsuryan the OVN cluster deployment seems to be failing. Are you aware of any recent changes in the OVN repo?

2023-07-28T12:39:25.4713041Z [cluster2] Deleted nodes: ["cluster2-worker" "cluster2-worker2" "cluster2-control-plane"]
2023-07-28T12:39:25.4713315Z [12:39:24.909] [dir=shipyard; cl=cluster2; fn=delete_cluster_on_fail]$ return 1
2023-07-28T12:39:25.4713569Z [cluster2] ERROR: Max attempts reached, failed to run 'provider_create_cluster'!

@tpantelis
Contributor

tpantelis commented Aug 11, 2023

The OVN job consistently fails in this PR (oddly with very little output; it seems to time out) but didn't fail in the dependabot bump PR. The difference is that this PR updates the hashes in scripts/shared/lib/clusters_kind.

@aswinsuryan
Contributor

It may be solved when dfarrell07#55 goes in. The OVN image we use is an old one, as it doesn't get updated in quay for every PR merged in OVN. With this one we are building the image with the latest code.

@tpantelis
Contributor

> It may be solved when dfarrell07#55 goes in. The OVN image we use is an old one, as it doesn't get updated in quay for every PR merged in OVN. With this one we are building the image with the latest code.

Ok but does that explain why the same job for #1333 succeeds?

@aswinsuryan
Contributor

> > It may be solved when dfarrell07#55 goes in. The OVN image we use is an old one, as it doesn't get updated in quay for every PR merged in OVN. With this one we are building the image with the latest code.

> Ok but does that explain why the same job for #1333 succeeds?

Oh, that is strange; I would expect it to be the same.

@tpantelis
Contributor

> > > It may be solved when dfarrell07#55 goes in. The OVN image we use is an old one, as it doesn't get updated in quay for every PR merged in OVN. With this one we are building the image with the latest code.

> > Ok but does that explain why the same job for #1333 succeeds?

> Oh, that is strange; I would expect it to be the same.

This PR also updated the kind version hashes in scripts/shared/lib/clusters_kind so I'm sure that's the reason. So something in the newer version is causing an issue with OVN.

@mkolesnik
Collaborator

> Ok but does that explain why the same job for #1333 succeeds?

I might be wrong, but AFAICT the dependabot updates the go dependencies, while this one updated the actual K8s env that gets deployed, hence the first one is fine while this one clearly has some issues with OVN (which might get fixed with Daniel's patch)

> It may be solved when dfarrell07#55 goes in. The OVN image we use is an old one, as it doesn't get updated in quay for every PR merged in OVN. With this one we are building the image with the latest code.

@dfarrell07 maybe based on this info we want to always rebuild the OVN image, not just on IC mode?

@aswinsuryan aswinsuryan mentioned this pull request Aug 15, 2023
@dfarrell07
Member

> > Ok but does that explain why the same job for #1333 succeeds?

> I might be wrong, but AFAICT the dependabot updates the go dependencies, while this one updated the actual K8s env that gets deployed, hence the first one is fine while this one clearly has some issues with OVN (which might get fixed with Daniel's patch)

> > It may be solved when dfarrell07#55 goes in. The OVN image we use is an old one, as it doesn't get updated in quay for every PR merged in OVN. With this one we are building the image with the latest code.

> @dfarrell07 maybe based on this info we want to always rebuild the OVN image, not just on IC mode?

Maybe after the OVN IC PR goes in we can rebase this to test it with both the rebuilt and old images.

@dfarrell07
Member

I think we need an owner to rebase this and resolve the conflicts to see it run with both the rebuilt OVN image and the old one.

@tpantelis
Contributor

> I think we need an owner to rebase this and resolve the conflicts to see it run with both the rebuilt OVN image and the old one.

I rebased but it still fails.

@skitt
Member Author

skitt commented Aug 16, 2023

> > Ok but does that explain why the same job for #1333 succeeds?

> I might be wrong, but AFAICT the dependabot updates the go dependencies, while this one updated the actual K8s env that gets deployed, hence the first one is fine while this one clearly has some issues with OVN (which might get fixed with Daniel's patch)

Since #1302, the Go dependencies determine the versions of tools we use (the idea being that dependabot then keeps track of updates for us). So #1333 also changes the version of kind deployed in the environment. Without the hash changes however, the k8s images being deployed don’t change. I haven’t checked in detail what the actual changes are, but as a general rule we’re supposed to use the hashes given with each kind release (because images are often rebuilt to handle changes in kind itself).
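
To illustrate the point about hashes: each kind release publishes node images together with their digests, and the deployment script pins a Kubernetes version to one of them. Below is a minimal sketch of such a lookup, assuming a hypothetical `node_image_for` helper (the actual table in scripts/shared/lib/clusters_kind may be structured differently). The only digest shown is the 1.26.6 one quoted elsewhere in this conversation.

```sh
# Hypothetical sketch: map a Kubernetes minor version to a node image pinned
# by digest. Only the 1.26 entry is filled in, using the digest mentioned in
# this conversation; other versions would be added from the kind release notes.
node_image_for() {
    case "$1" in
        1.26)
            echo "kindest/node:v1.26.6@sha256:6e2d8b28a5b601defe327b98bd1c2d1930b49e5d8c512e1895099e4504007adb"
            ;;
        *)
            # Refuse to fall back to an unpinned tag.
            echo "no pinned kind image for Kubernetes $1" >&2
            return 1
            ;;
    esac
}
```

Failing on an unknown version, rather than falling back to a bare tag, keeps the deployed image tied to a digest that was published for the kind version in use.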

@stale

stale bot commented Sep 17, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Sep 17, 2023
@dfarrell07 dfarrell07 removed the wontfix This will not be worked on label Sep 19, 2023
@dfarrell07
Member

We're still seeing weirdness with OVN CI (K8s networkplugin doesn't come up).

@Jaanki
Contributor

Jaanki commented Nov 15, 2023

@aswinsuryan Can you please take a look at this one as a priority? We need it to start testing k8s 1.28.

@skitt skitt force-pushed the kind-0.20 branch 2 times, most recently from 71d8ff8 to 59281f7 November 15, 2023 07:57
@aswinsuryan
Contributor

aswinsuryan commented Nov 17, 2023

The ovn-kube-node pod fails to come up due to this error.

2798 ovnkube.go:136] failed to start node network manager: failed to start default node network controller: failed to create IPTablesHelper for proto 0: could not get iptables version: exit status 127

It works fine with 1.26.3 but not with kindest/node:v1.26.6@sha256:6e2d8b28a5b601defe327b98bd1c2d1930b49e5d8c512e1895099e4504007adb. This requires a fix in ovn-kubernetes.
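
For context, exit status 127 is what a POSIX shell reports when a command cannot be found, so the error is consistent with iptables not being on the search path in the newer image. Here is a small sketch that reproduces the status, using a hypothetical `tool_status` helper; the directories passed to it are stand-ins, not the real image layout.

```sh
# Hypothetical helper: try to run "<tool> --version" with a given PATH and
# print the resulting exit status. A POSIX shell yields 127 when the command
# cannot be found in any PATH directory.
tool_status() {
    tool="$1"
    search_path="$2"
    status=0
    # Subshell so the PATH change does not leak into the caller.
    ( PATH="$search_path"; "$tool" --version >/dev/null 2>&1 ) || status=$?
    echo "$status"
}
```

Calling `tool_status iptables "$dir"` with a directory that does not contain iptables prints 127; adding the directory that does contain it to the second argument makes the lookup succeed.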

@skitt
Member Author

skitt commented Nov 17, 2023

@aswinsuryan thanks for the investigation! Do you know what the fix needs to be? I haven’t found a difference between the 1.26.3 and 1.26.6 images which would explain the change in behaviour...

@skitt
Member Author

skitt commented Nov 17, 2023

Ah, ovn-org/ovn-kubernetes#4015


This PR/issue depends on:

@skitt
Member Author

skitt commented Nov 17, 2023

I bumped ovn-kubernetes to the tip of the main branch, with the iptables fix, but it’s still failing to deploy (and I’ve run out of time to look into this, I’ll loop back next week unless someone else figures it out).

@aswinsuryan
Contributor

aswinsuryan commented Nov 17, 2023

> I’ll loop back next week unless someone else figures it out).

Now it is failing at:

./daemonset.sh: line 824: openssl: command not found (ovn-kubernetes/dist/images/daemonset.sh)

We use the -ric (run-in-container) option, and it seems the container where it runs is missing the command.

Adding this (in scripts/shared/lib/clusters_kind) solves it, but I'm not sure if it is something we want to do:

delete_cluster_on_fail ./ovn-kubernetes/contrib/kind.sh -ov "$OVN_IMAGE" -cn "${KIND_CLUSTER_NAME}" -ric "${ovn_flags[@]}" -lr -dd "${KIND_CLUSTER_NAME}.local" --disable-ovnkube-identity
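
A failure like the missing openssl could also be surfaced before the deployment starts. The sketch below is only an illustration, with a hypothetical `missing_tools` helper that is not part of shipyard or ovn-kubernetes; it lists which required commands are absent from the current environment.

```sh
# Hypothetical preflight helper: print the names of the given tools that
# cannot be found on the current PATH, separated by spaces.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            # Append with a separating space only when the list is non-empty.
            missing="${missing:+$missing }$tool"
        fi
    done
    echo "$missing"
}
```

A caller could run `missing_tools openssl iptables` inside the container and abort with a clear message when the output is non-empty, instead of failing partway through daemonset.sh.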

@skitt
Member Author

skitt commented Nov 20, 2023

> Adding this (in scripts/shared/lib/clusters_kind) solves it, but I'm not sure if it is something we want to do:

> delete_cluster_on_fail ./ovn-kubernetes/contrib/kind.sh -ov "$OVN_IMAGE" -cn "${KIND_CLUSTER_NAME}" -ric "${ovn_flags[@]}" -lr -dd "${KIND_CLUSTER_NAME}.local" --disable-ovnkube-identity

Adding --disable-ovnkube-identity disables https://github.com/ovn-org/ovn-kubernetes/blob/master/docs/node-identity.md which doesn’t seem to be a problem for tests.

This adds support for Kubernetes 1.28.

ovn-kubernetes needs to be bumped to a version with support for the
new Debian 11-based kind images (iptables is only in /usr/sbin).
openssl is no longer available, which causes deployments to fail when
trying to set up certificates for node identity; disable that for now
(this might be fixed with Debian 12-based images in kind 0.21).

Signed-off-by: Stephen Kitt <[email protected]>
@skitt skitt merged commit 1180650 into submariner-io:devel Nov 20, 2023
54 of 55 checks passed
@submariner-bot

🤖 Closed branches: [z_pr1297/skitt/kind-0.20]

@skitt skitt deleted the kind-0.20 branch November 20, 2023 10:31
Labels
e2e-projects Run E2E tests in each consuming project
Projects
No open projects
Status: Done
9 participants