Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ubuntu/jammy] revert single process #5589

Merged
merged 4 commits into from
Aug 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions debian/changelog
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@ cloud-init (24.2-0ubuntu1~22.04.1) jammy; urgency=medium

* d/control: remove netifaces due to GH-4634
* d/p/deprecation-version-boundary.patch: Pin deprecation version to 22.1
* d/p/no-single-process.patch: Remove single process optimization
* d/p/no-nocloud-network.patch: Remove nocloud network feature
Comment on lines +5 to +6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops we also added this to the already released d/changelog entry, not the UNRELEASED entry. Will followup with a fix PR including both of these comments.

* drop d/p/do-not-block-user-login.patch: upstream 4981ea9 now orders
cloud-init.service Before=systemd-user-sessions.service
* refresh patches:
Expand Down
26 changes: 26 additions & 0 deletions debian/patches/no-nocloud-network.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
Description: Don't allow network-config
This may add a new wait time for a file that doesn't exist on existing series
so patch it out.

Author: Brett Holman <[email protected]>
Last-Update: 2024-08-02

--- a/cloudinit/sources/DataSourceNoCloud.py
+++ b/cloudinit/sources/DataSourceNoCloud.py
@@ -190,7 +190,7 @@

# This could throw errors, but the user told us to do it
# so if errors are raised, let them raise
- md_seed, ud, vd, network = util.read_seeded(seedfrom, timeout=None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't actually stop cloud-init from trying to continue reaching out and blocking on remote I/O network-config requests which could lead to errors in worst case, but in best case still spends cloud-init cycles during boot requesting a network-config file and then ignoring it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I was hoping is that we minimally avoid reaching out to request network-config from any nocloud endpoint to avoid exposure to error conditions, and avoid the cost of that round-trip only to ignore that network-config value at all callsites

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--- a/debian/patches/no-nocloud-network.patch
+++ b/debian/patches/no-nocloud-network.patch
@@ -7,7 +7,7 @@ Last-Update: 2024-08-02
 
 --- a/cloudinit/sources/DataSourceNoCloud.py
 +++ b/cloudinit/sources/DataSourceNoCloud.py
-@@ -190,7 +190,7 @@
+@@ -190,7 +190,7 @@ class DataSourceNoCloud(sources.DataSour
  
              # This could throw errors, but the user told us to do it
              # so if errors are raised, let them raise
@@ -16,7 +16,7 @@ Last-Update: 2024-08-02
              LOG.debug("Using seeded cache data from %s", seedfrom)
  
              # Values in the command line override those from the seed
-@@ -199,7 +199,6 @@
+@@ -199,7 +199,6 @@ class DataSourceNoCloud(sources.DataSour
              )
              mydata["user-data"] = ud
              mydata["vendor-data"] = vd
@@ -24,3 +24,100 @@ Last-Update: 2024-08-02
              found.append(seedfrom)
  
          # Now that we have exhausted any other places merge in the defaults
+--- a/cloudinit/util.py
++++ b/cloudinit/util.py
+@@ -1059,7 +1059,6 @@ def read_seeded(base="", ext="", timeout
+         ud_url = base.replace("%s", "user-data" + ext)
+         vd_url = base.replace("%s", "vendor-data" + ext)
+         md_url = base.replace("%s", "meta-data" + ext)
+-        network_url = base.replace("%s", "network-config" + ext)
+     else:
+         if features.NOCLOUD_SEED_URL_APPEND_FORWARD_SLASH:
+             if base[-1] != "/" and parse.urlparse(base).query == "":
+@@ -1068,17 +1067,7 @@ def read_seeded(base="", ext="", timeout
+         ud_url = "%s%s%s" % (base, "user-data", ext)
+         vd_url = "%s%s%s" % (base, "vendor-data", ext)
+         md_url = "%s%s%s" % (base, "meta-data", ext)
+-        network_url = "%s%s%s" % (base, "network-config", ext)
+     network = None
+-    try:
+-        network_resp = url_helper.read_file_or_url(
+-            network_url, timeout=timeout, retries=retries
+-        )
+-    except url_helper.UrlError as e:
+-        LOG.debug("No network config provided: %s", e)
+-    else:
+-        if network_resp.ok():
+-            network = load_yaml(network_resp.contents)
+     md_resp = url_helper.read_file_or_url(
+         md_url, timeout=timeout, retries=retries
+     )
+--- a/tests/unittests/test_util.py
++++ b/tests/unittests/test_util.py
+@@ -2476,7 +2476,7 @@ class TestReadSeeded:
+         assert found_md == {"key1": "val1"}
+         assert found_ud == ud
+         assert found_vd == vd
+-        assert found_network == {"test": "true"}
++        assert found_network is None
+ 
+     @pytest.mark.parametrize(
+         "base, feature_flag, req_urls",
+@@ -2485,7 +2485,6 @@ class TestReadSeeded:
+                 "http://10.0.0.1/%s?qs=1",
+                 True,
+                 [
+-                    "http://10.0.0.1/network-config?qs=1",
+                     "http://10.0.0.1/meta-data?qs=1",
+                     "http://10.0.0.1/user-data?qs=1",
+                     "http://10.0.0.1/vendor-data?qs=1",
+@@ -2496,7 +2495,6 @@ class TestReadSeeded:
+                 "https://10.0.0.1:8008/",
+                 True,
+                 [
+-                    "https://10.0.0.1:8008/network-config",
+                     "https://10.0.0.1:8008/meta-data",
+                     "https://10.0.0.1:8008/user-data",
+                     "https://10.0.0.1:8008/vendor-data",
+@@ -2507,7 +2505,6 @@ class TestReadSeeded:
+                 "https://10.0.0.1:8008",
+                 True,
+                 [
+-                    "https://10.0.0.1:8008/network-config",
+                     "https://10.0.0.1:8008/meta-data",
+                     "https://10.0.0.1:8008/user-data",
+                     "https://10.0.0.1:8008/vendor-data",
+@@ -2518,7 +2515,6 @@ class TestReadSeeded:
+                 "https://10.0.0.1:8008",
+                 False,
+                 [
+-                    "https://10.0.0.1:8008network-config",
+                     "https://10.0.0.1:8008meta-data",
+                     "https://10.0.0.1:8008user-data",
+                     "https://10.0.0.1:8008vendor-data",
+@@ -2529,7 +2525,6 @@ class TestReadSeeded:
+                 "https://10.0.0.1:8008?qs=",
+                 True,
+                 [
+-                    "https://10.0.0.1:8008?qs=network-config",
+                     "https://10.0.0.1:8008?qs=meta-data",
+                     "https://10.0.0.1:8008?qs=user-data",
+                     "https://10.0.0.1:8008?qs=vendor-data",
+@@ -2568,7 +2563,7 @@ class TestReadSeeded:
+         # user-data, vendor-data read raw. It could be scripts or other format
+         assert found_ud == "/user-data: 1"
+         assert found_vd == "/vendor-data: 1"
+-        assert found_network == {"/network-config": 1}
++        assert found_network is None
+         assert [
+             mock.call(req_url, timeout=5, retries=10) for req_url in req_urls
+         ] == m_read.call_args_list
+@@ -2598,7 +2593,7 @@ class TestReadSeededWithoutVendorData(he
+         self.assertEqual(found_md, {"key1": "val1"})
+         self.assertEqual(found_ud, ud)
+         self.assertEqual(found_vd, vd)
+-        self.assertEqual(found_network, {"test": "true"})
++        self.assertIsNone(found_network)
+ 
+ 
+ class TestEncode(helpers.TestCase):

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll propose a followup PR to address this issue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed #5594 to avoid network-config requests for nocloud

+ md_seed, ud, vd, _ = util.read_seeded(seedfrom, timeout=None)
LOG.debug("Using seeded cache data from %s", seedfrom)

# Values in the command line override those from the seed
@@ -199,7 +199,6 @@
)
mydata["user-data"] = ud
mydata["vendor-data"] = vd
- mydata["network-config"] = network
found.append(seedfrom)

# Now that we have exhausted any other places merge in the defaults
253 changes: 253 additions & 0 deletions debian/patches/no-single-process.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,253 @@
Description: remove single process optimization
This optimization is a big change in behavior, patch it out.

Author: Brett Holman <[email protected]>
Last-Update: 2024-08-02

--- a/systemd/cloud-config.service.tmpl
+++ b/systemd/cloud-config.service.tmpl
@@ -10,14 +10,7 @@

[Service]
Type=oneshot
-# This service is a shim which preserves systemd ordering while allowing a
-# single Python process to run cloud-init's logic. This works by communicating
-# with the cloud-init process over a unix socket to tell the process that this
-# stage can start, and then wait on a return socket until the cloud-init
-# process has completed this stage. The output from the return socket is piped
-# into a shell so that the process can send a completion message (defaults to
-# "done", otherwise includes an error message) and an exit code to systemd.
-ExecStart=sh -c 'echo "start" | nc.openbsd -Uu -W1 /run/cloud-init/share/config.sock -s /run/cloud-init/share/config-return.sock | sh'
+ExecStart=/usr/bin/cloud-init modules --mode=config
RemainAfterExit=yes
TimeoutSec=0

--- a/systemd/cloud-final.service.tmpl
+++ b/systemd/cloud-final.service.tmpl
@@ -15,16 +15,10 @@

[Service]
Type=oneshot
-# This service is a shim which preserves systemd ordering while allowing a
-# single Python process to run cloud-init's logic. This works by communicating
-# with the cloud-init process over a unix socket to tell the process that this
-# stage can start, and then wait on a return socket until the cloud-init
-# process has completed this stage. The output from the return socket is piped
-# into a shell so that the process can send a completion message (defaults to
-# "done", otherwise includes an error message) and an exit code to systemd.
-ExecStart=sh -c 'echo "start" | nc.openbsd -Uu -W1 /run/cloud-init/share/final.sock -s /run/cloud-init/share/final-return.sock | sh'
+ExecStart=/usr/bin/cloud-init modules --mode=final
RemainAfterExit=yes
TimeoutSec=0
+KillMode=process
{% if variant in ["almalinux", "cloudlinux", "rhel"] %}
# Restart NetworkManager if it is present and running.
ExecStartPost=/bin/sh -c 'u=NetworkManager.service; \
--- a/systemd/cloud-init-local.service.tmpl
+++ b/systemd/cloud-init-local.service.tmpl
@@ -7,6 +7,7 @@
{% endif %}
Wants=network-pre.target
After=hv_kvp_daemon.service
+After=systemd-remount-fs.service
{% if variant in ["almalinux", "cloudlinux", "rhel"] %}
Requires=dbus.socket
After=dbus.socket
@@ -37,14 +38,7 @@
ExecStartPre=/sbin/restorecon /run/cloud-init
ExecStartPre=/usr/bin/touch /run/cloud-init/enabled
{% endif %}
-# This service is a shim which preserves systemd ordering while allowing a
-# single Python process to run cloud-init's logic. This works by communicating
-# with the cloud-init process over a unix socket to tell the process that this
-# stage can start, and then wait on a return socket until the cloud-init
-# process has completed this stage. The output from the return socket is piped
-# into a shell so that the process can send a completion message (defaults to
-# "done", otherwise includes an error message) and an exit code to systemd.
-ExecStart=sh -c 'echo "start" | nc.openbsd -Uu -W1 /run/cloud-init/share/local.sock -s /run/cloud-init/share/local-return.sock | sh'
+ExecStart=/usr/bin/cloud-init init --local
RemainAfterExit=yes
TimeoutSec=0

--- a/systemd/cloud-init-main.service.tmpl
+++ /dev/null
@@ -1,52 +0,0 @@
-## template:jinja
-# systemd ordering resources
-# ==========================
-# https://systemd.io/NETWORK_ONLINE/
-# https://docs.cloud-init.io/en/latest/explanation/boot.html
-# https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
-# https://www.freedesktop.org/software/systemd/man/latest/systemd.special.html
-# https://www.freedesktop.org/software/systemd/man/latest/systemd-remount-fs.service.html
-[Unit]
-Description=Cloud-init: Single Process
-Wants=network-pre.target
-{% if variant in ["almalinux", "cloudlinux", "ubuntu", "unknown", "debian", "rhel"] %}
-DefaultDependencies=no
-{% endif %}
-{% if variant in ["almalinux", "cloudlinux", "rhel"] %}
-Requires=dbus.socket
-After=dbus.socket
-Before=network.service
-Before=firewalld.target
-Conflicts=shutdown.target
-{% endif %}
-{% if variant in ["ubuntu", "unknown", "debian"] %}
-Before=sysinit.target
-Conflicts=shutdown.target
-{% endif %}
-
-After=systemd-remount-fs.service
-Before=sysinit.target
-Before=cloud-init-local.service
-Conflicts=shutdown.target
-RequiresMountsFor=/var/lib/cloud
-ConditionPathExists=!/etc/cloud/cloud-init.disabled
-ConditionKernelCommandLine=!cloud-init=disabled
-ConditionEnvironment=!KERNEL_CMDLINE=cloud-init=disabled
-
-[Service]
-Type=notify
-ExecStart=/usr/bin/cloud-init --all-stages
-KillMode=process
-TasksMax=infinity
-TimeoutStartSec=infinity
-{% if variant in ["almalinux", "cloudlinux", "rhel"] %}
-ExecStartPre=/bin/mkdir -p /run/cloud-init
-ExecStartPre=/sbin/restorecon /run/cloud-init
-ExecStartPre=/usr/bin/touch /run/cloud-init/enabled
-{% endif %}
-
-# Output needs to appear in instance console output
-StandardOutput=journal+console
-
-[Install]
-WantedBy=cloud-init.target
--- a/systemd/cloud-init-network.service.tmpl
+++ /dev/null
@@ -1,64 +0,0 @@
-## template:jinja
-[Unit]
-# https://cloudinit.readthedocs.io/en/latest/explanation/boot.html
-Description=Cloud-init: Network Stage
-{% if variant not in ["almalinux", "cloudlinux", "photon", "rhel"] %}
-DefaultDependencies=no
-{% endif %}
-Wants=cloud-init-local.service
-Wants=sshd-keygen.service
-Wants=sshd.service
-After=cloud-init-local.service
-After=systemd-networkd-wait-online.service
-{% if variant in ["ubuntu", "unknown", "debian"] %}
-After=networking.service
-{% endif %}
-{% if variant in ["almalinux", "centos", "cloudlinux", "eurolinux", "fedora",
- "miraclelinux", "openeuler", "OpenCloudOS", "openmandriva", "rhel", "rocky",
- "suse", "TencentOS", "virtuozzo"] %}
-
-After=network.service
-After=NetworkManager.service
-After=NetworkManager-wait-online.service
-{% endif %}
-{% if variant in ["suse"] %}
-After=wicked.service
-# setting hostname via hostnamectl depends on dbus, which otherwise
-# would not be guaranteed at this point.
-After=dbus.service
-{% endif %}
-Before=network-online.target
-Before=sshd-keygen.service
-Before=sshd.service
-Before=systemd-user-sessions.service
-{% if variant in ["ubuntu", "unknown", "debian"] %}
-Before=sysinit.target
-Before=shutdown.target
-Conflicts=shutdown.target
-{% endif %}
-{% if variant in ["suse"] %}
-Before=shutdown.target
-Conflicts=shutdown.target
-{% endif %}
-ConditionPathExists=!/etc/cloud/cloud-init.disabled
-ConditionKernelCommandLine=!cloud-init=disabled
-ConditionEnvironment=!KERNEL_CMDLINE=cloud-init=disabled
-
-[Service]
-Type=oneshot
-# This service is a shim which preserves systemd ordering while allowing a
-# single Python process to run cloud-init's logic. This works by communicating
-# with the cloud-init process over a unix socket to tell the process that this
-# stage can start, and then wait on a return socket until the cloud-init
-# process has completed this stage. The output from the return socket is piped
-# into a shell so that the process can send a completion message (defaults to
-# "done", otherwise includes an error message) and an exit code to systemd.
-ExecStart=sh -c 'echo "start" | nc.openbsd -Uu -W1 /run/cloud-init/share/network.sock -s /run/cloud-init/share/network-return.sock | sh'
-RemainAfterExit=yes
-TimeoutSec=0
-
-# Output needs to appear in instance console output
-StandardOutput=journal+console
-
-[Install]
-WantedBy=cloud-init.target
--- /dev/null
+++ b/systemd/cloud-init.service.tmpl
@@ -0,0 +1,57 @@
+## template:jinja
+[Unit]
+# https://cloudinit.readthedocs.io/en/latest/explanation/boot.html
+Description=Cloud-init: Network Stage
+{% if variant not in ["almalinux", "cloudlinux", "photon", "rhel"] %}
+DefaultDependencies=no
+{% endif %}
+Wants=cloud-init-local.service
+Wants=sshd-keygen.service
+Wants=sshd.service
+After=cloud-init-local.service
+After=systemd-networkd-wait-online.service
+{% if variant in ["ubuntu", "unknown", "debian"] %}
+After=networking.service
+{% endif %}
+{% if variant in ["almalinux", "centos", "cloudlinux", "eurolinux", "fedora",
+ "miraclelinux", "openeuler", "OpenCloudOS", "openmandriva", "rhel", "rocky",
+ "suse", "TencentOS", "virtuozzo"] %}
+
+After=network.service
+After=NetworkManager.service
+After=NetworkManager-wait-online.service
+{% endif %}
+{% if variant in ["suse"] %}
+After=wicked.service
+# setting hostname via hostnamectl depends on dbus, which otherwise
+# would not be guaranteed at this point.
+After=dbus.service
+{% endif %}
+Before=network-online.target
+Before=sshd-keygen.service
+Before=sshd.service
+Before=systemd-user-sessions.service
+{% if variant in ["ubuntu", "unknown", "debian"] %}
+Before=sysinit.target
+Before=shutdown.target
+Conflicts=shutdown.target
+{% endif %}
+{% if variant in ["suse"] %}
+Before=shutdown.target
+Conflicts=shutdown.target
+{% endif %}
+ConditionPathExists=!/etc/cloud/cloud-init.disabled
+ConditionKernelCommandLine=!cloud-init=disabled
+ConditionEnvironment=!KERNEL_CMDLINE=cloud-init=disabled
+
+[Service]
+Type=oneshot
+ExecStart=/usr/bin/cloud-init init
+RemainAfterExit=yes
+TimeoutSec=0
+
+# Output needs to appear in instance console output
+StandardOutput=journal+console
+
+[Install]
+WantedBy=cloud-init.target
2 changes: 2 additions & 0 deletions debian/patches/series
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,5 @@ status-retain-recoverable-error-exit-code.patch
retain-ec2-default-net-update-events.patch
cli-retain-file-argument-as-main-cmd-arg.patch
deprecation-version-boundary.patch
no-single-process.patch
no-nocloud-network.patch
Loading