Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docs] Add explanation about cloud-config #5403

Closed
wants to merge 7 commits into from

Conversation

s-makin
Copy link
Contributor

@s-makin s-makin commented Jun 13, 2024

Proposed Commit Message

docs: Add explanation of cloud-config

- The cloud-config user data file creates a lot of confusion
- Want to clarify what it is, how it's used and how to structure

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

.. code-block:: yaml

#cloud-config
<put example cloud-config here>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: What modules should I include to make it a useful example? Does it matter for an example (i.e., can I use "any", or should it be an actual working file?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be helpful to represent an example with runcmd: write_files and ssh_authorized_keys: chpasswd as they seem to be the most-searched keys in documentation. We can provide a working example that we can document here to detail the operations performed and expected config as a result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added an example using those modules (plus some extra ones). I'm happy for us to add some more explanation to the inline comments but I don't know that we necessarily want to add a huge amount of explanation here, I'd rather save that for the example library. What I think we should have here is an illustrative example that shows how a reasonably complete cloud-config file could look. Let me know if you think this example suits that purpose (or any other tweaks you'd like me to make to it)

@blackboxsw blackboxsw self-assigned this Jun 19, 2024
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass on this looks pretty good @s-makin. I think we may have a bit of back and forth on the module ordering and inter-dependency discussion, because the user/groups ordering is sort of a special case where ordering actually matters. Every other module is strictly ordered in cloud.cfg to make sure it deals with most cross-module dependencies that may exist to ensure all dependencies are satisfied before starting a new config module. The only unique cases we may have are within a single module, such as user_groups (and I'm sure there may be other minor corner cases in other modules) where some hard-coded operations may require that something else is already present. Other cross-module dependencies we may look to call out could be things like:

  • bootcmd can't expect to reference utilities installed by packages: because packages module is run later that bootcmd. Or
  • default write_files ordering can't write to a directory that happens to be installed and created packages installed by 'packages:, sowrite_files:grew adeferred: trueoption which allows the file creation to happen during the cloud.cfgwrite_files_deferredmodule position duringcloud_final_modules`.

I'll follow up tomorrow with a good cloud-config example to include

doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
Module ordering
---------------

The order of the different module "sections" is mostly unimportant and modules
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason key order doesn't matter is because cloud-config is a YAML dictionary and the processing of the YAML keys is not sorted when cloud-init processes the cloud-config user-data.

What I think we should probably capture here is that module run order is strictly defined by the order of the modules named in /etc/cloud/cloud.cfg in the filesystem as ordered list items under the keys cloud_init_modules, cloud_config_modules and cloud_final_modules.

This order is what cloud-init follows when trying to serialize the order each module is run.

That cloud.cfg file looks something like:

# The modules that run in the 'init' stage
cloud_init_modules:
  - seed_random
  - bootcmd
  - write_files
  - growpart
...
# The modules that run in the 'config' stage
cloud_config_modules:
  - wireguard
  - snap
  - ubuntu_autoinstall
  - ssh_import_id
  - keyboard
....
# The modules that run in the 'final' stage
cloud_final_modules:
  - package_update_upgrade_install
  - fan
  - landscape
  - lxd
...

So, bootcmd will always run before the write_files module even if cloud-config user-data looks like this. Which means that the final content written to /testing file will be write files was here

#cloud-config
write_files:
- path: /testing
  content: write files was here
bootcmd: [echo "bootcmd" > /testing]
Suggested change
The order of the different module "sections" is mostly unimportant and modules
The order of the ``cloud-config`` keys is unimportant and

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chad makes a good point. I think the issue here is that there's really two different concepts that need to be explained independently. The first is the #cloud-config, which is the declarative yaml used to describe how to setup your system, but the second is how that config gets translated, which is a bit more "messy" in the these various modules run a specified order and are not idempotent. /etc/cloud/cloud.cfg is a good way to see the module ordering, but I'm hesitant to call it out because we do not support reordering/removing modules, and showing the config will show users the place to reorder/remove modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's right not to go into that much detail on a page that's meant to be a friendly intro to the file and how it should be constructed. We may want to have some sort of an explanation of the translation, but it seems to me like an implementation detail that belongs more rightly in the dev docs. Wdyt?

doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
.. code-block:: yaml

#cloud-config
<put example cloud-config here>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be helpful to represent an example with runcmd: write_files and ssh_authorized_keys: chpasswd as they seem to be the most-searched keys in documentation. We can provide a working example that we can document here to detail the operations performed and expected config as a result.

doc/rtd/explanation/about-cloud-config.rst Show resolved Hide resolved
Copy link
Member

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments inline

doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/about-cloud-config.rst Outdated Show resolved Hide resolved
``groups`` section must go before ``users`` so that the groups are created
first.

Cloud-config for the live installer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would replace live with Ubuntu.

I don't love that we're so prominently calling out subiquity in upstream documentation that should be distro independent. Tbh, I think buried in https://cloudinit.readthedocs.io/en/latest/reference/faq.html#autoinstall-preruncmd-postruncmd is the right place for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but I'm trying to slim down the FAQ so I'd rather not put it there. Since it does represent a frequently asked question (or at least, something that does cause a lot of problems/consternation) I'm actually OK with putting it in the docs - although I'm also happy to have it at the end of the file so that it's less prominent, if you think that would be better?

doc/rtd/explanation/about-cloud-config.rst Show resolved Hide resolved
Module ordering
---------------

The order of the different module "sections" is mostly unimportant and modules
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chad makes a good point. I think the issue here is that there's really two different concepts that need to be explained independently. The first is the #cloud-config, which is the declarative yaml used to describe how to setup your system, but the second is how that config gets translated, which is a bit more "messy" in the these various modules run a specified order and are not idempotent. /etc/cloud/cloud.cfg is a good way to see the module ordering, but I'm hesitant to call it out because we do not support reordering/removing modules, and showing the config will show users the place to reorder/remove modules.

Copy link

github-actions bot commented Jul 5, 2024

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label Jul 5, 2024
@s-makin s-makin removed the stale-pr Pull request is stale; will be auto-closed soon label Jul 11, 2024
@blackboxsw blackboxsw added the documentation This Pull Request changes documentation label Jul 24, 2024
written in any order.

.. note::
The :ref:`Users and Groups <mod_cc_users_groups>` module is a special case
Copy link
Member

@holmanb holmanb Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this note?

There are other examples where ordering matters too - any place in the schema where a list is used rather than a k/v pair is likely to have an ordering effect (consider runcmd, for example). I'm not sure that enumerating all of the places that lists are used really makes sense. Maybe we just need to point out how lists and key/value pairs behave differently and that's all that we need?

For the special case where your cloud-config file will be consumed by the
Ubuntu live installer, you will need to include the ``autoinstall:`` top level
key. The presence of this key instructs cloud-init not to process
the user-data itself, but instead to pass it directly to the installer for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The presence of this key instructs cloud-init not to process
the user-data itself, but instead to pass it directly to the installer for
processing.

That's not true, is it? The presence of this key doesn't prevent cloud-init from processing the user-data.

Cloud-init just ignores keys nested under the autoinstall key.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good pt. cloud-init just ignores it, it doesn't take any action to "pass it to the installer"

timeout: 30
condition: True

Ubuntu installer cloud-config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this section makes sense in cloud-init's documentation about cloud-config. Cloud-init doesn't do anything special when autoinstall is included, it is just ignored (and subiquity does whatever it does with it later).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A counterpoint here is that we are getting lots of questions in IRC in cloud-init about this and a lot of keyword searches past 30 days: autoinstall (1 result) 136 searches. I'd rather have a link handy in our docs that says, not our deal so that hopefully it is discoverable before people ask in #cloud-init channel in IRC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a lot of keyword searches past 30 days: autoinstall (1 result) 136 searches. I'd rather have a link handy in our docs that says, not our deal so that hopefully it is discoverable before people ask in #cloud-init channel in IRC.

Why is is this page the right place for it? There are many things that one could say about cloud-config, but there is value in keeping it short and on-topic without cluttering it with random details. Also, go search autoinstall in the docs. I get exactly one hit. We don't need to say "not our deal" in two places, so I'd rather not add another unless we're planning on removing the existing one - and if we're going to relocate it lets put it where it belongs rather than from one section to the next. Your points make it sound like it is a frequently asked question, which is exactly the purpose of the page that that section is currently on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this doesn't belong here. I made a similar comment on an earlier version of this PR and would prefer to keep this info buried in the FAQ, but FAQs don't fit nicely into diataxis and are seen by some as an indication of bad documentation.

Could we keep it in the FAQ for now and perhaps grow a "Projects that use cloud-init" or similar page that point to other well known (and not just Canonical) projects that use cloud-init along with important things to be aware of?

@holmanb
Copy link
Member

holmanb commented Jul 24, 2024

Thanks @s-makin! This is a really helpful addition to the docs. I left some comments inline.

Add note about modules which do not declare activate on schema keys
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had some inline comments and suggestions about user-data that needs a fix here. Pushed them as a separate commit for review.

Comment on lines 92 to 101
# Configure storage
storage:
files:
- path: /etc/example_file.txt
content: |
Some text to be stored in the file
- path: /etc/example_script.txt
content: |
#!/bin/bash
echo "Some text to be run in the script"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't valid cloud-config schema in our example. If we wanted to emit files on disk we can use write_fiiles for which you already have an example. Also, setting up storage generally involves mounting disks or volumes in certain places and that would involve more complex than I think we want for a simple cut-n-paste example of a multi-faceted cloud-config user-data file. So, let's redact this content and deal with storage setup if we want more complex explanation and virtual machine setup to provide the doc-reader with an easy to use cloud-config user-data file.

doc/rtd/explanation/about-cloud-config.rst Show resolved Hide resolved
gecos: Example User
sudo: ['ALL=(ALL) NOPASSWD:ALL']
groups: sudo
home: /home/exampleuser
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
home: /home/exampleuser
homedir: /home/exampleuser

Copy link
Member

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments inline

***************************

The ``#cloud-config`` file is a type of user data that cloud-init can consume
to automatically set up various aspects of the system. It is the preferred way
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preferred by who? I don't think the answer is "users" as I remember a previous maintainer telling me that the vast majority of user data is just simple bash scripts. We could poll the maintainers, but I don't have any particular preference for cloud-config.

After the header, every aspect of the system's configuration is controlled
through specific cloud-init modules.

Each module is declared through the use of its **top-level key**, and the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we were this consistent, but not all modules have a single top-level key. E.g., the set passwords module reacts to the password, chpasswd, and ssh_pwauth keys.

Additionally, there are multiple modules that may react to the same key. E.g., write_files key is used by both the write_files and write_files_deferred module and hostname and fqdn keys are used by both the set_hostname and update_hostname modules.

Module ordering
---------------

The order of the ``cloud-config`` keys is unimportant and modules can be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to what I commented above, keys don't correspond 1:1 with modules, so I don't think it makes sense to talk about the cloud-config as if we're writing or specifying modules.

doc/rtd/explanation/about-cloud-config.rst Show resolved Hide resolved
timeout: 30
condition: True

Ubuntu installer cloud-config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this doesn't belong here. I made a similar comment on an earlier version of this PR and would prefer to keep this info buried in the FAQ, but FAQs don't fit nicely into diataxis and are seen by some as an indication of bad documentation.

Could we keep it in the FAQ for now and perhaps grow a "Projects that use cloud-init" or similar page that point to other well known (and not just Canonical) projects that use cloud-init along with important things to be aware of?

@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label Aug 9, 2024
@github-actions github-actions bot closed this Aug 17, 2024
@s-makin s-makin reopened this Aug 23, 2024
@blackboxsw blackboxsw removed the stale-pr Pull request is stale; will be auto-closed soon label Aug 23, 2024
@canonical canonical deleted a comment from github-actions bot Aug 23, 2024
@s-makin s-makin closed this Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation This Pull Request changes documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants