Introduce a class to represent the answer file #337

ehelms · 2021-07-19T12:40:47Z

Refactors code into an AnswerFile class for handling the loading of
and details of the parts of the answer file. This abstraction will
allow easier introduction of newer versions of the answer file.

README.md

lib/kafo/answer_file.rb

ehelms · 2021-07-21T01:40:32Z

README.md

+ String $puppet_class => Hash[
+ String $parameter => Enum[true, false, Hash[String, Variant[String, Boolean, Integer, Array, Hash]]]
+ ]
+]


I'm not sure this makes sense in this format. Or here. What I wanted to do was document the current schema to allow then to document a Version 2 schema. I considered doing this in a comment in the file itself.

Thoughts about using https://yardoc.org/ in the actual classes as well?

Also, it's not really valid. You can't use names and Enum only describes strings. really it's:

Hash[String, Variant[Boolean, Hash[String, Any]]]

I was trying to model the structure off of https://github.com/theforeman/puppet-pulpcore/blob/master/templates/apache-fragment.epp#L2

How do you think to use yarddoc? Just for documenting the method calls or somehow for this schema?

I was trying to model the structure off of https://github.com/theforeman/puppet-pulpcore/blob/master/templates/apache-fragment.epp#L2

That's a struct and that works different. With that you describe what's essentially a hash, but with explicit keys. See https://puppet.com/docs/puppet/6/lang_data_abstract.html#lang_data_abstract_flexible-struct-data-type

You could certainly use a Struct to describe the scenario file (where we have a limited number of fields, each with their own rules, but not the answers file.

How do you think to use yarddoc? Just for documenting the method calls or somehow for this schema?

I'd consider both. In the class documentation you can give describe the structure and give examples while also documenting the individual methods. That said, a short bit in README about the answers file can make sense. I guess it's a bit of both.

ekohl · 2021-07-21T08:35:02Z

lib/kafo/answer_file.rb

+ end
+
+ def parameters_for_class(puppet_class)
+ if @version == 1


I wonder if this is the right way. We can use inheritance so perhaps it's better to introduce Kafo::AnswerFile::V1 and Kafo::AnswerFile::V2 as concrete implementations. Then on the module you would have a Kafo::AnswerFile.load_answers(filename, version)

Yea, thats a fair idea. I was not a big fan of the if-else, but it was the crude way my brain got around the problem. I am bouncing back and forth between this and a yet unopened commit to try to balance the setup for V2 (ehelms@2b96d65).

Before a v2 we also need to introduce a migration DSL because now that also operates directly on the answers. Some things I thought about:

# Ensure a module is present in the answers def add_module(name, default=false) end # Ensure a module is absent in the answers def remove_module(name) end # Ensure a module is enabled in the answers def enable_module(name) end # Ensure a module is disabled in the answers def disable_module(name) end # Set an answer for a module's parameter. If a module is disabled, it will be enabled def set_answer(mod, parameter, value) end # Get an answer for a module's parameter. If a module is disabled or no answer for the parameter exists, it will return nil def get_answer(mod, parameter) end # Clear an answer for a module's parameter. If a module is disabled, it does nothing def unset_answer(mod, parameter) end # Migrate a module. The given block is only called if the module is enabled # @example # migrate_module('puppet') do |mod| # current_autosign = mod['autosign'] # unless current_autosign.is_a?(TrueClass) || current_autosign.is_a?(FalseClass) # mod.unset_answer('autosign') # end # end # # @example # migrate_module('foreman_proxy') do |mod| # mod.rename_parameter('puppetrun', 'puppet') # mod.rename_parameter('puppetrun_listen_on', 'puppet_listen_on') # end def migrate_module(mod, &block) end

I'm not entirely sure about the migrate_module name. I think the essence is that it operates on a module only if it's enabled and you get a module back which has some ways of expressing operations.

In theforeman/foreman-installer#700 I tried to see if I could express the migrations and it looks like it. It feels more natural to me. I skipped a few complexer ones and all katello/foreman-proxy-content ones so I may have missed some things.

I'll look this over. I wanted to make a note of something I wanted to discuss more as its something I find a bit odd and affects things. We draw some parallels to databases and migrations with the answer file yet we treat it like its both the end product and the baseline that migrations are layered on top of. I had been wondering if we ought to have this more formalized that really the answer file is the initial migration and migrations layer on top of that. I think this would mean we need a helper to calculate the answer file final state, but in a way we need that today as you can't see the affect of migrations either. This would also allow "squashing" migrations in the future similar to the Rails model.

I think i'll need this change in place so that I have the interface to build the DSL into.

lib/kafo/answer_file.rb

lib/kafo/configuration.rb

ekohl · 2021-07-21T08:47:28Z

test/kafo/answer_file_test.rb

+require 'test_helper'
+
+describe 'Kafo::AnswerFile' do
+ let(:dummy_logger) { DummyLogger.new }


Suggested change

let(:dummy_logger) { DummyLogger.new }

let(:logger) { DummyLogger.new }

test/kafo/answer_file_test.rb

ekohl · 2021-07-21T09:04:46Z

test/kafo/answer_file_test.rb

@@ -0,0 +1,47 @@
+require 'test_helper'
+
+describe 'Kafo::AnswerFile' do


Generally you actually use the class reference.

Suggested change

describe 'Kafo::AnswerFile' do

describe Kafo::AnswerFile do

ekohl · 2021-07-21T09:08:01Z

README.md

+ String $puppet_class => Hash[
+ String $parameter => Enum[true, false, Hash[String, Variant[String, Boolean, Integer, Array, Hash]]]
+ ]
+]


Thoughts about using https://yardoc.org/ in the actual classes as well?

Also, it's not really valid. You can't use names and Enum only describes strings. really it's:

Hash[String, Variant[Boolean, Hash[String, Any]]]

ehelms · 2021-07-21T17:36:54Z

I plan to open up this part of the discussion more broadly, but will put a note here as it helps build context. I want to set up a good way to encapsulate V1 vs V2 differences and provide a nice clean interface between them. An aspect worth considering is then how a project built on top of Kafo then declares and uses one format or the other. The options I thought of in my currently preferred order:

Projects define migrations that convert a V1 answer file into a V2
- This lets Kafo support either format and have to do no conversion itself or provide auto-conversion which could be error prone. Further, this keeps migrations working as they can apply to a V1 answer file and then do the conversion. This would require treating the answer file as the first migration and not a living document. That is, truly like a database.
- The downside is the user has to get the migration right versus Kafo providing a utility for conversion
Kafo performs manual conversion of V1 to V2
- A utility is provided that knows how to convert V1 to V2. Users still have to consider their migrations and what to do with them.
Automatic conversion of V1 to V2 by new Kafo major version.
- Then Kafo only supports one version, code is straight forward, Kafo handles conversion.
- New migration folder/format would be required and old migrations would be ignored

ekohl · 2021-07-21T18:46:41Z

In #337 (comment) I described a migration DSL which could be much easier to support both v1 and v2. If we require users of Kafo to use that DSL, an automated migration can be performed. Trying to support option 1 and 2 sounds hard (though not impossible).

Refactors code into an AnswerFile class for handling the loading of and details of the parts of the answer file. This abstraction will allow easier introduction of newer versions of the answer file.

wbclark

Could you briefly describe here which problems the V2 of the answers file schema is intended to solve?

Looking over the schema in ehelms@2b96d65 and DSL described in comments here + examples in theforeman/foreman-installer#700 , I can see that migrations become much easier to read and write without having to think about module: true vs. module: {} vs. module: {param: value}.

Do we still have the ability to differentiate new installs from upgrades, and how will that look going forward?

And what is the purpose of including and excluding params ? In particular, what does it mean to have module: { parameters: { myparam: myvalue, ... }, include: { myparam, ... } } (and how does the meaning change if that included param is moved to excluded, or neither included nor excluded?)

Does excluding a parameter provide the ability to revert to module default value without forgetting the prior custom value (a downside would be that it becomes more difficult for a human to read and understand configuration from an answers file, for example one collected in an sosreport), or would the module default value for the parameter then get recorded (which brings me back to the question, what does exclusion mean)?

ehelms · 2021-07-28T17:00:39Z

The reference design conversation for V2 format can be seen over here -- https://community.theforeman.org/t/defaulting-puppet-to-off-in-the-katello-scenario/14553

And what is the purpose of including and excluding params? In particular, what does it mean to havemodule: { parameters: { myparam: myvalue, ... }, include: { myparam, ... } }` (and how does the meaning change if that included param is moved to excluded, or neither included nor excluded?)

Think of inclusion and exclusion as a way of deciding what parameters you want to present to a user and which you do not. I think the most obvious use case is the exclude case where you do not want users changing a parameter that a puppet module happens to expose. This pairs down the size of help and reduces potential to break a scenario accidentally.

wbclark · 2021-07-28T17:41:42Z

Think of inclusion and exclusion as a way of deciding what parameters you want to present to a user and which you do not. I think the most obvious use case is the exclude case where you do not want users changing a parameter that a puppet module happens to expose. This pairs down the size of help and reduces potential to break a scenario accidentally.

Thank you.

In that case, I like the feature but disagree with the landing spot. The included/excluded parameters for a module should be in the scenario configuration rather than the answers file. Since hiding a parameter / forcing module default is more a feature of the scenario itself, that all installations of that scenario would have in common regardless of any choices the user makes in installation. The answers file should contain only that which might be reasonably configured by the end user / that which might vary between two installations of the same scenario.

ehelms · 2021-07-28T17:53:49Z

The answers file should contain only that which might be reasonably configured by the end user / that which might vary between two installations of the same scenario.

I see your point. Once thing to recognize about the move to V2 is that it aims to move away from users editing the file and it being treated like a system of record instead. In todays world, we present the answers file as something a user could hand edit, we also present the configuration file in the same way. In reality, we do not intend users to edit these files by hand because it can break the installation, it can skip over hooks and other validations.

If you look at the enabled field its a similar argument. In V2, we aim to support the notion of "you cannot disable this module". So while a user can say enable or disable a module, they should not be able to change whether a module can be disabled if it's been marked as such.

We should probably move this discussion off of this PR and either to the RFC or a new PR with the format.

wbclark · 2021-07-28T19:21:34Z

Once thing to recognize about the move to V2 is that it aims to move away from users editing the file and it being treated like a system of record instead.

Right. While there are cases - such as cloning to DR, test, or reproducer environments - where hand editing may still be necessary, I wouldn't insist on the change for that reason.

My argument is based more on logical separation of what is configuration for the scenario vs. what is configuration for the particular installation, which is still relevant in the paradigm of answers file as a system of record.

In particular, when an answers file is read by a human to understand and analyze the install configuration, it is nicer to keep these concepts totally separate. This might occur when reading an answers file provided by an end user (e.g. via an sosreport) for troubleshooting purposes.

Even in the case of an admin checking their own configuration ("what did I or my colleagues set as that param value again?") there are still advantages to reading them from the answers file. In the case of foreman-installer, the --help option does show you current values but it also shows a lot of other information, isn't formatted for that specific purpose, and can have a non-negligible load time.

We should probably move this discussion off of this PR and either to the RFC or a new PR with the format.

OK, agreed. I've commented on the RFC and will share my follow up thoughts there as well.

ehelms · 2021-07-28T19:51:58Z

In particular, when an answers file is read by a human to understand and analyze the install configuration, it is nicer to keep these concepts totally separate. This might occur when reading an answers file provided by an end user (e.g. via an sosreport) for troubleshooting purposes.

Even in the case of an admin checking their own configuration ("what did I or my colleagues set as that param value again?") there are still advantages to reading them from the answers file. In the case of foreman-installer, the --help option does show you current values but it also shows a lot of other information, isn't formatted for that specific purpose, and can have a non-negligible load time.

We could (and should) provide a command that prints the answers out in a user friendly way. That would both provide an interface for backup and restore, and help keep users from the concerns you raise.

Right. While there are cases - such as cloning to DR, test, or reproducer environments - where hand editing may still be necessary, I wouldn't insist on the change for that reason.

If we have parameters for everything, I would argue this should never be needed. And if it is, then we should be re-assessing why as that is a likely indicator of something being missed. Similar to the "export" or "print" command mentioned above, we can add the ability to import a set or subset of answers via a file. I feel this is a better way to handle customization. And would match something like the Ansible model.

TLDR; We should do a better job aiming at the separation of the user interface and the data storage.

wbclark · 2021-07-28T20:28:46Z

If we have parameters for everything, I would argue this should never be needed. And if it is, then we should be re-assessing why as that is a likely indicator of something being missed.

Speaking from experience building a few reproducers from end-user data, the awful workflow goes something like this:

obtain complete backup from end user.
setup reproducer target and take snapshot after installing RPMs.
extract directory, grab the answers file, and copy that to the target machine.
try to install on the target machine, get some error like name resolution (for obvious reasons). either edit /etc/hosts to workaround this OR
edit answers file and try again. get some other stupid error, such as end user was configured to use external DB, or some issue with some plugin that I know is unrelated to the thing that I care about reproducing.
make more hacks to answers file and try again. repeat as needed (for enterprise users with highly customized installs, probably several times).
finally when the installer will run without errors, reset my target machine to pre-installer snapshot.
run the installer with the hacked answers to confirm it works the first time. after confirmation, reset target machine to pre-installer snapshot once again.
take the working answers file and re-tar it with all the other contents of the original backup.
run the clone script with the hacked backup tar.

I probably got some details wrong as it's been around 2 years since I've done this (I don't recall whether the clone script runs the installer first with default answers, and then again with answers from the backup. I think that may be the case, and that introduces additional complexity, but I'm not positive).

I'm pretty sure Devendra spends a LOT of time day to day on this general process, so he'd likely be a great person to talk to for further feedback on it.

theforeman-bot added Needs testing Not yet reviewed labels Jul 19, 2021

ehelms force-pushed the refactor-answer-file branch 3 times, most recently from bfa8726 to 90c1cda Compare July 20, 2021 02:48

ehelms commented Jul 20, 2021

View reviewed changes

README.md Show resolved Hide resolved

ehelms commented Jul 20, 2021

View reviewed changes

lib/kafo/answer_file.rb Outdated Show resolved Hide resolved

ehelms commented Jul 20, 2021

View reviewed changes

lib/kafo/answer_file.rb Outdated Show resolved Hide resolved

ehelms commented Jul 20, 2021

View reviewed changes

lib/kafo/answer_file.rb Outdated Show resolved Hide resolved

ehelms force-pushed the refactor-answer-file branch 2 times, most recently from 8aca687 to cbd0679 Compare July 21, 2021 01:39

ehelms commented Jul 21, 2021

View reviewed changes

ehelms force-pushed the refactor-answer-file branch from cbd0679 to e4000f6 Compare July 21, 2021 01:46

ekohl reviewed Jul 21, 2021

View reviewed changes

ekohl mentioned this pull request Jul 21, 2021

Use Kafo's migration DSL theforeman/foreman-installer#700

Draft

Introduce a class to represent the answer file

a9f8d92

Refactors code into an AnswerFile class for handling the loading of and details of the parts of the answer file. This abstraction will allow easier introduction of newer versions of the answer file.

ehelms force-pushed the refactor-answer-file branch from e4000f6 to a9f8d92 Compare July 28, 2021 12:29

wbclark reviewed Jul 28, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce a class to represent the answer file #337

Introduce a class to represent the answer file #337

ehelms commented Jul 19, 2021

ehelms Jul 21, 2021

ekohl Jul 21, 2021

ehelms Jul 21, 2021

ehelms Jul 21, 2021

ekohl Jul 21, 2021

ekohl Jul 21, 2021

ehelms Jul 21, 2021

ekohl Jul 21, 2021

ehelms Jul 21, 2021

ehelms Jul 28, 2021

ekohl Jul 21, 2021

ekohl Jul 21, 2021

ekohl Jul 21, 2021

ehelms commented Jul 21, 2021

ekohl commented Jul 21, 2021

wbclark left a comment •

edited

Loading

ehelms commented Jul 28, 2021

wbclark commented Jul 28, 2021

ehelms commented Jul 28, 2021

wbclark commented Jul 28, 2021

ehelms commented Jul 28, 2021

wbclark commented Jul 28, 2021

	let(:dummy_logger) { DummyLogger.new }
	let(:logger) { DummyLogger.new }

		@@ -0,0 +1,47 @@
		require 'test_helper'

		describe 'Kafo::AnswerFile' do

Introduce a class to represent the answer file #337

Are you sure you want to change the base?

Introduce a class to represent the answer file #337

Conversation

ehelms commented Jul 19, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ehelms commented Jul 21, 2021

ekohl commented Jul 21, 2021

wbclark left a comment • edited Loading

Choose a reason for hiding this comment

ehelms commented Jul 28, 2021

wbclark commented Jul 28, 2021

ehelms commented Jul 28, 2021

wbclark commented Jul 28, 2021

ehelms commented Jul 28, 2021

wbclark commented Jul 28, 2021

wbclark left a comment •

edited

Loading