Reborn in 2020 #26

Open. 187 commits to be merged into base: master

@zolfariot zolfariot commented Mar 30, 2020

Unavoidable updates to make everything work in the new decade.

I think a PR may be the right place for comments and decisions.

At the moment I'm testing roles with ansible==2.9.6.
If there is any reason to keep backward compatibility with older Ansible versions, please let me know.

The following automated vulnerability PRs are also closed and superseded:
#23 #24 #25

Current status

Roles marked as Need review or New have already been tested by the assignee on a staging environment.

| Type | Name | Status | Assignee | Refs |
|---|---|---|---|---|
| plugin | connection/lxc_ssh | ❌ Deleted | | |
| plugin | connection/ssh_lxc | ⭐ New | @zolfariot | #26 (comment) |
| module | ssh_cert | ✅ Need review | @zolfariot | c760358, 1ca9f81. |
| module | gen_password | ✅ Need review | @zolfariot | 9e72163. |
| module | cert_request | ✅ Need review | @zolfariot | 9a24a52. |
| module | uci | ⭐ New | @zolfariot | cabcf49. |
| module | occ | ⭐ New | @zolfariot | For Nextcloud. |
| core role | ca_cert | ⭐ New | @zolfariot | #26 (comment) |
| core role | service | ✅ Need review | @zolfariot | #26 (comment) |
| core role | nginx | ✅ Need review | @zolfariot | #26 (comment) |
| role | openvpn | ✅ Need review | @zolfariot | 392edde, 33b61bf. |
| role | lxc_guest | ✅ Need review | @zolfariot | #26 (comment) |
| role | ssh_server | ✅ Need review | @zolfariot | #26 (comment) |
| role | ldap | ✅ Need review | @zolfariot | #26 (comment) |
| role | gitlab | ✅ Need review | @zolfariot | #26 (comment) |
| role | nextcloud | ⭐ New | @zolfariot | c967ffd. |
| role | reverse_proxy | ✅ Need review | @zolfariot | |
| role | port_forwarding | ⭐ New | @zolfariot | 24aa112. |
| role | coturn | ⭐ New | @zolfariot | bf39363. |
| role | matrix-synapse | ✅ Need review | @zolfariot | #26 (comment) |
| role | riot-im | ✅ Need review | @zolfariot | b0f9c97. |
| role | icinga2 | ✅ Need review | @zolfariot | #26 (comment) |
| role | icinga2-monitoring | ✅ Need review | @zolfariot | #26 (comment) |
| playbook | prepare_host.yaml | ✅ Need review | @zolfariot | #26 (comment) |
| playbook | prepare_lxc_guest.yaml | ⭐ New | @zolfariot | #26 (comment) |
| role | ca | | | |
| role | dns_record | | | |
| role | dokuwiki | | | |
| role | dovecot | | | |
| role | exim4 | | | |
| role | fail2ban | | | |
| role | kodi-repository | | | |
| role | mailman3 | | | |
| role | mattermost | | | |
| role | mysql | | | |
| role | pam-ldap | | | |
| role | postfix | | | |
| role | postgresql | | | |
| role | roundcube | | | |
| role | sympa | | | |
| role | trakt | | | |
| role | webdav | | | |
| role | wordpress | | | |

Guidelines for refactored roles

Variables

  • Role-specific variables should be prefixed with <role_name>_
  • For each role, defaults/main.yaml should give an overview of all variables used by the role, with reasonable and working defaults.
  • Roles are allowed to share a variable namespace only if it is safe to assume that the variable will be the same across the whole site (e.g. ldap_server, ldap_domain, coturn_fqdn).
  • Role defaults can reference site-wide defaults defined in group_vars/all.yaml or in the inventory. In these cases group_vars/all.yaml.example and hosts.example must be updated accordingly.
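
The prefix rule in the guidelines above can be checked mechanically. The following is a minimal sketch, not part of this PR: the function name, the `listen_port` variable, and the idea of passing an explicit set of shared site-wide names are all assumptions made for illustration. It also assumes that a hyphen in a role name (e.g. `matrix-synapse`) maps to an underscore in the variable prefix.

```python
from typing import AbstractSet, Iterable, List


def unprefixed_variables(
    role_name: str,
    variable_names: Iterable[str],
    shared: AbstractSet[str] = frozenset(),
) -> List[str]:
    """Return the variable names that break the <role_name>_ prefix rule.

    `shared` lists site-wide names (such as ldap_server) that roles are
    allowed to use without a prefix.
    """
    # Assumption: hyphens in a role name become underscores in the prefix.
    prefix = role_name.replace("-", "_") + "_"
    return [
        name
        for name in variable_names
        if not name.startswith(prefix) and name not in shared
    ]


if __name__ == "__main__":
    # `listen_port` is a hypothetical variable used only for this example.
    names = ["nginx_proxy_protocol", "ldap_server", "listen_port"]
    print(unprefixed_variables("nginx", names, shared={"ldap_server"}))
```

A check like this could be run over the keys of each role's `defaults/main.yaml` once they are loaded into a list.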

First step to develop a new cleaner ssh_lxc interface.

Here we clone `ansible/plugins/connection/ssh.py` from Ansible version
2.9.6.

It will be adapted to use `lxc-attach` on the target host.
Modification of the stock connection plugin ssh.py to use lxc-attach on
the target host.

We replace any `<cmd>` with
`lxc-attach -n <container_name> /bin/sh -c '<cmd>'`
before sending it through the ssh connection.
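
As a rough illustration of the wrapping described above, here is a minimal sketch. It is not the plugin's actual code: the function name is hypothetical, and the use of `shlex.quote` for escaping is an assumption about how the quoting could be done safely.

```python
import shlex


def wrap_for_lxc(container_name: str, cmd: str) -> str:
    """Wrap `cmd` so that it runs inside the container via lxc-attach.

    The result follows the form given in the commit message:
    lxc-attach -n <container_name> /bin/sh -c '<cmd>'
    """
    return "lxc-attach -n {} /bin/sh -c {}".format(
        shlex.quote(container_name),
        shlex.quote(cmd),  # keeps quotes inside cmd from breaking the wrapper
    )


if __name__ == "__main__":
    # "web01" is a hypothetical container name.
    print(wrap_for_lxc("web01", "ls /"))
```

Quoting the inner command matters because it often contains quotes of its own, which would otherwise terminate the `-c` argument early.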

Based on the original idea of *Pierre Chifflier*, available on [GitHub].

The container name should be passed as the `ansible_ssh_lxc_name`
variable.
The `ansible_docker_extra_args` variable still works, for backward
compatibility.

ToDo: The docstrings need to be updated, they are still mostly the ones
from ssh.py connection plugin.

We figured out the proper method to access inventory variables (see
README.md in [GitHub]): they need to be properly specified inside the
DOCUMENTATION of the Connection [1], and then they can be obtained with the
`Plugin.get_option()` method. That method should not be called in the
`__init__()`, because options are not yet initialized. Calling it in
`_connect()` returned the correct option.
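
The ordering problem described above can be shown with a toy model. The classes below are stand-ins written for this illustration and are not Ansible's real plugin classes. The option name `lxc_name` is hypothetical. The only behaviour modelled is that options are populated after the object has been constructed.

```python
class ToyPlugin:
    """Stand-in for a plugin base class; NOT Ansible's implementation."""

    def __init__(self):
        self._options = None  # filled in later, after construction

    def set_options(self, options):
        self._options = dict(options)

    def get_option(self, name):
        if self._options is None:
            raise RuntimeError("options are not initialized yet")
        return self._options[name]


class ToyConnection(ToyPlugin):
    def __init__(self):
        super().__init__()
        # Calling self.get_option("lxc_name") here would raise, because
        # set_options() has not run yet at construction time.
        self.container_name = None

    def _connect(self):
        # By the time a connection is opened, the options are available.
        self.container_name = self.get_option("lxc_name")
        return self


if __name__ == "__main__":
    conn = ToyConnection()
    conn.set_options({"lxc_name": "web01"})  # hypothetical container name
    print(conn._connect().container_name)
```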

[GitHub]: https://github.com/chifflier/ansible-lxc-ssh
[1]: https://docs.ansible.com/ansible/2.9/dev_guide/developing_plugins.html
`lxc-ssh.py` removed.

All playbooks now use the `ssh_lxc` connection.

The `ansible_ssh_lxc_name` variable is used to specify the container name.

Tested and worked correctly with `python==3.8.2` and `ansible==2.9.6` on the
controller and `python==2.7` on the target.
`python3` and `python3-lxc` are installed with apt instead of `python`
and `python-lxc`.
The role was still referencing the older `ansible_docker_extra_args`
variable.

Replaced with `ansible_ssh_lxc_name`.
Mainly string vs bytes-string issues.

Compatibility with Python 2.x now broken.
MIGRATION.md contains a table to track the revision and update process
of each module.

Maybe it would have been better to use a GitHub Issue?
- New apt multipackage style

- Tabulation in `templates/interfaces.j2` and in the `/etc/lvm/lvm.conf`
  line fixed: in Debian buster, tabs are used to indent these config
  files by default.
Add support for OpenSSH v8 (the output of `ssh-keygen` changed slightly) in
module `ssh_cert` and use a better implementation for multiple User CAs.

We now read user_ca from `group_vars/all.yaml`.
`user_ca_keys` should be a list of all the User CAs allowed on a host
(this makes it easier to rotate CAs without reissuing keys to every user at
the same time).
The production CA must be the first one in the list. Host certificates
are checked only against the first CA, and are updated if their host key
was issued by another CA in the list.

For this reason now we are using a template to create
`/etc/ssh/user_ca.pub` on the target, to preserve the key order.
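
To make the ordering rule concrete, here is a minimal sketch. It is not the role's template or the `ssh_cert` module: both function names are hypothetical, the key strings in the example are placeholders, and treating any CA other than the first as a reason to update is an assumption drawn from the description above.

```python
from typing import List


def render_user_ca_pub(user_ca_keys: List[str]) -> str:
    """Render the CA public keys one per line, preserving list order."""
    return "".join(key.strip() + "\n" for key in user_ca_keys)


def cert_needs_update(signing_ca_key: str, user_ca_keys: List[str]) -> bool:
    """Return True when a certificate was not issued by the production CA.

    Assumption: the production CA is user_ca_keys[0], and a certificate
    signed by any other CA should be reissued.
    """
    return signing_ca_key.strip() != user_ca_keys[0].strip()


if __name__ == "__main__":
    # Placeholder key material, for illustration only.
    keys = ["ssh-ed25519 AAAA-new ca-new", "ssh-ed25519 AAAA-old ca-old"]
    print(render_user_ca_pub(keys), end="")
    print(cert_needs_update("ssh-ed25519 AAAA-old ca-old", keys))
```

Keeping the list ordered is what lets the first entry act as the production CA, which is why a plain unordered copy of the keys would not be enough.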

`group_vars/all.yaml.example` has been updated to reflect the new usage.
This connection can now (also) be used by directly indicating the LXC
container as the target (or delegated host), if the variables
`ansible_lxc_host` and `ansible_lxc_name` are provided, either in the
inventory, role or task.

`ansible_lxc_host` is the inventory hostname of the physical host
running LXC.

`ansible_lxc_name` is the container name.

File `hosts.example` is provided to show how these variables can be set
up in an inventory.
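
How the two variables combine can be sketched as a small lookup. This is only an illustration, not the plugin's logic: the function name, the returned keys, and the host names `ldap1` and `phys1` are hypothetical, and host variables are modelled as plain dictionaries.

```python
from typing import Dict, Mapping


def connection_plan(
    inventory_hostname: str,
    hostvars: Mapping[str, Mapping[str, str]],
) -> Dict[str, str]:
    """Work out where to SSH and which container to attach to."""
    host_vars = hostvars[inventory_hostname]
    return {
        # Physical host running LXC: the SSH connection goes here.
        "ssh_to": host_vars["ansible_lxc_host"],
        # Container name handed to lxc-attach on that host.
        "attach_to": host_vars["ansible_lxc_name"],
    }


if __name__ == "__main__":
    # Hypothetical inventory data for one container.
    example = {
        "ldap1": {"ansible_lxc_host": "phys1", "ansible_lxc_name": "ldap1"},
    }
    print(connection_plan("ldap1", example))
```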
Debian version upgraded to buster.

New templates compatible with the stable versions of LXC provided with
Debian.

Cleaner syntax using the `ssh_lxc.py` connection plugin. We no longer use
`lxc-attach -n ...` in the `shell` module on the host; instead we
delegate to `{{ vm_name }}` with `connection: ssh_lxc`, using suitable
Ansible modules to operate directly on the container before it is
online and accessible over SSH.

We added an option to force an LVM VG name: if the default naming
convention is not used, the VG name can be overridden with the `vg_name`
variable.

The `xfs` filesystem seems to be broken in this release, so we used
`ext4` as the default for new containers. This point needs further
investigation.
Now xfs is working: `prepare_host.yaml` is modified to add `xfs` to
the list of modules loaded at each boot.

If a module is added to that list, it is also loaded by a modprobe
handler.

If xfs is not working with `role/lxc_guest`, run the patched
`prepare_host.yaml` again.
Use `{{ domain }}` instead of the hardcoded lilik.it in resolv.conf.
The *host* hosting a specific container is no longer defined in the
playbook YAML file but centrally in the inventory, under the
`ansible_lxc_host` variable.

The `lxc_guest` role is run directly against the guest, even if it
doesn't exist yet, and LXC tasks are delegated to the physical host
running LXC.

This should make it easier to scale up and configure multiple
instances of a service on different containers without changing the
playbook.

Look at `/ldap.yaml` for a commented example.
Stripping newlines from the TLS Certificate Signing Request causes
ca_manager to fail.

We have to check whether SSH cert requests are still working.
- Tasks split into subfiles.
- Static slapd configuration (slapd.conf) moved *properly* to dynamic
conf (slapd.d).
- TLS Enabled by default, with certificate acquired using
  `ca_manager`.
- New default tree
- New default ACL
- Kerberos schema added
- {SSHA512} hash properly configured.
- Move to omnibus release, with NGINX and pgSQL managed by GitLab
configuration utilities.

- Move from anonymous to authenticated LDAP bind.
Replace .lilik.it with {{ domain }}
and minor improvements/refactoring.
Read the password from the config file instead of generating a new one every time.
Added newly required PHP packages.
Setting `reverse_proxy_proxy_protocol: true` and
`nginx_proxy_protocol: true` in the nginx roles enables forwarding of
the original connection address from the reverse_proxy to the target
nginx instance, using the established TCP PROXY protocol (which adds a
header to the TCP stream, so it also works for TLS connections that are
not terminated at the reverse proxy).

**Warning**

The `reverse_proxy_proxy_protocol` setting acts globally on the
reverse proxy nodes, so every virtual server on the reverse proxy must
accept and correctly handle PROXY protocol headers.

This setting must be the same for every host sharing the same reverse
proxy, otherwise the setting will be changed globally at every run.
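
For readers unfamiliar with the PROXY protocol referred to above, the sketch below builds the human-readable version 1 header that a proxy sends before any payload on a forwarded connection. It is an illustration only: the function name and the addresses are made up, and it says nothing about which protocol version these roles configure.

```python
import ipaddress


def proxy_v1_header(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Build a PROXY protocol version 1 header line.

    Layout: PROXY <TCP4|TCP6> <src addr> <dst addr> <src port> <dst port>
    terminated by CRLF.
    """
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if src.version == 4 else "TCP6"
    line = "PROXY {} {} {} {} {}\r\n".format(family, src, dst, src_port, dst_port)
    return line.encode("ascii")


if __name__ == "__main__":
    # Documentation-range addresses, used purely as an example.
    print(proxy_v1_header("203.0.113.7", 51234, "192.0.2.10", 443))
```

Because the header precedes the TLS handshake on the same TCP stream, a backend that does not expect it will treat it as garbage, which is why every virtual server behind the proxy has to agree on the setting.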
Increase size to 20GB, needed for federation database and media.
Default user for backup has been changed from `backup` to `borg`.

User `backup` is now a system user on Debian testing. After each upgrade
involving related packages (pam?) our `backup` user is overwritten by
the Debian system one.

Also the default repositories folder has been changed from
`/home/backup/repos` to `/home/borg/repos`.

To adapt our existing infrastructure, after moving all the repos for all
the servers, some metadata (probably the cache) needs to be updated.

This update is done automatically when creating a new archive if we set
the environment variable `BORG_RELOCATED_REPO_ACCESS_IS_OK` to `yes`.

Our backup script was adapted to set this env variable to `yes`; we then
ran a first backup on each host, and afterwards changed the env variable
in all backup scripts back to `no`.
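
The toggle described above could look like the following if the backup were driven from Python. This is a sketch under assumptions, not the actual backup script: the function names, the host name, and the archive name are hypothetical, and the example only builds the environment and the command line without running `borg`.

```python
import os
from typing import Dict, List, Mapping, Optional


def borg_environment(
    relocated_ok: bool,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the relocation answer set."""
    env = dict(os.environ if base is None else base)
    env["BORG_RELOCATED_REPO_ACCESS_IS_OK"] = "yes" if relocated_ok else "no"
    return env


def borg_create_command(repo: str, archive: str, paths: List[str]) -> List[str]:
    """Build a `borg create` command line for the given repo and paths."""
    return ["borg", "create", "{}::{}".format(repo, archive), *paths]


if __name__ == "__main__":
    # First run after the move: accept the relocated repository.
    env = borg_environment(relocated_ok=True)
    cmd = borg_create_command("/home/borg/repos/host1", "first-run", ["/etc"])
    print(env["BORG_RELOCATED_REPO_ACCESS_IS_OK"], cmd)
    # To actually run it: subprocess.run(cmd, env=env, check=True)
```

Switching the value back to `no` after the first run restores the safety check against a repository that has moved unexpectedly.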
Development

Successfully merging this pull request may close these issues.

Slapd SHA256 manager password