For nearly half a year, the checkmk_server role has supported automated setup of distributed monitoring sites. I run this role extensively in a large environment with two master sites, each having multiple slave sites and several hundred monitoring targets, each with anywhere from a few dozen to a few hundred service checks. Although the role already helps a lot with adding new targets and generating monitoring rules, the distributed multisite setup is still a bit clumsy. Adding a new slave site in particular requires a lot of manual definition work (in checkmk_server__distributed_sites), which is error-prone and requires a good understanding of the Check_MK and Ansible role internals. It also doesn't help that I haven't documented it properly yet, as I was always looking for a way to simplify the configuration.
Furthermore, there are some limitations which I documented in the following issues:
I also plan some extensions, such as #42 (Support stunnel for protecting livestatus queries) or automated setup of multiple sites per server, which are simply not possible with the current role layout.
TODO
All this made me think that I need a better way to set up distributed sites, and I came up with the following idea:
Instead of attaching the slave site setup to the monitoring server running the site, I plan to move the logic to the master site setup. This way, a distributed setup is configured and pushed from a single configuration target (the master server), and it's much easier to pass all required information to a slave site.
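To make the idea more concrete, a master-driven definition might look roughly like the sketch below. This is only an illustration of the proposed layout: the variable names (example_master__site, example_master__slave_sites) and host names are hypothetical placeholders and do not exist in the current role.

```yaml
# Hypothetical host_vars for the master server. None of these variable
# names are part of the checkmk_server role; they only illustrate the idea
# of describing all slave sites in one place on the master.
example_master__site: 'master1'

example_master__slave_sites:
  # Each entry names a slave site and the server it runs on (assumed keys).
  - name: 'slave1'
    host: 'mon-slave1.example.org'
  - name: 'slave2'
    host: 'mon-slave2.example.org'
```

With a layout along these lines, adding a slave site would mean adding one list entry on the master, rather than repeating matching definitions on every involved server.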
In the coming weeks I will try to change the role logic as proposed. This should not only make it easier to fix the issues mentioned above, but hopefully also make it easier to implement support for multiple monitoring sites on the same server (which currently has to be done manually).