Centreon Engine in the Future: Simpler is Better.
In the first part of this article, we saw the history of Centreon Engine, from its inception as a replacement for Nagios to the mature software it is now. We saw some of the issues it has successfully addressed and the limits it hit while struggling with an aging codebase. The future called for improvements to Centreon Engine and, more precisely, for a diet.
In software, as in many things, less is often more. Less code means fewer bugs, a simpler codebase, and more room for improvements. The decision to break Nagios binary compatibility allowed us to clean up concepts that were quite outdated. The list of things removed in the next major milestone of Centreon Engine is as long as the list of new features provided by Centreon and, for a developer, arguably much more exciting.
We will list some of these simplifications here. Many more have been made.
Groups, but why?
An old concept dating back to Nagios is that of groups of objects of the same type. Called hostgroups, servicegroups, or contactgroups, they are a way to define an abstract cluster of elements. That seems useful, right? Unfortunately, the group mechanism has proved to be ill-defined and limited, so much so that it has been superseded by other, more expressive concepts.
A group can be used for two main purposes: as a configuration strategy, or as a tool to logically organize several closely tied objects. Group-based configuration has always been a source of errors and has been entirely deprecated in favor of a powerful multiple inheritance system.
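To illustrate how multiple inheritance can replace group-based configuration, here is a minimal Python sketch. It is not Centreon Engine's actual resolution code: the template names, the parameters, the `use` key, and the rule that the first listed parent takes precedence are all assumptions made for the example.

```python
from typing import Any, Dict

# Hypothetical templates; names and parameters are illustrative only.
TEMPLATES: Dict[str, Dict[str, Any]] = {
    "generic-host": {"use": [], "check_interval": 5, "max_check_attempts": 3},
    "linux-server": {"use": ["generic-host"], "check_command": "check_ssh"},
    "critical": {"use": ["generic-host"], "check_interval": 1},
    # A concrete host that inherits from two templates at once.
    "db01": {"use": ["critical", "linux-server"], "address": "192.0.2.10"},
}


def resolve(name: str, templates: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return the effective parameters of an object.

    Assumed precedence: the object's own values win, then its parents
    in the order they are listed (first listed parent wins).
    This sketch does not detect inheritance cycles.
    """
    obj = templates[name]
    effective: Dict[str, Any] = {}
    # Apply parents from last to first so the first one overrides the rest.
    for parent in reversed(obj.get("use", [])):
        effective.update(resolve(parent, templates))
    # The object's own parameters override everything inherited.
    effective.update({k: v for k, v in obj.items() if k != "use"})
    return effective


if __name__ == "__main__":
    print(resolve("db01", TEMPLATES))
```

Each host states what it inherits from, so its effective configuration can be computed from the host alone, with no need to look up which groups happen to contain it.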
Logical grouping is entirely out of the scope of a monitoring engine and should not be present in Centreon Engine at all. In fact, the next version of Centreon supports tags, which are a much more powerful way to do logical grouping.
Groups were a crutch that was becoming actively harmful to a healthy configuration. Centreon Engine 2 removes them and relies on the new Centreon concepts where they are needed.
Nagios notification scheduler
Nagios allowed notifications to be launched under certain conditions, typically a change of status. While this mechanism was useful for small, well-defined systems, it is considerably less useful in a modern world where notifications are both too slow for a split-second response and not powerful enough to express the full range of today’s issues.
Typically, Centreon Engine’s notifications can’t be correlated. They can’t be triggered when a specific metric exceeds a threshold. They are difficult and error-prone to configure correctly for a large number of users. Even more importantly, they are slow: Nagios’ (and Centreon Engine’s) notifications are sent in a blocking, synchronous manner, stopping the monitoring altogether until the notification has been sent.
To correct these problems, Centreon Engine 2 removes its notification subsystem and instead relies on the fast, asynchronous, high-level notification scheduler integrated into Centreon Broker. This scheduler is able to notify on a far broader range of conditions, and can be fine-tuned to allow greater control over notifications.
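The difference between blocking and asynchronous notification can be sketched in a few lines of Python. This is only an illustration of the general technique (a queue drained by a worker thread), not the implementation used by Centreon Engine or Centreon Broker. The names `send_notification` and `AsyncNotifier`, as well as the artificial delay, are assumptions made for the example.

```python
import queue
import threading
import time


def send_notification(message: str, delay: float = 0.05) -> str:
    """Stand-in for a slow delivery (mail, SMS, ...); the delay is artificial."""
    time.sleep(delay)
    return f"sent: {message}"


class AsyncNotifier:
    """Hypothetical notifier: messages are queued and sent by a worker thread."""

    def __init__(self) -> None:
        self._queue: queue.Queue = queue.Queue()
        self.delivered: list = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def notify(self, message: str) -> None:
        # Returns immediately; the monitoring loop is never blocked.
        self._queue.put(message)

    def _run(self) -> None:
        while True:
            message = self._queue.get()
            if message is None:  # sentinel: stop the worker
                break
            self.delivered.append(send_notification(message))

    def close(self) -> None:
        # Wait for every queued notification to be sent, then stop.
        self._queue.put(None)
        self._worker.join()


def run_checks_blocking(n: int) -> float:
    """Monitoring loop that waits for each notification to be sent."""
    start = time.monotonic()
    for i in range(n):
        send_notification(f"check {i} failed")
    return time.monotonic() - start


def run_checks_async(n: int, notifier: AsyncNotifier) -> float:
    """Monitoring loop that only enqueues notifications."""
    start = time.monotonic()
    for i in range(n):
        notifier.notify(f"check {i} failed")
    return time.monotonic() - start


if __name__ == "__main__":
    notifier = AsyncNotifier()
    print("async loop:   ", run_checks_async(5, notifier))
    print("blocking loop:", run_checks_blocking(5))
    notifier.close()
```

In the blocking loop, the time spent grows with the number of notifications. In the asynchronous loop, it stays close to zero because delivery happens elsewhere.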
The Parameter “Hell”
Do you know the total number of general configuration parameters that Centreon Engine 1.5 inherited from Nagios? 145! There are documented parameters for every nook and cranny of Centreon Engine, most of them unknown to and unused by the general public.
This would be less of a problem if Centreon Engine could not already generate perfectly good default values for those parameters. In many cases, carefully tuning those parameters results in worse performance than letting the daemon choose their values.
Centreon Engine 2 cleans up many of those parameters. Their number has shrunk from 145 to 63, and many object parameters have been cleaned up as well or are waiting their turn.
Centreon’s new agent
In the monitoring world, an agent is a standalone daemon that is installed on the machine to be monitored. It is a paradigm shift from a more centralized approach, where the daemon is installed on a central server and does remote monitoring. An agent solves a good number of problems inherent to the management of a large fleet of machines and has been used successfully by big names, from Amazon to Google.
When you remove all the outdated concepts in Centreon Engine, all that remains is a very good monitoring engine able to schedule checks and to send them to a remote server. Incidentally, this is the definition of a monitoring agent!
This long-term goal of Centreon Engine is getting closer and closer to completion. As an agent, Centreon Engine could be used to support new architectures or new methodologies. Distributing agents ‘in the cloud’ becomes possible inside customized machine images, making room for new, innovative paradigms like MaaS (Monitoring as a Service).
Centreon Engine 2 will never become exclusively a standalone agent. It can be configured either as an agent or as a monitoring engine, and both kinds of instances can be freely mixed in the same network. This will give us the flexibility of both approaches.
I hope this brief look into the development process was instructive. The history of Centreon Engine can be seen as a steady improvement over Nagios’ design. It is the building block of all the other technologies used by Centreon.
From those roots, work has been done to add auto-discovery modules, support for popular cloud computing services, system-wide log analyzers, and more. But those interesting parts couldn’t exist without the solid foundation Centreon Engine provides.
One of the most interesting pieces of software made possible by Centreon Engine’s existence is Centreon Broker, a distributed, high-availability, fast event manager. Centreon Broker is the cornerstone of Centreon’s architecture. Indeed, it is difficult to explain the many improvements made to Centreon Engine without also explaining the many improvements made to Centreon Broker. This will be the subject of our next article.