In the first part, we introduced the changes we made to our monitoring engine, Centreon Engine, and to Centreon Broker.
On top of these features, we also decided to change the following elements:
- timeout management
- polling system
- notification system of the scheduler
Better timeout management
The new version of Centreon will let us set different timeouts for different hosts and services. This parameter is now configured locally, on each resource. It is therefore possible to set a timeout of 40 seconds on some services and 5 seconds on others. This will help us fight latency issues caused by commands that take a long time to execute or even hang.
This parameter can be set on hosts, services and their templates. Such precise control of execution timeouts makes it possible to set a higher check frequency on some resources.
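To illustrate the idea of a local timeout with a fallback, here is a minimal Python sketch. It is not Centreon Engine code: the `timeout` key, the `GLOBAL_TIMEOUT` value and both function names are assumptions made up for this example. It only shows how a per-resource value could override a template value, which in turn overrides a global default, and how a check that exceeds its budget is cut off.

```python
import subprocess

# Hypothetical engine-wide default, in seconds (an assumption for this sketch).
GLOBAL_TIMEOUT = 30.0


def effective_timeout(resource, template=None):
    """Return the resource's own timeout, else its template's, else the global one."""
    for source in (resource, template or {}):
        if source.get("timeout") is not None:
            return float(source["timeout"])
    return GLOBAL_TIMEOUT


def run_check(command, timeout):
    """Run a check command and report TIMEOUT if it exceeds its time budget."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "TIMEOUT"
    return "OK" if result.returncode == 0 else "NOT OK"


# Two hypothetical services with very different time budgets.
slow_service = {"name": "backup-report", "timeout": 40}
fast_service = {"name": "ping"}           # inherits from its template
fast_template = {"timeout": 5}
```

With these definitions, `effective_timeout(slow_service)` gives 40 seconds while `effective_timeout(fast_service, fast_template)` gives 5 seconds, so a hanging fast check is stopped long before a legitimately slow one would be.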
Polling below 1 minute
Polling below 1 minute is now possible. Previous versions already allowed it, but the Nagios developers do not guarantee the reliability of the feature. To make it reliable, we had to implement the local timeout feature beforehand. Indeed, when a service is set to be checked every 5 seconds and at some point takes more than 5 seconds to respond, it causes latency, which strongly impacts the effectiveness of the monitoring platform. Setting a 5-second global timeout on all resources is not an option either, because some checks legitimately need more time. That is why we had to implement the local timeout first.
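The effect of a slow check on a 5-second interval can be shown with a small simulation. This is a deliberately simplified model, not the actual scheduler: it assumes a single check whose executions run one after another, and the function name `simulate_latency` is invented for this example.

```python
def simulate_latency(interval, durations):
    """Return how late each execution starts, in seconds.

    interval  -- seconds between two scheduled executions
    durations -- how long each successive execution actually takes
    """
    latencies = []
    free_at = 0.0  # time at which the previous execution finishes
    for i, duration in enumerate(durations):
        scheduled = i * interval
        start = max(scheduled, free_at)   # cannot start before the previous one ends
        latencies.append(start - scheduled)
        free_at = start + duration
    return latencies


# One execution out of five hangs for 12 seconds.
durations = [1, 12, 1, 1, 1]
without_timeout = simulate_latency(5, durations)
# -> [0, 0, 7, 3, 0]

# Same run with a 5-second local timeout cutting the slow execution short.
with_timeout = simulate_latency(5, [min(d, 5) for d in durations])
# -> [0, 0, 0, 0, 0]
```

In this model a single 12-second execution delays the next two checks by 7 and 3 seconds, while capping it at the interval length keeps every execution on schedule.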
All users who need high-frequency checks (under 1 minute) will be happy with this. However, it is important to keep an eye on your available storage space, as high-frequency checks lead to higher disk space requirements.
We are currently working on new database engines and an archiving strategy that allow better management of big data. But remember that there is no need to keep data you do not use.
If you want to keep 1 or 2 years of data, make good use of it and start using Centreon BI and its data warehouse!
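A quick back-of-the-envelope calculation gives an idea of the growth. The sketch below only counts stored data points; the size of a point on disk depends on the storage backend and is not estimated here. The function name and the one-metric-per-service assumption are ours.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def points_per_year(interval_seconds, metrics=1, days=365):
    """Number of data points produced by one service over a year."""
    return SECONDS_PER_DAY // interval_seconds * metrics * days


every_5_minutes = points_per_year(300)   # 105120 points
every_5_seconds = points_per_year(5)     # 6307200 points
growth_factor = every_5_seconds // every_5_minutes   # 60
```

Going from a 5-minute interval to a 5-second interval multiplies the number of points to store by 60 for each metric, which is why retention settings deserve a second look before enabling high-frequency checks widely.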
Extracting the notification system from the scheduler
We are working on extracting notifications from the engine's core.
Currently, notifications are managed by the scheduler with a blocking operation (again!). This means that while an alert is being sent, no other action can be performed. Imagine that you have to send many notifications: you would have many blocking operations.
To solve this problem, we decided to change the way it works: handling the notification process is now the job of Centreon Broker. We started preparing this before integrating the correlation system, which we have been working on for over a year now. After correlation, the next step is to send notifications (emails, SMS, SNMP traps, HTTP requests, etc.) when the correlator detects an incident; users are then notified even when they are not in front of their monitoring dashboard.
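The difference between sending a notification inside the scheduling loop and handing it off can be sketched in a few lines of Python. This is only an in-process analogy built on a queue and a worker thread; it says nothing about how Centreon Engine and Centreon Broker actually exchange data, and the names `start_notifier` and `on_alert` are hypothetical.

```python
import queue
import threading


def start_notifier(send, outbox):
    """Start a background worker that delivers queued notifications."""
    def worker():
        while True:
            message = outbox.get()
            if message is None:   # sentinel value: stop the worker
                break
            send(message)         # the slow part (mail, SMS, ...) happens here

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def on_alert(outbox, message):
    """Called from the scheduling loop: enqueue and return immediately."""
    outbox.put(message)
```

The scheduling loop only pays the cost of `outbox.put`, however slow `send` is, and notifications are still delivered in the order they were raised.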
Alerts can therefore be managed from a central point, the broker, and from pollers as well, since you may run brokers in specific areas (on pollers or in an external zone of your network). Notifications can be correlated, but this is not mandatory.
You will find the same notification rules (intervals, contacts, contact groups, statuses) but with greater capabilities. For example, we can get more information about dependencies and impacts. You can still receive non-correlated notifications, but the major advantage is the possibility of being notified about root causes only.
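As a rough illustration of what notifying on root causes only means, the sketch below filters a set of failing resources down to those whose own dependencies are all healthy. It is not the correlator's algorithm; the `depends_on` mapping, the resource names and the `root_causes` function are assumptions made up for this example.

```python
def root_causes(failing, depends_on):
    """Return the failing resources that have no failing dependency."""
    failing = set(failing)
    return sorted(
        name
        for name in failing
        if not any(parent in failing for parent in depends_on.get(name, ()))
    )


# Hypothetical topology: two servers behind a switch, itself behind a router.
depends_on = {
    "web-01": ["switch-a"],
    "db-01": ["switch-a"],
    "switch-a": ["router"],
}

to_notify = root_causes({"switch-a", "web-01", "db-01"}, depends_on)
# -> ["switch-a"]
```

Here a single notification about `switch-a` replaces three separate alerts, while a server failing on its own would still be reported as its own root cause.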
What about the release date?
These features are still under development and many changes are still on the way. We will have more information to share in a few weeks. We plan to have a stable version of Centreon 3 by the end of the year.
The engine part is the most advanced so far.
What do you think of these features?
We are available to answer all your questions and will pay close attention to your feedback.
If you want to try, validate, document or talk with us, please leave a comment or send an email to firstname.lastname@example.org