Reduce Downtimes in Multiserver Setups on Changes
Summary
The ISPConfig system cron job runs once per minute, exactly at second :00. The cron job handles all pending work and triggers all required service restarts and reloads. On a single-server setup this does not really matter: there is only one node, so a brief "downtime" during a service restart is unavoidable. On a mirrored multi-server setup, which may have been set up for high availability, the current behaviour can lead to a synchronous restart or reload of all affected services on all servers. For example, a change to a website tends to restart or reload the web server and PHP on every server at exactly the same time. This causes avoidable downtime, because all servers are "unavailable" simultaneously.
Spreading the execution time of the ISPConfig system cron job across the servers by just a few seconds would reduce or remove this issue.
Steps to reproduce
- Set up a multi-server setup
- Configure one or more servers as a mirror of another server
- Set up one website with web server + PHP on the mirrored server
- Save, then set up a ping check against the website on each server
- Make a change to the website's config
- Observe that all servers are unable to deliver the site at the same time
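The ping check from the steps above could look roughly like the following sketch. It is only an illustration: the function names and the per-server URLs are hypothetical, and any existing monitoring tool that probes each server individually would serve the same purpose.

```python
import urllib.request


def is_up(url: str, timeout: float = 1.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def simultaneous_outages(samples: list[dict[str, bool]]) -> list[int]:
    """Return the indices of the samples in which every server was down.

    Each sample maps a server name to the result of one probe round.
    """
    return [i for i, sample in enumerate(samples) if sample and not any(sample.values())]


def probe_round(urls: dict[str, str]) -> dict[str, bool]:
    """Probe every server once; `urls` maps a server name to its site URL."""
    return {name: is_up(url) for name, url in urls.items()}
```

Calling `probe_round` about once per second around the full minute while saving a change, and passing the collected rounds to `simultaneous_outages`, should show whether the servers really drop out together.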
Correct behaviour
The behaviour is correct. It could just be improved.
Environment
- Server OS + version: Alma 9.5
- ISPC: 3.2.12p1
- nginx: nginx/1.22.1
- PHP: 8.3
Proposed fix
Either in general or only when a multi-server setup is detected: randomize the second at which the ISPC system cron job executes, to a value between 0 and 10.
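A minimal sketch of the idea, not ISPConfig code: all function names below are hypothetical, and only the upper bound of 10 seconds comes from the proposal. It shows two possible variants: a fresh random delay on every run, and a delay derived from the hostname so that each server keeps the same offset from minute to minute.

```python
import hashlib
import random
import time

MAX_JITTER_SECONDS = 10  # upper bound taken from the proposal


def random_jitter(max_seconds: int = MAX_JITTER_SECONDS) -> float:
    """Return a random delay between 0 and max_seconds, drawn anew per run."""
    return random.uniform(0, max_seconds)


def hostname_jitter(hostname: str, max_seconds: int = MAX_JITTER_SECONDS) -> int:
    """Return a stable per-host delay in whole seconds, from 0 to max_seconds."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (max_seconds + 1)


def run_with_jitter(job, delay: float) -> None:
    """Sleep for `delay` seconds, then run the job (e.g. the cron run)."""
    time.sleep(delay)
    job()
```

Neither variant guarantees that two servers pick different seconds; both only make a simultaneous restart less likely. The hostname-based variant has the advantage that the offset between two given servers stays constant, so a collision is either always or never present and can be noticed once.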
Context
Especially when performing multiple changes to several websites, all sites tend to become unavailable repeatedly at the same time for a short period. In our HA setup, the health checks tend to notice this and kick some or all servers out of the load balancing, which rather defeats the purpose of HA ;-)