Hi,
I upgraded Zabbix a few days ago, from version 1.8.5 to 1.8.13.
Zabbix is hosted on a LAMP stack: Ubuntu 10.04 LTS, Apache2, MySQL and PHP5.
I performed the upgrade by the book (following both the official documentation and the how-to I wrote after installing the previous version).
After the installation, I ran a lot of tests, including server shutdowns, network failures, a VPN going down, a network broadcast storm, and SNMP alerts.
All the tests were OK, and Zabbix didn't falter at any moment.
The day after, everything went wrong: the web site was not available (Error 325), and far too many alerts were sent.
So I started to analyse the logs and found that some PHP modules had started to behave strangely and that there were gaps in the data transmission between the monitored servers and the Zabbix application.
Before the upgrade, I was testing communication between Zabbix and about 200 servers, with simple checks on the TSE port (3389), on which I built alerts such as:
{simplecheck:tcp,3389.last(0)}=0
These alerts let me be sure that I could remotely manage these servers (no need to explain how or why to you gentlemen, I guess).
But since the upgrade, I have been receiving far too many alerts that resolve instantly. When I look at the latest data, I can see "mini failures" in the results of the triggers (see attached file), and therefore many alerts are sent.
Since I may be paranoid, I double-checked by running hping against the same port from the Zabbix server toward one of the faulty servers, and of course Zabbix kept seeing problems while hping didn't notice anything wrong.
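For anyone who wants to reproduce this kind of cross-check without hping, here is a minimal sketch in Python. It is not what I actually ran: the function names are made up for illustration, and the address 192.0.2.10 is only a placeholder. It opens a plain TCP connection to the port several times and counts the brief failures that recover on their own, which is roughly what the "mini failures" look like in the latest data.

```python
import socket
import time


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def sample_port(host, port, attempts=10, interval=1.0, timeout=3.0):
    """Probe host:port several times; return a list of 1 (up) / 0 (down)."""
    results = []
    for i in range(attempts):
        results.append(1 if tcp_port_open(host, port, timeout) else 0)
        if i < attempts - 1:
            time.sleep(interval)
    return results


def count_short_failures(samples):
    """Count runs of 0 that have a 1 before and after them (failures that recover)."""
    count = 0
    in_failure = False
    seen_up = False
    for value in samples:
        if value == 1:
            if in_failure and seen_up:
                count += 1
            in_failure = False
            seen_up = True
        else:
            in_failure = True
    return count


if __name__ == "__main__":
    # Placeholder address (documentation range); replace with a monitored server.
    samples = sample_port("192.0.2.10", 3389, attempts=10, interval=1.0)
    print(samples, "short failures:", count_short_failures(samples))
```

If a script like this, run from the Zabbix server, never reports a failure while Zabbix does during the same period, that would point toward the Zabbix server process or its settings rather than the network.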
I concluded that the problem comes from the server hosting Zabbix, or from Zabbix itself, but I haven't been able to solve this mystery.
This spam is a pain: in the flow of alerts it is becoming difficult to tell the real alerts from the "fake" ones. My question is:
Do you have any idea why a simple upgrade would mess up PHP modules when all I did on the server was modify 3 directories:
/etc/zabbix
/var/www/zabbix and
/usr/local/zabbix
and why Zabbix would suddenly start working so weirdly when it has been reliable for so long?
Thank you in advance, and yes, I've checked the configuration files: I applied the same configuration that was already in place when I was using 1.8.5.
Kind Regards from Paris - France!