We have a server which experienced network problems. This resulted in Zabbix unreachable events. Those events were correct.
The problems were fixed a few days ago, but Zabbix still reports the server as unreachable several times a day. For instance, according to Zabbix the server was down for over an hour today, although it was working just fine.
We are using the (standard) trigger status.min(30)=2
The client was running 1.4.1 (FreeBSD port). Now it's on 1.4.2. No difference.
Zabbix server log:
11097:20071127:130321 Host [SERVER]: first network error, wait for 15 seconds
11097:20071127:130321 Parameter [system.cpu.load[,avg1]] will be checked after 20 seconds on host [SERVER]
11100:20071127:130324 Host [SERVER]: first network error, wait for 15 seconds
11100:20071127:130324 Parameter [vfs.fs.size[/opt,used]] will be checked after 120 seconds on host [SERVER]
11102:20071127:130325 Host [SERVER]: first network error, wait for 15 seconds
11102:20071127:130325 Parameter [system.cpu.load[,avg15]] will be checked after 80 seconds on host [SERVER]
11099:20071127:130326 Host [SERVER]: first network error, wait for 15 seconds
11099:20071127:130326 Parameter [system.cpu.load[,avg5]] will be checked after 40 seconds on host [SERVER]
11132:20071127:130349 Host [SERVER]: another network error, wait for 15 seconds
11132:20071127:130409 Host [SERVER]: another network error, wait for 15 seconds
11132:20071127:130429 Host [SERVER] will be checked after 60 seconds
11132:20071127:130535 Host [SERVER] will be checked after 60 seconds
11132:20071127:130641 Host [SERVER] will be checked after 60 seconds
....... all the same lines .....
11132:20071127:143706 Host [SERVER] will be checked after 60 seconds
11132:20071127:143812 Host [SERVER] will be checked after 60 seconds
11132:20071127:143912 Enabling host [SERVER]
It looks like we still have some network issues, but Zabbix also seems to stop trying to reach the host, which isn't supposed to happen.
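To put a number on how long Zabbix treated the host as unreachable, here is a rough Python sketch that measures the time between a "first network error" line and the next "Enabling host" line. It assumes the log line format is `PID:YYYYMMDD:HHMMSS message`, which is only inferred from the excerpt above; the function names `parse_line` and `unreachable_windows` are made up for illustration and are not part of Zabbix.

```python
import re
from datetime import datetime

# Assumed line format, inferred from the log excerpt: "PID:YYYYMMDD:HHMMSS message"
LINE_RE = re.compile(r"^(\d+):(\d{8}):(\d{6}) (.*)$")


def parse_line(line):
    """Return (pid, timestamp, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    pid, date, time, message = match.groups()
    return int(pid), datetime.strptime(date + time, "%Y%m%d%H%M%S"), message


def unreachable_windows(lines, host):
    """Pair each 'first network error' with the next 'Enabling host' for one host."""
    first_error = f"Host [{host}]: first network error"
    enabling = f"Enabling host [{host}]"
    start = None
    windows = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip wrapped or elided lines
        _, timestamp, message = parsed
        if start is None and message.startswith(first_error):
            start = timestamp
        elif start is not None and message.startswith(enabling):
            windows.append((start, timestamp, timestamp - start))
            start = None
    return windows


# A few lines copied from the log above.
sample = [
    "11097:20071127:130321 Host [SERVER]: first network error, wait for 15 seconds",
    "11132:20071127:130349 Host [SERVER]: another network error, wait for 15 seconds",
    "11132:20071127:130429 Host [SERVER] will be checked after 60 seconds",
    "....... all the same lines .....",
    "11132:20071127:143812 Host [SERVER] will be checked after 60 seconds",
    "11132:20071127:143912 Enabling host [SERVER]",
]

windows = unreachable_windows(sample, "SERVER")
for start, end, length in windows:
    print(f"{start} -> {end} ({length})")
# prints: 2007-11-27 13:03:21 -> 2007-11-27 14:39:12 (1:35:51)
```

For the excerpt above this gives a window of roughly an hour and a half, which matches the "down for over an hour" that Zabbix reported.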