As in v1.3.4, in v1.3.5 the host status (key = status) is set to 2 (Unreachable) although the host is OK and the rest of the monitored parameters are all right.
1.3.5 wrong host status.
-
I was unable to reproduce this with the latest code. Perhaps it was already fixed as a side effect of other improvements. Let's wait for 1.3.6.
-
[1.3.6] Once host status = 2 (unreachable) it never changes again
Originally posted by Alexei:
I was unable to reproduce this with the latest code. Perhaps it was already fixed as a side effect of other improvements. Let's wait for 1.3.6.

Hi, this bug is still present in pre-1.3.6 (r4064), and it's a blocking bug because once the trigger is activated and an alert is sent, the server status never comes back to "OK".
Here is a full scenario:
- Install Zabbix 1.3.6 from scratch, clean database
- Create a template with a subset of items and triggers from Unix_t
- One of the items is "host status", key "status"
- One of the triggers is "Server {HOSTNAME} is unreachable", "{template_t:status.last(0)}=2"
- Create one server, add the template to it
In the Overview page, Server Status is always unknown, from the start. I checked the "history_uint" table with the item's id, and there is no data available for this item, even after running the system for 10 hours.
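To make the trigger condition above concrete, here is a minimal Python sketch of how an expression like `{template_t:status.last(0)}=2` could be read. It is only a model of the behaviour described in this post, not Zabbix code; the function name `trigger_state` and the state labels are made up for illustration.

```python
UNREACHABLE = 2  # value of the "status" item used in the trigger expression


def trigger_state(history_values):
    """Model of '{host:status.last(0)}=2' (hypothetical helper, not Zabbix code).

    history_values: recorded values of the "status" item, oldest first.
    """
    if not history_values:
        # No data recorded yet: the expression cannot be evaluated.
        return "UNKNOWN"
    last = history_values[-1]
    return "ON" if last == UNREACHABLE else "OFF"


print(trigger_state([]))         # UNKNOWN (no rows in history_uint)
print(trigger_state([2, 2, 2]))  # ON (only "unreachable" values recorded)
```

Under this reading, a trigger whose item never receives a new value can never leave the state it is in.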
Now I trigger an alert with the following scenario:
- Stop the zabbix agent
- The "Server {HOSTNAME} is unreachable" trigger is now ON
- An alert is sent by mail
- None of the other checks that depend on the Zabbix agent are performed anymore and no other trigger is activated. (I suppose this is because Zabbix relies on the 'status' item being OK before activating other triggers?)
- Wait 1 minute and start the zabbix agent again
- After a few seconds all the items get refreshed again
But:
- The "Server {HOSTNAME} is unreachable" trigger stays ON
- No alert is sent to say that the server is available again
- The server stays down in the Overview screen
I checked the "history_uint" table again, with the itemid of the "status" item for this host, and there are only 3 lines in the table:
Code:
itemid clock value
17561 1177661609 2
17561 1177661669 2
17561 1177661729 2
So, Zabbix added exactly 3 records saying the server is unreachable (because the zabbix agent was not running), at one-minute intervals, but nothing before and nothing after. Therefore the only recorded state is: unreachable, forever.
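As a quick check of the one-minute spacing, the clock values quoted above can be decoded in Python. This is just a sketch over the three rows from this post; the variable names are illustrative.

```python
from datetime import datetime, timezone

# (itemid, clock, value) rows copied from the history_uint output above
rows = [
    (17561, 1177661609, 2),
    (17561, 1177661669, 2),
    (17561, 1177661729, 2),
]

clocks = [clock for _, clock, _ in rows]
# Seconds between consecutive records
intervals = [b - a for a, b in zip(clocks, clocks[1:])]
print(intervals)  # [60, 60]

for _, clock, value in rows:
    # clock is a Unix timestamp; show it as a UTC date and time
    print(datetime.fromtimestamp(clock, tz=timezone.utc).isoformat(), value)
```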
The first bug is that the server "status" item is not recorded in the database when the server is OK.
The second bug which, I think, is a blocking bug, is that the trigger never goes OFF again and no alert is therefore sent.
Regards
-
The "status" item is not processed by the agent! It is a special key calculated internally by the ZABBIX server.
Please can you start with a fresh install of 1.3.7 and let me know if it works for you.
-
Hi,
I have the same problem with Zabbix 1.3.7 (rev 4114, I think): all the "host status" items are greyed out.
The database I use was created under 1.3.5. As I am in a pre-deployment phase (yes, I know 1.3.x is beta, but I'm anticipating your official release), I can't reinstall the whole database from scratch...
But one thing I did was to delete the "host status" item and the corresponding trigger, and to recreate them in a template. And still I can't get the host status right: there is no corresponding data in the "Monitoring > Latest data" screen.
I have a question: as the "status" is an internally calculated item, should it be created as a Zabbix agent, a Zabbix agent (active), a Simple Check, a Zabbix Internal, or as something else? As I deleted and recreated the item, I'm a bit lost.
Regards
-
Here's some more information on the definition of my "status" item, in the hope that it can help debugging the problem (zabbix 1.3.7 r4114):
Code:
mysql> select * from items where itemid = 17781
    -> \G
*************************** 1. row ***************************
itemid: 17781
type: 0
snmp_community: public
snmp_oid: interfaces.ifTable.ifEntry.ifInOctets.1
snmp_port: 161
hostid: 10012
description: Host status
key_: status
delay: 60
history: 7
trends: 365
nextcheck: 0
lastvalue: NULL
lastclock: NULL
prevvalue: NULL
status: 0
value_type: 3
trapper_hosts:
units:
multiplier: 0
delta: 0
prevorgvalue: NULL
snmpv3_securityname:
snmpv3_securitylevel: 0
snmpv3_authpassphrase:
snmpv3_privpassphrase:
formula: 1
error:
lastlogsize: 0
logtimefmt:
templateid: 17779
valuemapid: 2
delay_flex:
1 row in set (0.00 sec)
And there is no data for this item in the database:
Code:
mysql> select * from history_uint where itemid = 17781;
Empty set (0.00 sec)
Regards.
-
Same problem here.
We have 4 hosts running Zabbix agent 1.4 - those have "unknown" status.
And we have one machine which is still running the 1.1.7 agent; that one is reported correctly.
Server is 1.4, installed from scratch with a fresh MySQL database.
AND when I go to Configuration->Items->Host status, and click on "select" for the key, status is not even listed.
What info can I provide to hunt that one down?
EDIT: restarting the agent on all those hosts seems to help though.
Last edited by gimpel; 01-06-2007, 15:50.
-
Same problem here with Zabbix 1.4
First of all a note about our configuration:
We have heavily templated our zabbix usage such that no monitored host has any items of its own; instead all refer to items via their templates. Thus in the configuration->items screen for a host, one sees a list of items such as "Template_App_Foo:Foo is running" etc. [Obviously this doesn't apply to web monitoring items as they can't be templated].
The reason I mention this first is that I expect it is pertinent to the main problem we have....
Looking in the manual it says that the parameter 'status' is special, to quote:
Example 7 - Server is unreachable
{zabbix.zabbix.com:status.last(0)}=2
Note: The ‘status’ is a special parameter which is calculated if and only if corresponding host has at least one parameter for monitoring. See description of ‘status’ for more details.
All our "Host status" items, which are defined inside the template 'Template_Linux' and thus listed as 'Template_Linux:Host status' for our monitored hosts, always remain set to 'No data', which in turn prevents the main "Server is unreachable" trigger from working.
We've read up on the status parameter (see previous quote from manual) and it simply says as long as ONE parameter (item) is being returned for the host then this will be true as it means the host is up. However for us it remains stubbornly set as 'No data'.
We thought this may be related to our heavy use of templates so have removed the 'Host status' item from the Linux template and manually added it, as well as some other miscellaneous parameters, to one of our monitored hosts in the hope that this would let it start working again but to no avail.
So how can I have huge amounts of data coming back from my hosts while my 'Host status' item and, in turn, its 'status' key simply remain associated with 'no data'?
Has a fix been suggested by anyone?
-
The host status is updated ONLY on host status change. While the host is available the item doesn't get any data.
Available->Not available: Status gets one value
Not Available->Available: Status gets another value
Please, do not expect constant data flow for the status!
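A minimal sketch of this change-only behaviour, in Python. It is a simplified model of the explanation above, not the server's actual code: the function name is hypothetical, the value 0 for "available" is an assumption (this thread only confirms 2 = unreachable), and the host is assumed to start out available.

```python
AVAILABLE = 0    # assumption: this value is not stated in the thread
UNREACHABLE = 2  # value quoted earlier in the thread


def record_status(observations, initial=AVAILABLE):
    """Return the (clock, value) rows that would be stored.

    observations: (clock, status) pairs from successive availability checks.
    A row is stored only when the status differs from the previous one.
    """
    stored = []
    previous = initial
    for clock, status in observations:
        if status != previous:
            stored.append((clock, status))
            previous = status
    return stored


# Host stays up: nothing is stored, so the item shows no data.
print(record_status([(0, AVAILABLE), (60, AVAILABLE)]))  # []

# Host goes down at t=120 and comes back at t=240: two rows.
checks = [(0, AVAILABLE), (60, AVAILABLE), (120, UNREACHABLE),
          (180, UNREACHABLE), (240, AVAILABLE)]
print(record_status(checks))  # [(120, 2), (240, 0)]
```

In this model an empty history for a host that has always been available is expected, while a host that recovers should produce a second row.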
-
I have the same problem.
I have worked with zabbix since version 1.1 and have a lot of items and triggers, so I can't just restart from zero (moving to 1.4 and fixing all the problems with template linkage and everything else gave me a lot of extra hours of work over the last few days).
So I have now changed to version 1.4 and still have the problem with the availability status.
As you can see in the printscreen, there are switches marked "Not available" and "timeout while connecting" to port 161, but the template is not configured to scan all hosts on port 161.
Maybe it was at the beginning, when I created it, and some switches don't have SNMP configured, so the status was correct then. But when you fix it by disabling the check in the template and enabling it only on the hosts that need to be scanned, the status never changes for those that had the "Not available" tag.
But everything works, e-mails are sent when a host is down, etc.
I can only conclude this is a bug that has existed since 1.1.x, and I am not the only one, because EVEN if the host status changes, this doesn't change!
Hopefully it will be fixed, but otherwise I have to say that zabbix is very good and I like working with it!
So I also have to say thanks for such a good open-source program!
I voted for it on SourceForge!
grtz,
Ph.
Last edited by Philippe; 29-06-2007, 15:15.
-
Look at the configuration of your items. I'm pretty sure there is an item of type ZABBIX Agent and you have no ZABBIX agent running on the SNMP device.
-
Hello,
No, I don't.
I use templates, and these templates have SNMP configurations, but they are disabled and I enable them for the hosts where I need SNMP.
In the attachment (doc3) you can see 3 printscreens.
The first is a printscreen of 1 host (of many) which shows the error.
The second is a printscreen of the items of that host (where you can see that everything with SNMP is disabled, only simple ping is enabled).
The third is a printscreen of the template.
simpleping is a printscreen of the icmpping of that template.
grtz and thnx.
Philippe.
Last edited by Philippe; 02-07-2007, 11:26.
-
Hey folks,
I'm having a similar problem, though I think I have a fix.
Zabbix 1.4 with a mix of Unix and Windows hosts. It's only happening to linux hosts, but not all of them. I'm using only the Template_Linux with some of the triggers disabled, and a few triggers enabled/disabled depending on the host.
With the exception of "Server {HOSTNAME} which is unreachable", everything that should show up green does. The unreachable trigger is grey.
I've been able to fix it by shutting down the agent long enough for zabbix to realize that it's off and then starting the agent back up.
-Ken
-
I am having the same problem KevinM is describing. Only linux hosts are having this problem.
---"With the exception of "Server {HOSTNAME} which is unreachable" everything that should does show up green. The unreachable trigger is grey."---
Because of this they never alert in a down state when they are actually down.
The problem is it stays in an alert status and never returns to normal. Works just dandy on the windows hosts. Enters ON status when unavailable and switches to OFF status when available.
I'll try what Kevin suggests but hopefully there is a long term solution on the horizon.
*** Correction: trying Kevin's suggestion does not appear to fix this problem on my zabbix server. It has not been able to clear the "unreachable" OFF state on the linux servers for quite some time now. Anyone have any suggestions? Thanks.
Last edited by btriem; 08-08-2007, 15:44.