Apparent triple bug: "Cannot evaluate function" and trigger flips to "Unknown" and

  • otheus
    Member
    • Mar 2009
    • 53

    #1

    Note: Server version 2.2.7

    I am currently getting a VERY frustrating set of behaviors from Zabbix. A trigger will work perfectly one second and then, the next, flip to "Unknown" status with a corresponding error pop-up of "Cannot evaluate function". Unfortunately, I cannot tell which function it cannot evaluate, because the rollover pop-up is too small to display the whole error message; further, the server logs give no hint.

    Consider the following trigger (some values replaced for security reasons):
    Code:
    {hostname.our.org:log[/var/log/postgres/postgresql-today.log,"(PANIC|WARNING|ERROR|FATAL)"].nodata(3600)}=0
    This is meant to fire the trigger if the given logfile has produced data within the last hour. This seems to work well enough. Now we add...

    Code:
    <<first part>> & {hostname.our.org:log[/var/log/postgres/postgresql-today.log,"(PANIC|WARNING|ERROR|FATAL)"].regexp("FATAL",3600)}=1
    This is expected to mean (combined with the first expression): "if the postgresql-today log contains a logged error message sent within the last 3600 seconds and such a message contains the regexp-match FATAL, the trigger evaluates to true".
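The intended logic of the combined expression can be sketched in Python. This is a rough model of what the expression is expected to mean, not Zabbix's actual implementation: the in-memory `history` list, the `in_window` helper, and `trigger_fires` are illustrative names, and the assumption is that `regexp` with a time period looks at every value received in that period.

```python
import re
from typing import List, Tuple

# Each sample is (unix_timestamp, log_line). This list stands in for the
# item history and is purely illustrative.
Sample = Tuple[float, str]


def in_window(history: List[Sample], now: float, sec: float) -> List[str]:
    """Return the log lines received within the last `sec` seconds."""
    return [line for ts, line in history if now - sec <= ts <= now]


def nodata(history: List[Sample], now: float, sec: float) -> int:
    """Return 1 if nothing arrived in the window, else 0."""
    return 0 if in_window(history, now, sec) else 1


def regexp(history: List[Sample], now: float, pattern: str, sec: float) -> int:
    """Return 1 if any line in the window matches `pattern`, else 0."""
    lines = in_window(history, now, sec)
    return 1 if any(re.search(pattern, line) for line in lines) else 0


def trigger_fires(history: List[Sample], now: float, sec: float = 3600) -> bool:
    """Model of: nodata(3600)=0 & regexp("FATAL",3600)=1."""
    return nodata(history, now, sec) == 0 and regexp(history, now, "FATAL", sec) == 1
```

In this model the trigger is true only while a FATAL line is still inside the one-hour window; once the window empties, both halves change at the same moment.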

    Now, whenever I create this trigger, it will seemingly work -- everything is green. After some time, however, it will "flip" to the error status. To get this trigger back to the "enabled" state, I must simply edit it to...
    Code:
    <<first part>> & {hostname.our.org:log[/var/log/postgres/postgresql-today.log,"(PANIC|WARNING|ERROR|FATAL)"].regexp("FATAL")}=1
    I think we have a confluence of three bugs. First, an expression failure that does not show up at trigger-editing time but only later. Second, insufficient reporting (to the user, in some way) of such a failure. Third, the underlying problem: the broken "evaluation period" feature of the regexp function.

    The same kind of situation happens when the function is count instead of regexp.

    If someone would please take the time to confirm or disconfirm this as a bug, I will follow up accordingly (with the bug-reporting system).
    Last edited by otheus; 22-07-2015, 19:03. Reason: formatting fix
  • ik_zelf
    Member
    • Feb 2015
    • 60

    #2
    Hi, I am experiencing the same issues, and I think this is definitely a bug.

    I did some testing and also noticed that when the agent is restarted, the logfile is read again in full and uploaded to Zabbix, causing false events.

    See also https://www.zabbix.com/forum/showthr...183#post176183


    How can we file a bug for this? I think these things make the logfile monitoring close to useless.

    • Forseti
      Junior Member
      • Apr 2014
      • 9

      #3
      Zabbix Server v2.2.10

      Hi,

      I think I have the same problem here with many triggers.

      Example:
      Code:
      {host1:program.state["Foobar"].nodata(600s)}=1 | {host1:program.state["Foobar"].avg(600s)}>20
      When I use only one part of the expression, it works; with the complete expression, it flips to "unknown" most of the time.

      Any ideas?
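One possible cause, offered as an assumption rather than something confirmed in this thread: when `nodata(600s)` becomes true, the same 600-second window is empty, so `avg(600s)` has nothing to average and cannot be evaluated. If a single unevaluable function makes the whole expression unknown, the combined trigger would flip to "unknown" exactly when the first half should fire. The Python sketch below models that behaviour; the `history` list and all function names are illustrative, not Zabbix internals.

```python
from typing import List, Optional, Tuple

# Each sample is (unix_timestamp, numeric_value); illustrative only.
Sample = Tuple[float, float]


def window(history: List[Sample], now: float, sec: float) -> List[float]:
    """Return the values received within the last `sec` seconds."""
    return [value for ts, value in history if now - sec <= ts <= now]


def nodata(history: List[Sample], now: float, sec: float) -> int:
    """Return 1 if nothing arrived in the window, else 0."""
    return 0 if window(history, now, sec) else 1


def avg(history: List[Sample], now: float, sec: float) -> Optional[float]:
    """Return the window average, or None when there is nothing to average."""
    values = window(history, now, sec)
    if not values:
        return None  # assumption: modelled as "Cannot evaluate function"
    return sum(values) / len(values)


def evaluate(history: List[Sample], now: float, sec: float = 600) -> str:
    """Model of: nodata(600s)=1 | avg(600s)>20."""
    n = nodata(history, now, sec)
    a = avg(history, now, sec)
    # Assumption: one unevaluable function makes the whole expression
    # unknown, even though the other side of the OR is already true.
    if a is None:
        return "UNKNOWN"
    return "PROBLEM" if (n == 1 or a > 20) else "OK"
```

Under these assumptions each half works on its own, but the combination can never report the `nodata` case, which would match the symptom described above.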

      • shaileshsutar88
        Junior Member
        • Apr 2018
        • 1

        #4
        Hi, I am new to Zabbix monitoring. I have four host groups in my Zabbix server UI:
        1. host-all
        2. host-blwe
        3. host-train
        4. host-standby

        I have an external script which I want to run against the host-blwe host group. I also have a template which is linked to all host groups except host-all. I want to add an item that is available only to the host-blwe host group, and a trigger should raise an alert based on the result of that item's script.

        Any ideas how this is done? Also, I've configured the item as an external check script and added a value mapping of 1 --> Down (problem) and 0 --> Up (no problem), but the trigger condition {Template_gateway:zabbix_api.sh["Europe/Zurich"].last()}=1 is not working as expected. It neither shows a problem on the dashboard nor sends a notification. I am passing the timezone as an argument to the script, and when I tail the server logs, they show that the script is running.

        Note: I was not able to post a question anywhere else, which is why I am replying to this thread. I am using zabbix_server 3.4.7.
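For a comparison such as `last()=1` to behave numerically, the external script's standard output generally needs to be a bare number, with the value mapping applied only for display. Below is a minimal Python sketch of that output contract, assuming the item is configured with a numeric type of information. `item_value`, `run_check`, and `probe` are hypothetical names, and the real check performed by `zabbix_api.sh` is not known from this thread, so a placeholder probe is used.

```python
from typing import Callable


def item_value(is_down: bool) -> str:
    """Map the check result to the bare number the trigger compares against."""
    return "1" if is_down else "0"


def run_check(timezone: str, probe: Callable[[str], bool]) -> str:
    """Run a (hypothetical) probe for `timezone` and return the item value.

    A failing probe is reported as a problem ("1") instead of free text, so
    the output always stays a plain number.
    """
    try:
        return item_value(probe(timezone))
    except Exception:
        return "1"


# Placeholder probe that always reports "up"; a real script would do its
# actual check here and print only the resulting number.
print(run_check("Europe/Zurich", lambda tz: False))
```

If the script prints anything besides the number (log lines, labels such as "Down"), the comparison with 1 is a likely place for the trigger to go wrong, so that is worth checking first.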
        Last edited by shaileshsutar88; 11-04-2018, 16:51. Reason: Added server version zabbix_server version.
