
Trigger / action and 'autoclose'

  • hm1266
    Junior Member
    • Jan 2025
    • 4

    #1

    Hi -
    ( my first time on this forum - be nice :-)

    on:
    server: Zabbix 7.0.x,
    client: Red Hat 7.x

    What:
    I am monitoring a logfile. It becomes quite large, with lots of unimportant messages.

    item-
    To find what I need, approx. 50 items are configured (Zabbix active agent, log[logfile,something went wrong..]).

    trigger-
    Each has a trigger: count( item, 1h)>0,
    'PROBLEM event generation mode' = Single
    There is no 'recovery expression', since there is no 'something is ok again' message in the log.

    They trigger an Action that creates a ticket in the enterprise system (Helix plugin).
    This works fine and is very fast.

    Except:
    problems are never closed.
    Therefore, Action is only triggered once, and never again.

    I have tried to add a 'recovery expression' with .nodata(...)}=1, but that does not seem to work on 'log[..' items?
    - perhaps, because there is no further 'items' being generated?
    ( or did I do this wrong ?)

    I did see some very 'hacky' solutions: creating an action that sends an API request to the Zabbix server to close the problem. That does not seem like the 'right' solution..

    I would be very thankful for any hints or experiences...
    /holger
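For reference, the API-based workaround mentioned in the post above would amount to something like the following. This is only a sketch of the request body, assuming the Zabbix API method `event.acknowledge` with the "close problem" bit set in `action`, and a trigger that has 'Allow manual close' enabled. The helper name `build_close_request` and the event ID are hypothetical, and nothing is actually sent.

```python
import json

CLOSE_PROBLEM = 1  # bit in the event.acknowledge "action" bitmask: close problem
ADD_MESSAGE = 4    # bit in the same bitmask: attach a message to the update


def build_close_request(event_id, message="", request_id=1):
    """Build a JSON-RPC 2.0 body asking Zabbix to close one problem event."""
    params = {"eventids": [str(event_id)], "action": CLOSE_PROBLEM}
    if message:
        # Combine the bits so the close operation also carries a comment.
        params["action"] = CLOSE_PROBLEM | ADD_MESSAGE
        params["message"] = message
    return {
        "jsonrpc": "2.0",
        "method": "event.acknowledge",
        "params": params,
        "id": request_id,
    }


# Hypothetical event ID, for illustration only.
print(json.dumps(build_close_request("12345", "closed by action script"), indent=2))
```

An action script would POST this body to the server's api_jsonrpc.php with an API token. It works around the missing recovery condition rather than fixing the trigger, which is why the replies below concentrate on the trigger expression instead.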

  • cyber
    Senior Member
    Zabbix Certified SpecialistZabbix Certified Professional
    • Dec 2006
    • 4806

    #2
    count( item, 1h)>0 should return to false when your item has not received any values for 1h.. Have you waited that long? Is there any such period of time when no data comes in?
    But this logfile trigger issue is old and has a long beard.. I quite often use something like "bytelength(last(/host/item))>0 and nodata(/host/item,5m)=0", which usually closes the problem after 5 minutes, IF there is no new data.. Basically it suppresses new problems until there has been 5m of silence... Depending on the customer's needs, it can be closed earlier or later, but no less than 30s. We forward events out to 3rd party ticketing, so they will get a ticket anyway, and the problem can be closed and be ready for the next one.
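The logic of that expression can be sketched in a few lines of Python. This is a toy model, not Zabbix code: `nodata` and `problem_active` are hypothetical helpers, timestamps are plain seconds, and the exact handling of the window boundary is an assumption.

```python
def nodata(timestamps, now, period):
    """Return 1 if no value arrived in the last `period` seconds, else 0."""
    return 0 if any(now - period < t <= now for t in timestamps) else 1


def problem_active(values, now, period=300):
    """Evaluate 'bytelength(last(...))>0 and nodata(...,5m)=0' at time `now`.

    `values` is a list of (timestamp, log_line) pairs, oldest first.
    """
    seen = [(t, line) for t, line in values if t <= now]
    if not seen:
        return False  # nothing matched yet, so there is nothing to report
    last_line = seen[-1][1]
    timestamps = [t for t, _ in seen]
    has_text = len(last_line.encode("utf-8")) > 0
    return has_text and nodata(timestamps, now, period) == 0


one_line = [(0, "Hello7")]
print(problem_active(one_line, 60))   # inside the 5m window
print(problem_active(one_line, 301))  # after 5m of silence

two_lines = [(0, "Hello7"), (200, "Hello8")]
print(problem_active(two_lines, 450))  # the second line restarted the window
```

The first half of the expression is true as soon as any matching line exists, so the nodata() half is what actually ends the problem once the log has been quiet for the whole period.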


    • hm1266
      Junior Member
      • Jan 2025
      • 4

      #3
      Hi - Thanks for your reply -
      Yes, I'm sure the item is not triggered for long periods (days, weeks)
      - where the item is like log[logfile,something went wrong..]

      It is my understanding that triggers don't get evaluated unless a new item value arrives? At least not if the item is a log[] item?
      Therefore the trigger never evaluates to False?

      Compare that with, say, a trigger like memory_free < 10. This will return False as soon as it sees 11 free.


      • hm1266
        Junior Member
        • Jan 2025
        • 4

        #4
        Hi thanks again, I tried this:

        Host xxxx
        Trigger HM_test_close
        Severity Information
        Problem expression
        bytelength(last(/xxxx/log[/tmp/testfile,Hello]))>0
        Recovery expression
        nodata(/xxxx/log[/tmp/testfile,Hello],5m)=0
        Event generation Normal
        Allow manual close No
        Enabled Yes

        (I replaced the real hostname.domain with xxxx)

        And after echo Hello7 >> /tmp/testfile, it created a 'problem':
        Event HM_test_close
        Operational data
        Hello7
        Severity Information
        Time 2025-01-31 08:23:01
        Acknowledged No
        Tags OS: Linux
        Description
        Rank Cause

        and:
        Event list [previous 20]
        Time Recovery time Status Age Duration Update Actions
        2025-01-31 08:23:01 PROBLEM 13m 42s 13m 42s Update


        Any hint as to what I did wrong, or forgot to do?
        /holger


        • cyber
          Senior Member
          Zabbix Certified SpecialistZabbix Certified Professional
          • Dec 2006
          • 4806

          #5
          Wrong? Using a recovery expression... The RE is only considered after the problem expression has turned FALSE...
          Use both expressions together with AND, in the problem expression.
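The rule described in this reply can be written as a tiny state function. This is a sketch, assuming the behaviour of the 'Recovery expression' mode where a problem is only resolved when the problem expression is FALSE and the recovery expression is TRUE. `next_state` is a hypothetical helper, not part of Zabbix.

```python
def next_state(state, problem_expr, recovery_expr=None):
    """Return the trigger state after one evaluation.

    state:         "OK" or "PROBLEM" before the evaluation
    problem_expr:  result of the problem expression (bool)
    recovery_expr: result of the recovery expression, or None if not configured
    """
    if state == "OK":
        return "PROBLEM" if problem_expr else "OK"
    if recovery_expr is None:
        # Without a recovery expression the problem expression decides alone.
        return "PROBLEM" if problem_expr else "OK"
    # With a recovery expression, both conditions must hold to resolve.
    resolved = (not problem_expr) and recovery_expr
    return "OK" if resolved else "PROBLEM"


# Setup from post #4: bytelength(last(...))>0 stays True once a line has
# matched, so the recovery expression result makes no difference.
print(next_state("PROBLEM", problem_expr=True, recovery_expr=True))
print(next_state("PROBLEM", problem_expr=True, recovery_expr=False))

# Combined expression and no recovery expression: once the nodata() part
# flips, the whole problem expression is False and the problem resolves.
print(next_state("PROBLEM", problem_expr=False))
```

This is why moving the nodata() condition into the problem expression can close the problem, while keeping it as a separate recovery expression next to an always-true problem expression cannot.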


          • hm1266
            Junior Member
            • Jan 2025
            • 4

            #6
            || RE is only considered after problem expression has turned FALSE...
            Oh... OK.
            And since this will never happen (problem expression is always true), I need to add the nodata() condition to the problem expression itself.

            I have tried:

            Host xxxx
            Trigger HM_test_close
            Severity Information
            Problem expression
            bytelength(last(/xxxx/log[/tmp/testfile,Hello]))>0 and nodata(/xxxx/log[/tmp/testfile,Hello],5m)=0
            Recovery expression
            Event generation Normal
            Allow manual close Yes

            Enabled Yes

            It does sound like a roundabout way of using count( item, 5m)!=0, but perhaps there is a subtle difference?


            The good news is, that problems are still triggered:

            Event details
            Event HM_test_close
            Operational data
            Hello8
            Severity Information
            Time 2025-02-06 09:13:46
            Acknowledged No
            Tags OS: Linux
            Description
            Rank Cause



            But they still don't close:

            Event list [previous 20]
            Time Recovery time Status Age Duration Update Actions
            2025-02-06 09:13:46 PROBLEM 24m 1s 24m 1s Update


            Thanks for your support - I am very grateful - I have tried to solve this for more than a year, and your suggestions have explained a lot already.
            /holger


            • cyber
              Senior Member
              Zabbix Certified SpecialistZabbix Certified Professional
              • Dec 2006
              • 4806

              #7
              Was there any data coming in during that 5m? That 5m is reset if something comes in, so there should be total silence for 5m...
              The difference between nodata and count, as in your example, is described here: https://www.zabbix.com/documentation...lculation-time
              Date/time functions and nodata are recalculated every 30s. Others only when a new value arrives...
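The effect of that recalculation rule can be illustrated with a small simulation. It is a toy model under stated assumptions: both triggers use the same condition ("a value arrived within the period") and differ only in when they are evaluated, and the 30s timer is assumed to tick at 0, 30, 60, ... seconds. `transitions` and `in_window` are hypothetical helpers, not Zabbix internals.

```python
def in_window(arrivals, now, period):
    """True if at least one value arrived in the last `period` seconds."""
    return any(now - period < t <= now for t in arrivals)


def transitions(arrivals, horizon, period, timer_step=None):
    """Return [(time, state), ...] for every state change up to `horizon`.

    The condition is evaluated on every value arrival and, when `timer_step`
    is set, additionally every `timer_step` seconds.
    """
    eval_times = {t for t in arrivals if t <= horizon}
    if timer_step:
        eval_times.update(range(0, horizon + 1, timer_step))
    state, changes = "OK", []
    for now in sorted(eval_times):
        new_state = "PROBLEM" if in_window(arrivals, now, period) else "OK"
        if new_state != state:
            changes.append((now, new_state))
            state = new_state
    return changes


arrivals = [10]  # a single matching log line, 10 seconds in

# count()-style trigger: only evaluated when a value arrives.
print(transitions(arrivals, horizon=3600, period=300))

# nodata()-style trigger: also evaluated on the 30s timer.
print(transitions(arrivals, horizon=3600, period=300, timer_step=30))
```

In this model the value-driven trigger goes to PROBLEM at 10s and is never looked at again, while the timer-driven one returns to OK at the first tick after the 5m window has passed (330s). That would be the subtle difference between count( item, 5m)!=0 and the nodata() variant asked about in post #6.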
