Logfile Trigger

This topic has been answered.
  • danielitm
    Junior Member
    • Mar 2025
    • 28

    #1

    Logfile Trigger

    Hello,

    I'm trying to build a trigger for a log file, but unfortunately it keeps triggering incorrectly. I think this happens because the log does not receive any entries for a long time.

    In the end I want to check which statement is last in the log:
    Code:
    Problem expression  
    find(/SERVER/log[LOGFILE],#50,,"Wechsel nach Runlevel 5")=1 or find(/SERVER/log[LOGFILE],#50,,"Wechsel nach Runlevel 4")=1 or find(/SERVER/log[LOGFILE],#50,,"Wechsel nach Runlevel 3")=1 or find(/SERVER/log[LOGFILE],#50,,"Wechsel nach Runlevel 2")=1 or find(/SERVER/log[LOGFILE],#50,,"Wechsel nach Runlevel 1")=1
    Recovery expression
    find(/SERVER/log[LOGFILE],#50,,"Wechsel nach Runlevel 7")=1
  • Answer selected by danielitm at 10-04-2025, 14:47: post #7 by cyber.

    • cyber
      Senior Member
      Zabbix Certified SpecialistZabbix Certified Professional
      • Dec 2006
      • 4807

      #2
      You can shorten it: "find(/SERVER/log[LOGFILE],#50,"regexp","Wechsel nach Runlevel [12345]")=1"

      But how do you expect that recovery to work? Your problem expression says "if we find 5 or 4 or 3 or 2 or 1 in the last 50 values, then trigger", while at the same time the recovery expression says "find 7 in the last 50 values". Both can be true at once, so it does not make sense. It might work if you remove the #50 clause from the recovery expression, so that it resolves only if the last value is 7. Even that is only necessary if the log file contains other kinds of lines besides the ones shown here.
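
To illustrate the conflict described here, the following is a minimal Python sketch of how a `find(...,#N,...)`-style check behaves over a window of recent values. It is a simplified model, not Zabbix's implementation: the helper `find_in_last` is hypothetical, and the sample values are lines from the log excerpt posted in #3 of this thread.

```python
import re

def find_in_last(values, n, pattern):
    # Return 1 if any of the last n values matches the regex, else 0.
    # Hypothetical stand-in for a find(/host/key,#n,"regexp",pattern)
    # check; not part of Zabbix.
    return int(any(re.search(pattern, v) for v in values[-n:]))

# Oldest first; lines copied from the log excerpt in this thread.
history = [
    "2025-04-04 14:46:36.189 [INFO ] Wechsel nach Runlevel 5",
    "2025-04-04 14:46:40.189 [INFO ]       4000 ms   AdapterManager.load",
    "2025-04-04 14:47:27.651 [INFO ] Wechsel nach Runlevel 7",
    "2025-04-04 14:47:36.724 [INFO ] System bereit",
]

problem = find_in_last(history, 50, r"Wechsel nach Runlevel [12345]")
recovery = find_in_last(history, 50, r"Wechsel nach Runlevel 7")
print(problem, recovery)  # both are 1: the window holds runlevel 5 and 7

# Checking only the newest value does not help while unrelated lines
# are collected too: here the newest line is not a runlevel line.
print(find_in_last(history, 1, r"Wechsel nach Runlevel 7"))  # 0
```

In this model the old "Runlevel 5" line keeps the problem condition true until it drops out of the 50-value window, regardless of the later "Runlevel 7" line.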

      • danielitm
        Junior Member
        • Mar 2025
        • 28

        #3
        Thank you very much for the regex, it makes it a lot easier.

        The idea was that if the last event was 1-5, the server should be considered down, and at 7 it is up.
        Since the log is fetched every minute, I thought it would register this immediately when one of the lines comes in, but unfortunately that didn't work. So I entered #50 as a test, thinking that recovery had priority over problems.

        Here is an excerpt from the last 50 log lines:
        Code:
         
        2025-04-07 04:24:34 PM    2025-04-07 04:23:31 PM    
        2025-04-07 16:23:31.038 [WARN ] [connection pool] detected lost object: key = null, value = com.sqlag.tc.repository.Repository$DataStoreConnection
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.724 [INFO ] System bereit
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.724 [INFO ] Dienst für zeitgesteuerte Aufgaben aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.724 [INFO ] Nachrichtenverarbeitung aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.723 [INFO ] Messaging Infrastruktur aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.723 [INFO ] Hochverfügbarkeits-Infrastruktur aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.723 [INFO ] Schnittstelle für externe Systeme aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.723 [INFO ] Adaptersystem aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.723 [INFO ] Basisdienste aktiviert
        2025-04-04 02:48:34 PM    2025-04-04 02:47:36 PM    
        2025-04-04 14:47:36.721 [INFO ]       8324 ms   HttpServer.deploy
        2025-04-04 02:47:34 PM    2025-04-04 02:47:28 PM    
        2025-04-04 14:47:28.381 [INFO ]          3 ms   TransConnect.enterProduction
        2025-04-04 02:47:34 PM    2025-04-04 02:47:28 PM    
        2025-04-04 14:47:28.356 [INFO ]         34 ms   QueueManager.start
        2025-04-04 02:47:34 PM    2025-04-04 02:47:28 PM    
        2025-04-04 14:47:28.322 [INFO ]          1 ms   ClientServices.start
        2025-04-04 02:47:34 PM    2025-04-04 02:47:28 PM    
        2025-04-04 14:47:28.321 [INFO ]          1 ms   LifecycleEventEmitter.enterRunLevel
        2025-04-04 02:47:34 PM    2025-04-04 02:47:28 PM    
        2025-04-04 14:47:28.320 [INFO ]        327 ms   SchedulingService.start
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.984 [INFO ]        248 ms   AdapterManager.run
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.730 [INFO ]         79 ms   MessagePool.start
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.651 [INFO ] Wechsel nach Runlevel 7
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.651 [INFO ] Repository und Benutzerverwaltung aktiviert
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.651 [INFO ] Persistenzsystem aktiviert
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.650 [INFO ]        148 ms   WorkflowEngine.initBAM
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.502 [INFO ]          1 ms   MetaData.startListener
        2025-04-04 02:47:34 PM    2025-04-04 02:47:27 PM    
        2025-04-04 14:47:27.500 [INFO ]        824 ms   QueueManager.init
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.675 [INFO ]         83 ms   LDTMetaData.init
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.592 [INFO ]         27 ms   LifecycleEventEmitter.enterRunLevel
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.564 [INFO ]         19 ms   PersistedProperties.cleanup2
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.545 [INFO ]          0 ms   Repository.autoUpdate
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.545 [INFO ]          1 ms   HCMMetaData.init
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.544 [INFO ]          0 ms   DataStore.closeIndexInfoStatement
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.544 [INFO ]         17 ms   SchedulingService.initialize
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.526 [INFO ]      44717 ms   MessagePool.buildStatistics
        2025-04-04 02:47:34 PM    2025-04-04 02:47:26 PM    
        2025-04-04 14:47:26.064 [DEBUG] 74 unreferenzierte Routing IDs aus dem Data Store gelöscht
        2025-04-04 02:47:34 PM    2025-04-04 02:47:24 PM    
        2025-04-04 14:47:24.488 [DEBUG] 7 unreferenzierte Nachrichtenersteller aus dem Data Store gelöscht
        2025-04-04 02:47:34 PM    2025-04-04 02:47:24 PM    
        2025-04-04 14:47:24.295 [DEBUG] 1 unreferenzierter Nachrichtentyp aus dem Data Store gelöscht
        2025-04-04 02:47:34 PM    2025-04-04 02:46:41 PM    
        2025-04-04 14:46:41.809 [INFO ]          0 ms   KeyStoreManager.initRepoHandler
        2025-04-04 02:47:34 PM    2025-04-04 02:46:41 PM    
        2025-04-04 14:46:41.808 [INFO ]       1618 ms   RoutingTable.registerPort
        2025-04-04 02:47:34 PM    2025-04-04 02:46:40 PM    
        2025-04-04 14:46:40.189 [INFO ]       4000 ms   AdapterManager.load
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.189 [INFO ] Wechsel nach Runlevel 5
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.189 [INFO ] Lizenzierungsdienst aktiviert
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.188 [INFO ] Monitoring aktiviert
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.187 [INFO ] Konfiguration aktiviert
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.187 [INFO ]          2 ms   ServerConfig.startConfigPort
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.186 [INFO ]          6 ms   KeyStoreManager.initialize
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.184 [INFO ] Zertifikatmanager aktiviert
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.179 [INFO ]          0 ms   HL7MetaData.init
        2025-04-04 02:47:34 PM    2025-04-04 02:46:36 PM    
        2025-04-04 14:46:36.177 [INFO ]        460 ms   BundleProvider.loadBundles
        2025-04-04 02:47:34 PM    2025-04-04 02:46:35 PM    
        2025-04-04 14:46:35.718 [INFO ]          2 ms   LifecycleEventEmitter.enterRunLevel
        2025-04-04 02:47:34 PM    2025-04-04 02:46:35 PM    
        2025-04-04 14:46:35.716 [INFO ]        161 ms   InboundAdapter.initJCO
        2025-04-04 02:47:34 PM    2025-04-04 02:46:35 PM    
        2025-04-04 14:46:35.553 [INFO ]         23 ms   LicenseManager.initializePort
        2025-04-04 02:47:34 PM    2025-04-04 02:46:35 PM    
        2025-04-04 14:46:35.530 [INFO ]        509 ms   AdapterManager.loadFactories

        • cyber
          Senior Member
          Zabbix Certified SpecialistZabbix Certified Professional
          • Dec 2006
          • 4807

          #4
          The recovery expression is only considered if the problem expression has already evaluated to false. In your case, with all kinds of other lines being collected as well, it would only evaluate to false if none of the "Wechsel nach Runlevel 1-2-3-4-5" lines appears within the last 50 values.
          You don't really need to pick up all lines, just those that contain the runlevel info. I would probably extract just the number from the line and keep that. Numbers are better for triggers than text. You could then simply have a trigger last(/host/item)<7
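
A rough Python sketch of that idea, assuming the item already stores only the runlevel number. The function `trigger_state` is a hypothetical model of a `last(/host/item)<7` trigger, not Zabbix code, and the "UNKNOWN" result for an empty history is an assumption made only for this sketch.

```python
def trigger_state(values):
    # Model of a last(/host/item)<7 trigger: PROBLEM while the newest
    # stored runlevel is below 7, OK otherwise. Hypothetical helper.
    if not values:
        return "UNKNOWN"  # assumption: no data yet, nothing to evaluate
    return "PROBLEM" if values[-1] < 7 else "OK"

# Runlevels seen in the log excerpt above, oldest first.
stored = []
for runlevel in (5, 7):
    stored.append(runlevel)
    print(runlevel, trigger_state(stored))
# prints "5 PROBLEM", then "7 OK"
```

Because only runlevel changes are stored, the newest value stays the same however long the log is quiet, and no separate recovery expression is needed in this model.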

          • danielitm
            Junior Member
            • Mar 2025
            • 28

            #5
            Originally posted by cyber
            Numbers are better for triggers than text.. You could just have a trigger last(/host/item)<7
            Wouldn't it then also recognize a 7 in the date? That is also in the log file.

              • cyber
                Senior Member
                Zabbix Certified SpecialistZabbix Certified Professional
                • Dec 2006
                • 4807

                #7
                "Extracting" here means preprocessing. It will never pick up anything from the date.
                First, only pick up lines containing "Wechsel nach Runlevel".
                Key
                Code:
                logrt[LOGFILE,"Wechsel nach Runlevel"]
                Then add a preprocessing step to the item -> "regular expression"
                pattern -> Wechsel nach Runlevel (\d)
                Output -> \1
                type of information "numeric (unsigned)"

                After that you only store the runlevel numbers and can build triggers on numbers. You can also put an item on a dashboard that displays the latest status. If you have an explanation for each status, you can also create a value mapping (1 -> broken, 2 -> broken some more, ... 7 -> ok), and the item on the dashboard will be displayed in human-readable form, e.g. "OK (7)".
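
The extraction and value mapping described here can be tried out with a small Python sketch. It only mimics the steps for illustration: `extract_runlevel`, `VALUE_MAP` and `display` are hypothetical names, the mapping labels are the ones suggested in this post, and the real work would be done by the Zabbix item preprocessing and value mapping themselves.

```python
import re

# Same pattern as in the preprocessing step; group 1 is the runlevel digit.
PATTERN = re.compile(r"Wechsel nach Runlevel (\d)")

def extract_runlevel(line):
    # Mimics a "regular expression" step with the first capture group
    # as output. Returns the digit as int, or None when the line does
    # not contain the runlevel text.
    match = PATTERN.search(line)
    return int(match.group(1)) if match else None

# Hypothetical value mapping; levels without an entry show the raw number.
VALUE_MAP = {1: "broken", 2: "broken some more", 7: "OK"}

def display(value):
    label = VALUE_MAP.get(value)
    return f"{label} ({value})" if label else str(value)

# Lines taken from the log excerpt in this thread.
up = "2025-04-04 14:47:27.651 [INFO ] Wechsel nach Runlevel 7"
down = "2025-04-04 14:46:36.189 [INFO ] Wechsel nach Runlevel 5"
other = "2025-04-04 14:47:36.724 [INFO ] System bereit"

print(extract_runlevel(up))           # 7, taken from the text, not the date
print(extract_runlevel(down))         # 5
print(extract_runlevel(other))        # None
print(display(extract_runlevel(up)))  # OK (7)
```

The digits in the timestamp are never captured because the pattern requires the literal text "Wechsel nach Runlevel " directly before the digit.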

                • danielitm
                  Junior Member
                  • Mar 2025
                  • 28

                  #8
                  Thank you very much, that's a really interesting and good approach.
