Create unique problems when source of problem creation is in many logfiles many times

  • mfjensen
    Junior Member
    • Jan 2025
    • 2

    #1

    Create unique problems when source of problem creation is in many logfiles many times


    I need help with the following task:

    I want to monitor a rotating log file named filename-date.log; a new file is generated every day.
    The log file is written by a server application that makes connections to clients.
    If it cannot connect to one or more clients, an error is added to the log.
    After 5 minutes it tries to connect to the client(s) again; if it still cannot connect, another error is added to the log.

    Using the logrt key (Zabbix 6.4.20) and preprocessing, I have succeeded in generating a problem, but only for the latest client that appears in the log, even though several clients failed.

    Example log file:
    2025-01-24 00:01:43.012 [EROR] No such host is known. | Client: TEST123
    2025-01-24 00:01:43.012 [INFO] Will look for TEST123 again in 300 seconds...
    2025-01-24 00:01:43.512 [EROR] No such host is known. | Client: TEST999
    2025-01-24 00:01:43.512 [INFO] Will look for TEST999 again in 300 seconds...
    2025-01-24 00:01:44.112 [EROR] No such host is known. | Client: 555TEST
    2025-01-24 00:01:44.112 [INFO] Will look for 555TEST again in 300 seconds...
    2025-01-24 00:06:43.051 [EROR] No such host is known. | Host: TEST123
    2025-01-24 00:06:43.051 [INFO] Will look for TEST123 again in 300 seconds...
    2025-01-24 00:06:43.551 [EROR] No such host is known. | Host: TEST999
    2025-01-24 00:06:43.551 [INFO] Will look for TEST999 again in 300 seconds.
    2025-01-24 00:06:44.151 [EROR] No such host is known. | Host: 555TEST
    2025-01-24 00:06:44.151 [INFO] Connection OK to 555TEST
    etc...

    So initially (at 00:01) I would like 3 problems to be created:
    1. Unable to connect to client TEST123
    2. Unable to connect to client TEST999
    3. Unable to connect to client 555TEST
    And then (at 00:06), the third problem (connection to 555TEST) should be resolved.
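The behaviour described above (one problem per client, resolved when a "Connection OK" line appears) can be sketched outside Zabbix as a small state tracker. This is only an illustration of the desired outcome, not a Zabbix configuration; the regular expressions and the function name `track_open_problems` are assumptions based on the sample lines.

```python
import re

# Hypothetical patterns derived from the sample log lines in this post.
ERROR_RE = re.compile(r"\[EROR\] No such host is known\. \| (?:Client|Host): (\S+)")
OK_RE = re.compile(r"\[INFO\] Connection OK to (\S+)")


def track_open_problems(lines):
    """Return the set of clients that still have an open problem."""
    open_problems = set()
    for line in lines:
        error = ERROR_RE.search(line)
        if error:
            # A set ignores repeats, so a retry every 5 minutes
            # does not create a duplicate problem for the same client.
            open_problems.add(error.group(1))
            continue
        ok = OK_RE.search(line)
        if ok:
            # A successful connection resolves that client's problem.
            open_problems.discard(ok.group(1))
    return open_problems


at_0001 = [
    "2025-01-24 00:01:43.012 [EROR] No such host is known. | Client: TEST123",
    "2025-01-24 00:01:43.012 [INFO] Will look for TEST123 again in 300 seconds...",
    "2025-01-24 00:01:43.512 [EROR] No such host is known. | Client: TEST999",
    "2025-01-24 00:01:44.112 [EROR] No such host is known. | Client: 555TEST",
]
at_0006 = [
    "2025-01-24 00:06:43.051 [EROR] No such host is known. | Host: TEST123",
    "2025-01-24 00:06:43.551 [EROR] No such host is known. | Host: TEST999",
    "2025-01-24 00:06:44.151 [INFO] Connection OK to 555TEST",
]

print(sorted(track_open_problems(at_0001)))            # three open problems
print(sorted(track_open_problems(at_0001 + at_0006)))  # 555TEST resolved
```

The key point is that the client name, not the log line, identifies a problem.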

    The log goes back many days, so there are lots of "[EROR] No such host" entries in the files.
    I have succeeded in setting up the preprocessing so that it only looks at the last 10 minutes of the log files.
    I have also been getting problems created, but only 1 problem. If I allow multiple problems to be created, I get many problems, including many duplicates.
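For reference, the "last 10 minutes" filtering mentioned above might look roughly like this when written as a standalone script. It is a sketch under the assumption that every relevant line starts with a timestamp in the format shown in the sample log; `recent_lines` and its parameters are hypothetical names, and this is not how Zabbix preprocessing itself is written.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed from the sample log, e.g. "2025-01-24 00:01:43.012".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TIMESTAMP_LENGTH = 23


def recent_lines(lines, now, window=timedelta(minutes=10)):
    """Keep only lines whose leading timestamp lies within `window` before `now`."""
    recent = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # no parsable timestamp at the start of the line
        if timedelta(0) <= now - stamp <= window:
            recent.append(line)
    return recent


log = [
    "2025-01-23 12:00:00.000 [EROR] No such host is known. | Client: TEST123",
    "2025-01-24 00:01:43.012 [EROR] No such host is known. | Client: TEST999",
    "not a log line",
]
now = datetime(2025, 1, 24, 0, 6, 45)
print(recent_lines(log, now))  # only the 00:01:43 line is kept
```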

    Without going too much into what I have tried so far: how would you solve such a monitoring task? Are there different approaches?

    Thanks for helping out!

    Br
    Michael
  • cyber
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Dec 2006
    • 4806

    #2
    Extract hostnames into tags. (It would be easier if the log were always consistent, using either Client or Host and never leaving the label out, so it would be easier to match with regexes...)
    Experiment a bit with correlation rules...

    This timing does not make sense either:
    2025-01-24 00:06:44.151 [EROR] No such host is known. | Host: 555TEST
    2025-01-24 00:06:44.151 [INFO] Connection OK to 555TEST

    Same millisecond: the host is not there, and then suddenly it's OK...
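To illustrate the point about inconsistent labels, here is a rough sketch of a single pattern that accepts both "Client:" and "Host:" as well as the unlabelled "Connection OK to" form. The pattern and the `classify` helper are assumptions built from the sample lines in this thread, not something taken from a Zabbix template.

```python
import re

# One pattern for both error labels and for the recovery line.
HOST_RE = re.compile(
    r"\[EROR\] .*\| (?:Client|Host): (?P<failed>\S+)"
    r"|\[INFO\] Connection OK to (?P<recovered>\S+)"
)


def classify(line):
    """Return ("problem", host), ("ok", host), or None for irrelevant lines."""
    match = HOST_RE.search(line)
    if match is None:
        return None
    if match.group("failed"):
        return ("problem", match.group("failed"))
    return ("ok", match.group("recovered"))


print(classify("2025-01-24 00:01:43.012 [EROR] No such host is known. | Client: TEST123"))
print(classify("2025-01-24 00:06:44.151 [EROR] No such host is known. | Host: 555TEST"))
print(classify("2025-01-24 00:06:44.151 [INFO] Connection OK to 555TEST"))
print(classify("2025-01-24 00:01:43.012 [INFO] Will look for TEST123 again in 300 seconds..."))
```

Whatever the label, the same host name comes out, so it can serve as the tag value for both the problem and the recovery.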


    • mfjensen
      Junior Member
      • Jan 2025
      • 2

      #3
      Thanks for the reply.

      The timing in my manually assembled log example is not perfect, as I can see now that you mention it. It should be like this:

      2025-01-24 00:01:43.012 [EROR] No such host is known. | Client: TEST123
      2025-01-24 00:01:43.012 [INFO] Will look for TEST123 again in 300 seconds...
      2025-01-24 00:01:43.512 [EROR] No such host is known. | Client: TEST999
      2025-01-24 00:01:43.512 [INFO] Will look for TEST999 again in 300 seconds...
      2025-01-24 00:01:44.112 [EROR] No such host is known. | Client: 555TEST
      2025-01-24 00:01:44.112 [INFO] Will look for 555TEST again in 300 seconds...
      2025-01-24 00:06:43.051 [EROR] No such host is known. | Host: TEST123
      2025-01-24 00:06:43.051 [INFO] Will look for TEST123 again in 300 seconds...
      2025-01-24 00:06:43.551 [EROR] No such host is known. | Host: TEST999
      2025-01-24 00:06:43.551 [INFO] Will look for TEST999 again in 300 seconds.
      2025-01-24 00:06:44.151 [INFO] Connection OK to 555TEST

      After that, the 555TEST host will not appear in the log again until the "No such host is known" error shows up again.

      Currently I do extract the hostname into tags: the problems created have a tag named HOST with the host name as its value (e.g. HOST: 555TEST), and I have configured an event correlation rule to close the new event if its tag value is the same as the old event's tag value. But again, when I enable creation of multiple problems, I get flooded with problems regarding the same host.
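The intent of that correlation rule can be modelled as follows. This is only a toy model of the logic described above (close a new event when an open event with the same HOST tag value already exists); it does not reproduce how Zabbix evaluates correlation rules, and the `Event` class and `correlate` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int
    host: str          # value of the HOST tag
    is_open: bool = True


def correlate(hosts_in_error, close_duplicates):
    """Create one event per error line and return the events left open."""
    events = []
    for event_id, host in enumerate(hosts_in_error, start=1):
        new_event = Event(event_id, host)
        # Rule being modelled: close the new event when an older open
        # event carries the same HOST tag value.
        if close_duplicates and any(e.is_open and e.host == host for e in events):
            new_event.is_open = False
        events.append(new_event)
    return [e for e in events if e.is_open]


# Two polling rounds: the second repeats TEST123 and TEST999.
errors = ["TEST123", "TEST999", "555TEST", "TEST123", "TEST999"]
print(len(correlate(errors, close_duplicates=False)))             # every line stays open
print([e.host for e in correlate(errors, close_duplicates=True)])  # one per host
```

If the real rule behaves like the second call, each host should end up with a single open problem; a flood like the first call suggests the rule is not matching the events as intended.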

      Br
      Michael
