Log file monitoring and alerting..

  • joshuamcdo
    Member
    • Nov 2013
    • 76

    #1

    I need some insight into what I am doing wrong here. I have read until I can't read anymore and it's still just not clicking for me. I have also google-fu'd and stalked the forums and still haven't found anything that fits what I am trying to do.

    Background :
    Zabbix server : 3.0.7 (local mariaDB)
    Zabbix Agent : 3.0.7

    O/S : RHEL 7 on both ends

    I created a template named : Template Linux OS Logging Extras

    I created a few items in that template, and all of them capture the expected events from /var/log/secure in the history. I will lay out one of them for the sake of this post.

    Name "Secure log SSHD AUTHENTICATION FAILURES (/var/log/secure)"
    Type "Zabbix Agent(Active)"
    Key: "log[/var/log/secure,"pam_unix\(sshd:auth\): authentication failure;"]"
    Type of information : Log
    Update interval: 10
    History storage period (in days): 90
    Log time format: yy-MM-ddThh:mm:ss

    Description : Tracks /var/log/secure for failed logins or the word failure.

    Note: ** I kind of think the time stamp should be yyyy instead of yy because it starts out "2017-". I will look at that later, but it's working.

    I then created the following regexp...

    Name: "loginfail"
    Expression type: "Result is TRUE" Expression: "Authentication failure for"

    ** Note: I am not sure I needed a regexp here looking at it again.. But it also works. Or it worked anyway..



    I then created the following trigger...

    Name: "Log: Failed logins"
    Expression:"{Template OS Linux Logging Extras:log[/var/log/secure,"pam_unix\(sshd:auth\): authentication failure;"].regexp(loginfail)}=0"
    (Multiple problem event generation is not checked)
    Description: Throws an alert using the log file watch for /var/log/secure to detect root logins. This includes sudo to root
    (There are no dependencies)


    So... this works. However, once it fired once, it won't fire again, and it keeps the alert on the dashboard basically forever. I did somehow manage to make it go away once, then I sudoed to root and boom, it picked it up again. That one has been on the dashboard for about 13 hours now; it won't go away, and no further failed logins raise an alert. They are captured as item values but do not fire the trigger. I am just stuck with the previously triggered problem, and nothing new triggers.

    So the capturing of events is working correctly every time..

    2017-04-23 22:01:25 somehostname: Secure log SSHD AUTHENTICATION FAILURES (/var/log/secure) : 2017-04-24T03:00:35.979317+00:00 somehostname sshd[29084]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.1.1.1

    Just no triggering anymore.. Please ...---...

    Thanks,
    J
  • joshuamcdo
    Member
    • Nov 2013
    • 76

    #2
    I upgraded to 3.2 and that solves most of my problems but not all.

    If I generate multiple events that should set off multiple triggering events, the problem count is always 1, never the actual count, which is deceptive and defeats the purpose of tracking these events.

    The other issue is that I can't get this trigger to pick up at all.

    {Template OS Linux Logging Extras:log[/var/log/secure,Failed keyboard-interactive\/pam for invalid user].last()}=0

    The log file is showing the following error..

    26849:20170424:022505.822 End of evaluate() error:'Cannot evaluate expression: expected numeric token at "24T06:24:14.520443+00:00 somehostname sshd[659]: Failed keyboard-interactive/pam for invalid user boo from 10.1.1.1 port 59296 ssh2)=0".'

    Any advice?

    Thanks in advance.
    J
    Last edited by joshuamcdo; 24-04-2017, 08:27.


    • joshuamcdo
      Member
      • Nov 2013
      • 76

      #3
      Bump... Anyone?

      I am still struggling to understand this. I think the answer lies in how the triggers are set up, but I can't seem to pinpoint it.


      • joshuamcdo
        Member
        • Nov 2013
        • 76

        #4
        Still struggling with this a tad.
        Anyone?


        • joshuamcdo
          Member
          • Nov 2013
          • 76

          #5
          I have made some progress...

          I set up the following item:

          Name : Apache base error log
          Type: Zabbix agent active

          Key : log[/var/log/httpd/error_log,error]
          Log time format : [ddd mmm dd hh:mm:ss yyyy]

          It works and picks up the logs..

          The trigger works but it never goes away...

          Trigger :

          {TEMPLATE_APACHE_EXTRAS:log[/var/log/httpd/error_log,error].regexp(\[error\].*ajp_ilink_receive)}=1

          That picks up the log file entry no problem..

          What I am trying to figure out is something like this..

          {TEMPLATE_APACHE_EXTRAS:log[/var/log/httpd/error_log,error].regexp(\[error\].*ajp_ilink_receive)}=1 but clear if no data is received for 5 mins..

          Thoughts?
          Anyone (screams)

          Update: So {TEMPLATE_APACHE_EXTRAS:log[/var/log/httpd/error_log,error].regexp(\[error\].*ajp_ilink_receive)}=1 was working, but that doesn't seem to be the case now.
          Last edited by joshuamcdo; 12-12-2017, 19:44.


          • joshuamcdo
            Member
            • Nov 2013
            • 76

            #6
            This just can't be that complex.


            • VladM
              Junior Member
              • Dec 2017
              • 2

              #7
              dang

              your thread is exactly what I am trying to figure out for myself too

              I'm waiting for admins to accept my thread too.


              However, I am trying to check if there's any possible way to extract "Local Time" from log item and create a trigger from that.

              In your case for example:
              if [last item LOCAL TIME] - [ 5 mins ] > 0 -> trigger ON

              I'll let you know if I manage to find something out.


              • dimir
                Zabbix developer
                • Apr 2011
                • 1080

                #8
                Before configuring Zabbix in this case you need to answer a few questions.

                1. Is it a problem if you have one or more failed logins in the last 5 minutes?
                2. Is it a problem when the last entry in the log file is "failed login"?
                3. Is it a problem if the log file contains one or more failed logins?

                Then you need to decide how you want to deal with the problem.

                1. If you want a trigger to close automatically, also collect the "good" lines and define what counts as a "failed login" line at the trigger level. The "good" lines will then close the problem.
                2. If you want a trigger to close automatically based on time, add a nodata function to your trigger expression. This will also cover situations when there are no log entries at all for a longer period.
                3. If you want to close a problem manually, you can do that too since version 3.2.
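
                A minimal sketch of option 2, using the Apache item from post #5. Treat it as an illustration rather than a tested configuration: the 300-second window and the shortened pattern ajp_ilink_receive are assumptions chosen for the example, and the syntax is the 3.x trigger expression style used elsewhere in this thread.

                ```
                {TEMPLATE_APACHE_EXTRAS:log[/var/log/httpd/error_log,error].regexp(ajp_ilink_receive)}=1 and {TEMPLATE_APACHE_EXTRAS:log[/var/log/httpd/error_log,error].nodata(300)}=0
                ```

                The first half is true while the most recent collected line matches the pattern. The second half is true only while the item has received data within the last 300 seconds. Once five minutes pass with no new lines collected by the item, nodata(300) returns 1 and the trigger should go back to OK.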
                Last edited by dimir; 31-08-2022, 12:08.
