Question re: Using the log[*] Key

  • zab_monkey
    Member
    • Mar 2010
    • 37

    #1

    Hello all,

    I have set up an item that uses the active check log[<file>,<pattern>...].

    The Key looks thus:

    log[/tmp/syslog.log,Notification]

    I have a Trigger set up, looking for the regexp 'Notification' and it all works great, except for one bit.

    Upon reading the manual, the explanation about 'log' says that it will use 2 things from the DB to determine whether it has read the most recent iteration of the file or not, 1 being the file size (lastlogsize) and the second being a time stamp (lastclock).

    Basically, what I am getting is every iteration within the file being reported on, not just the last. The manual seems to imply this shouldn't be the case.

    For instance, the syslog may have a line item that reads:

    # Notification : LINE 1

    Then, the trigger will activate and I will get an alert, with all the detail, which is great. However, if another notification occurs in the log file, for example.

    # Notification : LINE 1
    # Everything is ok
    # Notification : LINE 3

    It will record, in its history, both lines, the new one AND the old one again. So I get 2 email notifications since it has found 2 again.

    I checked the DB and can confirm that the 'lastclock' AND 'lastlogsize' do update as they should, but it still alerts me to every iteration in the log file every time a new one is found. (So you can imagine, if I have 60 Notifications, I don't want to get 60 alerts all over again when it finds another one. And I don't want to fill up the history records with ones I already know about.)

    So, the questions are:

    1. Do I have something missing from my KEY to say that it should only look for new iterations since last check?
    2. Is there some other caveat I have forgotten to include?

    Any direction would be greatly appreciated.

    Regards,

    JC

    P.S. Current version is 1.8.3 for both server and agent; that's probably important.
  • zab_monkey
    Member
    • Mar 2010
    • 37

    #2
    Richlv? Anyone got a hint?


    • zalex_ua
      Senior Member
      Zabbix Certified Trainer
      Zabbix Certified Specialist
      Zabbix Certified Professional
      • Oct 2009
      • 1286

      #3
      Originally posted by zab_monkey
      Hello all,

      Then, the trigger will activate and I will get an alert, with all the detail, which is great. However, if another notification occurs in the log file, for example.

      Any direction would be greatly appreciated.
      The second message - the message with the status OK? Right?
      If yes, then read it and ponder in this one post


      Your trigger probably switches from PROBLEM to OK state on text that is NOT "Notification".


      • zab_monkey
        Member
        • Mar 2010
        • 37

        #4
        Hi Zalex,

        Thanks for the reply, but sadly that isn't the issue. The OK messages/notifications don't even come into play yet. This is just PROBLEM notifications.

        I'll try a calc and maybe it will clarify it a little better:

        Let's say I have a log file that gets an error.
        eg:
        # Line 1: error

        I will then get an email notification saying (in very basic terms) "Logfile : PROBLEM : Line 1: error"

        So, my trigger has worked, my dashboard has the trigger status shown, and I have received an email notification. All as it should be, yes?

        Here's where the problem starts, and if you refer to my above post, I have delved into the DB and manual in some detail to get to the root, with little luck.

        So, my logfile says:
        # Line 1: error

        Let's say another error occurs:

        # Line 1: error
        # Line 2: error

        Now, I get another 2 email notifications:

        One email saying "PROBLEM : Line 1 : error", and a second email saying "PROBLEM : Line 2 : error"

        So, I now have 3 emails. 1 from the first time the trigger activated, then 2 emails the second time (3 emails total)

        Then let's add a 3rd line:

        # Line 1: error
        # Line 2: error
        # Line 3: error

        Now, I get ANOTHER 3 emails. One email saying "PROBLEM : Line 1 : error", a second email saying "PROBLEM : Line 2 : error" and a third saying "PROBLEM : Line 3 : error"

        So in total, I now have received 6 emails, all PROBLEM status. The OK status isn't relevant here.

        So, as you can imagine, by a little calculation, if a 4th line appears, I will get another 4 emails, 10 in total.

        5 errors = 5 more emails, 15 emails total
        6 errors = 6 more emails, 21 emails total
        7 errors = 7 more emails, 28 emails total
        8 errors = 8 more emails, 36 emails total
        9 errors = 9 more emails, 45 emails total

        and so on, and so forth.
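The totals above follow the triangular-number pattern 1 + 2 + ... + n = n(n+1)/2. A minimal Python sketch of that arithmetic, assuming lines are appended one at a time and every check re-reports all matching lines seen so far (`emails_received` is a hypothetical helper name, not anything from Zabbix):

```python
def emails_received(n_errors, rereads_whole_file=True):
    """Total notifications after n_errors matching lines, appended one by one."""
    if rereads_whole_file:
        # Each new line causes every matching line so far to be reported again:
        # 1 + 2 + ... + n = n * (n + 1) / 2
        return n_errors * (n_errors + 1) // 2
    # If only the newly appended line were reported, the total is just n.
    return n_errors


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, emails_received(n), emails_received(n, rereads_whole_file=False))
```

With 9 matching lines this gives 45 notifications when everything is re-reported, versus 9 if only new lines were reported.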

        So you can see, in this case, we would be loath to see a number of errors occur in a short space of time, as I get a flood of messages telling me about errors it has already told me about.

        So, as I say above, the manual's description of how the log[*] key works seems to say it shouldn't do this based on the records checked in the DB, but this doesn't appear to work. The DB is being updated every time, but I just can't have 60 emails coming through telling me about 59 earlier errors I already know about.

        Any clearer?

        Cheers,

        JC


        • zalex_ua
          Senior Member
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Oct 2009
          • 1286

          #5
          Is your trigger's Event generation set to 'Normal + Multiple PROBLEM events'?
          Do you understand what this option means?

          Originally posted by zab_monkey
          Then let's add a 3rd line:

          # Line 1: error
          # Line 2: error
          # Line 3: error

          Now, I get ANOTHER 3 emails. One email saying "PROBLEM : Line 1 : error", a second email saying "PROBLEM : Line 2 : error" and a third saying "PROBLEM : Line 3 : error"

          So in total, I now have received 6 emails, all PROBLEM status. The OK status isn't relevant here.

          So, as you can imagine, by a little calculation, if a 4th line appears, I will get another 4 emails, 10 in total.
          I do not see any problem here. This is the expected behavior.

          Originally posted by zab_monkey
          So you can see, in this case, we would be loath to see a number of errors occur in a short space of time, as I get a flood of messages telling me about errors it has already told me about.
          If I understand correctly

          Feel free to vote

          Originally posted by zab_monkey
          Hi Zalex,
          So, as I say above, the manual's description of how the log[*] key works seems to say it shouldn't do this based on the records checked in the DB, but this doesn't appear to work. The DB is being updated every time, but I just can't have 60 emails coming through telling me about 59 earlier errors I already know about.
          As I said, the log[*] key works as expected.
          What does the database have to do with it? The DB has no relation to the possible problem.
          I'm sorry, but I cannot understand what you meant to say or what your problem is at all. Perhaps because of my English, and maybe because of yours.


          • zab_monkey
            Member
            • Mar 2010
            • 37

            #6
            Zalex,

            You're right, you don't understand the problem. Also, English is my first language, and I understand you just fine.

            I understand English isn't your first language, but you are coming across as very patronizing, and all I was looking for was some support from this support forum.

            If you think that a log file monitor should send 6 PROBLEM emails for only 3 errors, or regexp matches, then you and I don't agree on what expected behavior is. Or for that matter 45 PROBLEM emails for 9 matches, and the 'Normal + Multiple PROBLEM events' doesn't change the behavior. I did read about this and try it before I posted here. Sadly, it doesn't help.

            Let me show you the text from the manual:

            _______________________________________
            10.2. How it works

            Monitoring of log files requires Zabbix Agent running on a host. An item used for monitoring of a log file must have type Zabbix Agent (Active), its value type must be Log and key set to log[file,<pattern>,<encoding>,<max lines>] or logrt[path to log file with filename format,<pattern>,<encoding>,<max lines>].

            For example:

            log["/home/user/file.log","pattern_to_match","UTF-8",100]
            or
            logrt["/home/user/filelog_.*_[0-9]{1,3}","pattern_to_match","UTF-8",100]

            The last one will collect data from files such as “filelog_abc_1” or “filelog__001”.

            Important notes:

            * The server and agent keep a trace of the monitored log's size and last modification time (for logrt) in two counters.
            * The agent starts reading the log file from the point it stopped the previous time.
            * The number of bytes already analyzed (the size counter) and the last modification time (the time counter) are stored in the Zabbix database and are sent to the agent, to make sure it starts reading the log file from this point.
            * Whenever the log file becomes smaller than the log size counter known by the agent, the counter is reset to zero and the agent starts reading the log file from the beginning taking the time counter into account.

            _____________________________________________
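The behaviour the manual describes can be sketched roughly as follows. This is only a Python illustration of offset-based reading, not the agent's actual code: `read_new_lines` is a hypothetical name, `last_size` stands in for the stored size counter (lastlogsize), and the time counter is left out for simplicity.

```python
import os
import re


def read_new_lines(path, last_size, pattern=None):
    """Return (matching_new_lines, new_last_size).

    Only bytes past last_size are read. If the file is now smaller than
    last_size, it is assumed to have been truncated and is read from 0.
    """
    if os.path.getsize(path) < last_size:
        last_size = 0  # file shrank: reset the size counter

    with open(path, "rb") as fh:
        fh.seek(last_size)
        data = fh.read()

    # Consume only complete lines; a partial last line waits for the next check.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    if pattern is not None:
        lines = [line for line in lines if re.search(pattern, line)]
    return lines, last_size + end
```

Called repeatedly with the offset it returned last time, this yields each line once. Passing 0 on every call would reproduce the repeated reports described in this thread.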

            Do you see that the log check is supposed to record in the database the last time it read the file, and how big the file was, so that it DOESN'T read the lines it has already read? That's the problem: mine does re-read them.
            That's also why the Database matters. Please understand that it is relevant.

            The database is recording these things correctly; the item is just ignoring them when, as you can plainly see in the manual, it shouldn't be.

            I don't think I need to repeat the problem, I just don't think you quite understand what my problem is.

            Regards,

            JC


            • zalex_ua
              Senior Member
              Zabbix Certified Trainer
              Zabbix Certified Specialist
              Zabbix Certified Professional
              • Oct 2009
              • 1286

              #7
              Originally posted by zab_monkey
              Then let's add a 3rd line:

              # Line 1: error
              # Line 2: error
              # Line 3: error

              Now, I get ANOTHER 3 emails. One email saying "PROBLEM : Line 1 : error", a second email saying "PROBLEM : Line 2 : error" and a third saying "PROBLEM : Line 3 : error"
              I want to clarify.
              Maybe I just did not properly understand the important part.
              Are you adding the third line to the existing two?

              If so, why does your example show three lines? To be clear, it should show just one line!

              Or do you add 2, 3, ... new lines each time to the already existing ones?


              What does the file 'syslog.log' look like?
              Like this?
              Code:
              # Line 1: error
              # Line 2: error
              # Line 3: error
              # Line 4: error
              # Line 5: error
              # Line 6: error
              # Line 7: error
              # Line 8: error
              # Line 9: error
              Or like this?
              Code:
              # Line 1: error
              # Line 1: error
              # Line 2: error
              # Line 1: error
              # Line 2: error
              # Line 3: error
              # Line 1: error
              # Line 2: error
              # Line 3: error
              # Line 4: error
              # Line 1: error
              # Line 2: error
              # Line 3: error
              # Line 4: error
              # Line 5: error
              I had never used the log[] key before, but I checked just now and everything works as expected.
              I hope you did not restart the zabbix_agent and did not change the key during the experiments?
              Read this to understand some of the nuances associated with 'lastlogsize' https://support.zabbix.com/browse/ZBXNEXT-444
              Last edited by zalex_ua; 16-09-2010, 12:42.


              • zab_monkey
                Member
                • Mar 2010
                • 37

                #8
                Hi Zalex,

                Thanks for sticking with this issue with me

                I have read the post on the nuances of 'lastlogsize', and was aware of how it behaved before I posted, although the post does confirm that 'lastlogsize' is meant to work the way I expect it to.

                The post refers to occasions and incidents that causes 'lastlogsize' to reset to 0, and the behaviours it will exhibit when it does so.

                As I said, I have been through how 'lastlogsize' is behaving for me and it is doing what it is meant to. Problem is, my notifications are behaving as though the 'lastlogsize' value is being reset to 0, except it isn't. Does that make sense?

                To also answer your question:
                "I hope you did not restart the zabbix_agent and did not change the key during the experiments?"

                No, I didn't restart, nor did I change the key when watching this behavior. As I said, I understand 'how' it's meant to work, but mine is doing something it shouldn't be.

                Now, let's say that I have a process that writes to a log file and it writes the log like your example of the 9 lines:

                Code:

                # Line 1: error
                # Line 2: error
                # Line 3: error
                # Line 4: error
                # Line 5: error
                # Line 6: error
                # Line 7: error
                # Line 8: error
                # Line 9: error

                I would get 45 email notifications. That's with no restarts to the agent, no changes to the key, no changes in the conditions. If I am looking for the 'error' regexp, using the log[/tmp/syslog.log,error] key for example, I will get 45 emails if my log file looked like above.

                Any clearer?

                Thanks,

                JC


                • zalex_ua
                  Senior Member
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist
                  Zabbix Certified Professional
                  • Oct 2009
                  • 1286

                  #9
                  Originally posted by zab_monkey
                  Hi Zalex,

                  Thanks for sticking with this issue with me
                  Heh, yes, my patience is indeed almost at an end

                  Originally posted by zab_monkey
                  I would get 45 email notifications. That's with no restarts to the agent, no changes to the key, no changes in the conditions. If I am looking for the 'error' regexp, using the log[/tmp/syslog.log,error] key for example, I will get 45 emails if my log file looked like above.
                  Ok, what is the OS on the host where the zabbix_agent is working?

                  Do this:
                  1. Stop the agent.
                  2. Clear the 'syslog.log' file.
                  3. Re-save the item key several times to reset 'lastlogsize' to 0 in the DB.
                  4. Disable all items (for this host) except the one we have problems with.
                  5. Set debug level=4 for the agent and start the agent.
                  6. Add 5 lines as you did before, one at a time with an interval of 10-20 seconds.
                  7. Wait 2 minutes, then stop the agent.
                  8. Send me the zabbix_agent debug log. See my profile for my email address.
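The "add lines at an interval" part of the procedure above can be made repeatable with a small helper. This is a sketch under assumptions: `append_test_lines` is a hypothetical name, the line text mirrors the examples used earlier in this thread, and the path is whatever file the item monitors.

```python
import time


def append_test_lines(path, count=5, interval=15.0, template="# Line {n}: error"):
    """Append `count` lines to `path`, pausing `interval` seconds between them."""
    for n in range(1, count + 1):
        # Reopen in append mode each time, as a separate logging process would.
        with open(path, "a", encoding="utf-8") as fh:
            fh.write(template.format(n=n) + "\n")
        if n < count:
            time.sleep(interval)
```

Running `append_test_lines("/tmp/syslog.log")` once the agent has started would add five lines roughly 15 seconds apart, so each agent check should see at most one new line.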


                  • richlv
                    Senior Member
                    Zabbix Certified Trainer
                    Zabbix Certified Specialist
                    Zabbix Certified Professional
                    • Oct 2005
                    • 3112

                    #10
                    this all does sound a bit weird (hey, i look at private messages quite rarely )

                    my first guess - do you have any escalations or recovery messages enabled on the action that sends these messages ? if yes, does removing them solve the problem ?

                    additionally, look at few other locations :

                    1. item history. does it only contain each line once, or are lines repeated ?

                    2. event history for that event. does it have only one PROBLEM event for each line, or are there more than you expect to see ?
                    Zabbix 3.0 Network Monitoring book
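The item-history check in point 1 can be done mechanically once the history values are available as a list. A minimal Python sketch, assuming each log line is unique in the file (for example, it carries a line number or timestamp), so that any repeat means the line was read more than once; `repeated_lines` is a hypothetical helper, and how the history gets exported is left out:

```python
from collections import Counter


def repeated_lines(history_values):
    """Return {line: count} for every history value stored more than once."""
    counts = Counter(history_values)
    return {line: n for line, n in counts.items() if n > 1}


if __name__ == "__main__":
    # History as it would look if the whole file were re-read on each check.
    history = [
        "# Line 1: error",
        "# Line 1: error",
        "# Line 2: error",
    ]
    print(repeated_lines(history))
```

An empty result would suggest the item stores each line once and the duplication happens later, at the event or action level.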


                    • zab_monkey
                      Member
                      • Mar 2010
                      • 37

                      #11
                      Hey Rich,

                      Thanks for taking a look.

                      Firstly, the escalations aren't enabled. Thought maybe I had done something as daft as that, but thankfully it was off. Also, the Action only has one Operation, which is to send a notification to me.

                      To answer the numbered questions:

                      1. It has the lines repeated, so it definitely appears to be reading the whole file every time, and picking up the correct number of iterations of the error. Hence I thought maybe I had the log[*] key set wrong initially.

                      2. That's part of the odd bit, it has only one PROBLEM event, and says it performed the assigned action only once (tho, I suppose if the system was legitimately thinking it was to read the whole file every time, it may expect that it is only performing the 1 action, just 1 action for all its results).

                      As I say above, it almost seems as if the item is ignoring the lastlogsize and lastclock records in the DB and is just reading the whole log file every time. I am starting to think I have a defect on my hands rather than this being any kind of issue within Zabbix itself. Not sure.

                      Cheers,

                      JC


                      • zalex_ua
                        Senior Member
                        Zabbix Certified Trainer
                        Zabbix Certified Specialist
                        Zabbix Certified Professional
                        • Oct 2009
                        • 1286

                        #12
                        Originally posted by zab_monkey
                        To answer the numbered questions:
                        Where is the answer to my question?
                        And where is the debug log that I requested?
                        I am almost sure that the problem is on the agent side.
                        For the log item key, what matters is the file size in bytes, which the agent checks every update interval (in seconds).
                        Maybe your OS for some reason cannot determine the correct file size?
                        A crazy idea, of course, but it is possible.


                        • zab_monkey
                          Member
                          • Mar 2010
                          • 37

                          #13
                          Oh Zalex,

                          I hadn't forgotten you, I just haven't been back on to that issue in the last few days. I will try and get the info for you tomorrow.

                          I have tried this on 2 OS's now, 1x HP-UX 11v3 and 1x CentOS 5.5. However, I am not sure I got far testing on the HP-UX box.

                          That said, I will get back to you

                          Thanks again!!

                          JC
