log[/oracle/alert.log,ORA-] - how to trigger

  • ik_zelf
    Member
    • Feb 2015
    • 60

    #1

    log[/oracle/alert.log,ORA-] - how to trigger

    Hi,

    I found a way to discover all Oracle alert log files on a host. I configured them like this: log[/oracle/instance/alert.log,ORA-], and I see all lines containing ORA- messages from the alert log appearing in the item's history, as I hoped.

    Now I want to create triggers for various situations, such as one that fires when we get an ORA-00600 or ORA-07445 (internal errors):

    {template zbx_alertlog:log[{#ALERTLOG},ORA-].str("ORA-(27093|17624)")}<>0

    or

    {template zbx_alertlog:log[{#ALERTLOG},ORA-].str("ORA-01008:*")}<>0

    The idea is to trigger when a message matches.
    Is this supposed to work this way?
    I don't get any alerts, although I am sure the messages arrive.

    Can you help me?
    thanks,
  • qix
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Oct 2006
    • 423

    #2
    To match multiple events in one trigger expression, it is better to use regexp() instead of str().

    I don't think str() supports wildcards (*); it already matches a substring within a value.
    With kind regards,

    Raymond
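
A minimal Python sketch of the distinction made in post #2, assuming str() performs a literal substring test as described there. The helper names str_like and regexp_like and the sample log line are made up for illustration, and Python's re module is only a stand-in for whatever regular expression engine Zabbix uses.

```python
import re


def str_like(value: str, needle: str) -> int:
    """Literal substring test, the behaviour post #2 attributes to str()."""
    return 1 if needle in value else 0


def regexp_like(value: str, pattern: str) -> int:
    """Regular-expression search anywhere in the value."""
    return 1 if re.search(pattern, value) else 0


# Hypothetical alert log line, made up for this example.
line = "Sun Apr 12 10:00:00 2015 ORA-00600: internal error code"

print(str_like(line, "ORA-(00600|07445)"))     # 0: parentheses and | are taken literally
print(str_like(line, "ORA-00600:*"))           # 0: the * is taken literally
print(str_like(line, "ORA-00600:"))            # 1: plain substring match
print(regexp_like(line, "ORA-(00600|07445)"))  # 1: alternation matches
```

Under that assumption, a pattern with alternation only makes sense with regexp(), and a trailing * adds nothing to a str() pattern.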

    • ik_zelf
      Member
      • Feb 2015
      • 60

      #3
      Thanks for your help Raymond,

      so for a simple value I have:

      {template zbx_alertlog:log[{#ALERTLOG},ORA-].str("ORA-12012:",2h)}<>0

      My intention is to trigger when the specified error is found during the last 2 hours, and to have it cleared when there is no hit during the last 2 hours. For some databases there have been no messages at all for 10 hours since this trigger fired, yet the trigger is still raised.


      So I changed this one to:
      {template zbx_alertlog:log[{#ALERTLOG},ORA-].str("ORA-12012:",2h)}<>0
      and
      {template zbx_alertlog:log[{#ALERTLOG},ORA-].nodata(2h)}=0

      I still have active alerts where the last data is more than 10h old.
      As I understand it, I wrote two rules in the trigger definition that should clear the trigger after 2 hours without this ORA-12012 message.
      Apparently I am wrong about this.
      How should I correct it?
      Raising the trigger works; clearing it does not when no other data is received.

      For an event that has been raised for 10 hours already, I now get 'can not evaluate function str' (the item has had no data for 10 hours).
      Last edited by ik_zelf; 04-12-2015, 12:00.

      • qix
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Oct 2006
        • 423

        #4
        Are the item values more than 2 hrs old, or are the entries within the logs more than 2hrs old?
        With kind regards,

        Raymond

        • ik_zelf
          Member
          • Feb 2015
          • 60

          #5
          I receive the log entries every minute, so the two are nearly the same. The values were received more than 2 hours ago, and the trigger Age shows about the same value. In the log item I do not parse the Oracle timestamps written in the alert logs, since Oracle uses a format with names for day and month.


          It seems that when no data is received within the given period, the expression
          {template zbx_alertlog:log[{#ALERTLOG},ORA-].str("ORA-12012:",2h)}<>0

          fails to evaluate. How can I fix that?
          Last edited by ik_zelf; 04-12-2015, 13:24.

          • pfouquet
            Junior Member
            • Jan 2012
            • 10

            #6
            Can you try:

            ({TRIGGER.VALUE}=0 and {template zbx_alertlog:log[{#ALERTLOG},ORA-].str("ORA-12012:")}=1)
            or
            ({TRIGGER.VALUE}=1 and {template zbx_alertlog:log[{#ALERTLOG},ORA-].nodata(2h)}=0)

            => Closes if no ORA- line is received in the last 2 hours.
            A tip: use a macro for the closing timeout.
            Last edited by pfouquet; 04-12-2015, 15:23.
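
The two-branch expression in post #6 is a small state machine: one branch opens the trigger, the other keeps it open. A minimal Python sketch of just that boolean logic follows. The function name next_state and its arguments are hypothetical, and how Zabbix actually computes str() and nodata() is not modelled.

```python
def next_state(prev: int, matched: bool, nodata: bool) -> int:
    """Boolean logic of the post #6 expression.

    prev    : current trigger value ({TRIGGER.VALUE}), 0 = OK, 1 = PROBLEM
    matched : str("ORA-12012:") returned 1
    nodata  : nodata(2h) returned 1 (nothing received for 2 hours)
    """
    opens = prev == 0 and matched      # first branch: open on a match
    holds = prev == 1 and not nodata   # second branch: stay open while data arrives
    return 1 if (opens or holds) else 0


print(next_state(0, True, False))   # 1: a match opens the trigger
print(next_state(1, False, False))  # 1: stays open while any ORA- line keeps arriving
print(next_state(1, False, True))   # 0: closes after the silent period
print(next_state(0, False, True))   # 0: stays closed
# Caveat of this logic: if the match still evaluates to 1 after closing
# (for example because the last stored value is the old error line),
# the first branch is true again on the next evaluation.
print(next_state(0, True, True))    # 1
```

Note that the trigger stays in PROBLEM as long as any line matching the item's ORA- filter arrives, not only the specific error.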

            • thiagomz
              Member
              • Jan 2010
              • 74

              #7
              This works for me:

              {AZ_Oracle_Linux_Processes:log["{$ALERT_LOG}","ORA-","UTF-8",20].str(ORA-)}=1 and {AZ_Oracle_Linux_Processes:log["{$ALERT_LOG}","ORA-","UTF-8",20].nodata(10m)}<>1

              • ik_zelf
                Member
                • Feb 2015
                • 60

                #8
                Funny things happen when no logs are coming at all:

                Code:
                ( {TRIGGER.VALUE} = 0 and 
                  {madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)}=1
                ) 
                or 
                ( {TRIGGER.VALUE}=1 and 
                   {madrid:log[/tmp/z.log,ORA-].nodata(5m)}=1
                )
                This fires when an ORA-07445 or ORA-00600 is found in the last 5m.
                It also clears when no log line at all is found in the following 5m.

                But when the internal error is not followed by another log line, the trigger becomes Unknown with "Cannot evaluate function "madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)"."

                I would expect that
                Code:
                ( {TRIGGER.VALUE} = 0 and 
                  {madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)}=1
                ) 
                or 
                ( {TRIGGER.VALUE}=1 and 
                   {madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)}=0
                )
                would work, but this also does not behave as expected. What I expect is a trigger when there is an internal error (07445|00600) in the last 5m, and a clear when no match for this error is found during the following 5m.

                Am I seeing this too simplistically?
                I must admit that I am already surprised that a simple
                Code:
                {madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)}=1
                is not sufficient.

                I also noticed that the trigger evaluation only starts when data is received after the trigger is [re]created. Could be as intended.

                I am testing this with zabbix-3alpha5.
                The item is defined with an interval of 10s. The expression with the nodata trick raises the trigger for 30s and then clears it, which is not what I was hoping for. When there is no match in the specified period AND no data, I get "Cannot evaluate function "madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)"." instead of a cleared trigger.

                Is the bug in me or is the bug in zabbix?

                Item:
                Code:
                Name: log[/tmp/z.log,ORA-]
                Triggers: 3
                Key: log[/tmp/z.log,ORA-]
                Interval: 10s
                History: 90d
                Type: Zabbix agent (active)
                Status: Enabled
                trigger expression:
                Code:
                ({TRIGGER.VALUE} = 0 and {madrid:log[/tmp/z.log,ORA-].regexp(".*(07445|00600).*",5m)}=1) 
                or 
                ({TRIGGER.VALUE}=1 and {madrid:log[/tmp/z.log,ORA-].nodata(5m)}=1)
                input:
                Code:
                echo "`date` ORA-07445 jeetje" >>/tmp/z.log
                and wait 6m.
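
A toy Python timeline of this test may help to see why the trigger is raised for only 30s. It rests on several assumptions that are not confirmed by the thread: the expression is re-evaluated every 30 seconds, regexp(...,5m) returns 1 while the single matching line is inside the 5m window, and nodata(5m) returns 1 once that window has passed. The "Cannot evaluate" state is not modelled, and the function simulate and its parameters are hypothetical names.

```python
def simulate(hold_when_nodata: int, window: int = 300, step: int = 30, until: int = 420):
    """Trigger values over time for one matching line received at t=0.

    hold_when_nodata is the value nodata(5m) is compared with in the
    second branch: 1 as in the expression of post #8, 0 as in post #6.
    """
    state = 0
    states = []
    for t in range(0, until + 1, step):
        in_window = t < window            # is the line from t=0 still inside the window?
        regexp = 1 if in_window else 0    # assumed result of regexp(...,5m)
        nodata = 0 if in_window else 1    # assumed result of nodata(5m)
        opens = state == 0 and regexp == 1
        holds = state == 1 and nodata == hold_when_nodata
        state = 1 if (opens or holds) else 0
        states.append(state)
    return states


print(simulate(hold_when_nodata=1))  # second branch as written in post #8
print(simulate(hold_when_nodata=0))  # second branch with the comparison from post #6
```

In this model the variant with nodata(5m)=1 alternates between PROBLEM and OK at every evaluation while the line is inside the window, which would be consistent with the 30s behaviour described above. The variant with nodata(5m)=0 stays in PROBLEM for the whole window and then clears.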
                Last edited by ik_zelf; 24-12-2015, 19:05.

                • ik_zelf
                  Member
                  • Feb 2015
                  • 60

                  #9
                  The more I am testing with the log monitoring, the more I think it has an error in it.
                  When evaluating the last number of lines (#N), it works as expected.
                  When evaluating a time period, it does not.

                  When evaluating a time period, I would expect that if a pattern like
                  Code:
                  log[{#ALERTLOG},ORA-].regexp(".*(07445|00600):.*",1h)}=1
                  has no match, I simply get trigger value SUCCESS, and that a trigger previously in PROBLEM is cleared.

                  Instead, after the trigger goes to PROBLEM and there is no further match within an hour, the trigger gets
                  Code:
                  "Cannot evaluate function "madrid:log[/oracle/alert_ORCL.log,ORA-].regexp"
                  The trigger also changes state very often, which makes it completely useless. I think Zabbix tries to re-evaluate the trigger and that causes a state change. After a match the trigger went to PROBLEM, which is OK. One hour later it got 'Cannot evaluate', and since then it has changed state 150 times while not a single log line was received. This is weird.

                  This I have in zabbix-4.2 and also in zabbix-3a5.

                  Is this a bug? I happen to think so.
                  Last edited by ik_zelf; 26-12-2015, 13:37.

                  • Tec_Technician
                    Member
                    • Dec 2015
                    • 39

                    #10
                    Hi all!!

                    Hi ik_zelf!

                    I can help with one part of your problem.

                    You are receiving a lot of PROBLEM events because triggers that use the "nodata()" function are recalculated every 30 seconds.

                    I hope this helps a little.

                    About the rest...

                    I have no idea about the "Cannot evaluate function " error on trigger.

                    Good luck with the monitoring ;-).

                    • Neeraj
                      Junior Member
                      • Jan 2025
                      • 1

                      #11
                      Hi ik_zelf,

                      I am new to Zabbix and configured Oracle alert log monitoring as below:
                      :log[/oracle/instance/alert.log,ORA-]

                      When I check Latest data, it does not show any values. I have set the update interval to 1s. I am not sure where exactly the issue is.

                      Thanks, Neeraj
