Designing "checks", best practices

  • coreychristian
    Senior Member
    Zabbix Certified Specialist
    • Jun 2012
    • 159

    #16
    Originally posted by steveboyson
    Nope, since trapper items have no interval, Zabbix does not know in which time period values were expected and thus cannot fire .nodata triggers.
    Pretty obvious, I might think.
    I did a quick test, and I think they changed this, so items with no time interval now use the zabbix timer process.

    Doing a quick test of a trapper item on our dev system (version 2.2.1), the following trigger expression fired, though it was delayed by about 20 seconds.

    Code:
    {Zabbix server:test.trapper.nodata(60)}=1
    This is a somewhat recent change though, I think it was done sometime between 2.0 and 2.2 if I recall correctly.
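As a rough illustration of what the nodata(60) expression above checks, here is a minimal Python sketch. It is a simplified model and not Zabbix's actual implementation; the function name and the timestamps are made up for the example. Because the result depends on the clock and not on incoming data, such a check has to be re-evaluated periodically, which would be consistent with the delay of about 20 seconds seen in the test.

```python
import time


def nodata(last_value_ts, period, now=None):
    """Simplified model of a nodata(period) check (hypothetical helper).

    Returns 1 if no value arrived within the last `period` seconds, else 0.
    `last_value_ts` is the Unix timestamp of the newest value, or None if
    the item has never received one.
    """
    if now is None:
        now = time.time()
    if last_value_ts is None:
        return 1
    return 1 if (now - last_value_ts) > period else 0


if __name__ == "__main__":
    # Last value arrived 30 seconds ago: still within the 60 second window.
    print(nodata(last_value_ts=1000, period=60, now=1030))
    # Last value arrived 61 seconds ago: the window has expired.
    print(nodata(last_value_ts=1000, period=60, now=1061))
```

A trigger built on this would compare the result with 1, just like the `=1` in the expression above.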

    • ibtanhe
      Junior Member
      • Feb 2014
      • 17

      #17
      Originally posted by steveboyson
      You might have noticed that querying a vSphere datastore via the Perl or Python API is a time & resource consuming task.

      Therefore I doubt you can handle that in a single item call. At least in our vSphere environment a single call lasts up to 45 seconds per ESX host which is 15 seconds longer than the configurable maximum agent or server timeout value.
      Very true. Thus, my concerns.

      Originally posted by steveboyson
      We perform periodic checks on the ESX hosts (running on the vMA against our vcenter server via a cron job), store the gathered values in a parseable text file, send that text file to zabbix and let the zabbix server do the parsing and item delivery stuff.
      We check the filedate of that file and fire up a trigger if it is older than $NUMBER of minutes so we have control over the work flow.

      Of course, this periodic cron job could emit the values directly via trapper items to zabbix. But as mentioned before, they have no .nodata triggers.
      Yeah, so basically, you're using a non-trapper item to check that the trapper items are populated in due time (by checking the file age), sort of.
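The file-age check described in the quote could look roughly like this in Python. This is only a sketch: the function name, the path and the threshold are hypothetical, since the original setup is not shown in the thread.

```python
import os
import time


def file_is_stale(path, max_age_minutes, now=None):
    """Return True if `path` is missing or older than `max_age_minutes`.

    Hypothetical helper: a missing file is treated as stale, on the
    assumption that the cron job never produced any output.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True
    return (now - mtime) > max_age_minutes * 60


if __name__ == "__main__":
    # Example path only; replace it with wherever the cron job writes its results.
    print(file_is_stale("/tmp/vsphere_datastores.txt", max_age_minutes=10))
```

The boolean result could be exposed as 0/1 through an ordinary polled item, so that a trigger fires when the collection job stops updating the file.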

      I've grown pretty allergic to temporary-files over the years though and I'd like to avoid it if possible. Can I use another item/trigger to verify that data is being collected for certain trapper-items I wonder..?

      • ibtanhe
        Junior Member
        • Feb 2014
        • 17

        #18
        Originally posted by aib
        In NAGIOS you still have some scripts which request and check all datastores. Right?
        And you create only one trigger on FrontEnd side.
        1) Nobody can stop you from using the same script as UserParameter and create one Item to show the script result.
        Then you can create One trigger for One Item and - Profit!
        I see where you're going but building a trigger based solely on the output of a check (without considering the exitcode) seems not far off from harakiri.

        Originally posted by aib
        2) In Zabbix you can also create as many Items as you need and as many Triggers as you need.
        Also you can create one MEGA-trigger which will check all datastores threshold and switch only if you have any problem.
        It will fully emulate your Nagios behavior.
        The MEGA-trigger idea might seem tempting, but I have yet to figure out how to define "Type of information" for a corresponding item. How do I both say something is wrong AND what the problem is? That really is the Nagios way, which doesn't really fit Zabbix.

        Originally posted by aib
        One more great thing is that you can create a Template which will be automatically assigned to Discovered Hosts, so you collect only the information you qualify as important for this type of guest VM.
        (for example, for Windows DB server - some metrics of DB can be included into Windows DB template; for LAMP VM - some metrics for Mysql/Apache/OS can be included into LAMP template)
        Sounds like a really cool feature! Have not really dug into Discovered Hosts yet, don't know how that could be implemented in our environment.

        Originally posted by aib
        Sorry, I like Zabbix so much that I cannot stop talking about it.
        Please don't hesitate to ask more questions and I will do my best to answer them.
        Thanks! I've gotta say, I love your enthusiasm!
        Last edited by ibtanhe; 20-03-2014, 18:00. Reason: speling :)

        • aib
          Senior Member
          • Jan 2014
          • 1615

          #19
          Originally posted by ibtanhe
          I see where you're going but building a trigger based solely on the output of a check (without considering the exitcode) seems not far off from harakiri.
          From my point of view, creating a script which outputs correct data with a wrong exit code is not a good idea.

          The MEGA-trigger idea might seem tempting, but I have yet to figure out how to define "Type of information" for a corresponding item. How do I both say something is wrong AND what the problem is? That really is the Nagios way, which doesn't really fit Zabbix.
          In Nagios environment you also can create one item per check.
          But you decide to collect all information about one human in one place (jk)
          And now you have to distinguish WASP(rotestant) from WASC(atholics). And for that purpose you try to invent a bicycle with square wheels.
          Sorry, I don't want to offend you. Just my opinion.

          But anyway - I'm glad to see that you already found some interesting things about Zabbix & VM monitoring.
          Good luck!
          Last edited by aib; 20-03-2014, 18:18.
          Sincerely yours,
          Aleksey

          • steveboyson
            Senior Member
            • Jul 2013
            • 582

            #20
            Originally posted by coreychristian
            I did a quick test, and I think they changed this, so items with no time interval now use the zabbix timer process.

            Doing a quick test of a trapper item on our dev system (version 2.2.1), the following trigger expression fired, though it was delayed by about 20 seconds.

            Code:
            {Zabbix server:test.trapper.nodata(60)}=1
            This is a somewhat recent change though, I think it was done sometime between 2.0 and 2.2 if I recall correctly.
            Ok, this is new to me. So I correct myself and pretend that trapper items can have evaluated .nodata triggers. Thanks for pointing that one out, I'm still learning.

            • coreychristian
              Senior Member
              Zabbix Certified Specialist
              • Jun 2012
              • 159

              #21
              Originally posted by steveboyson
              Ok, this is new to me. So I correct myself and pretend that trapper items can have evaluated .nodata triggers. Thanks for pointing that one out, I'm still learning.
              np, as I said I believe this is pretty new, I don't even recall it being active in the initial 2.0 release.

              And same on the learning: I had never thought to send a 'ZBX_UNSUPPORTED' intentionally before. I will have to play around with that, it seems it could be very useful.

              • ibtanhe
                Junior Member
                • Feb 2014
                • 17

                #22
                Originally posted by aib
                From my point of view, creating a script which outputs correct data with a wrong exit code is not a good idea.
                This makes no sense to me.

                Originally posted by aib
                In Nagios environment you also can create one item per check.
                But you decide to collect all information about one human in one place (jk)
                And now you have to distinguish WASP(rotestant) from WASC(atholics). And for that purpose you try to invent a bicycle with square wheels.
                Sorry, I don't want to offend you. Just my opinion.
                Excuse my ignorance, but jk/WASP/WASC?

                You can create one check per item in Nagios as well, just like you say, but it's suicide performance-wise. The only real upside of doing that in Nagios is granularity. Say you have one "MEGA-check" that checks all filesystems on a host, and one of the filesystems fills up. You cannot tell Nagios to stop sending messages about that (scheduled downtime, ack, disable notifications), because you would then also not get notified if one of the other filesystems fills up, since they're monitored by the same check.
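To illustrate the difference, here is a small Python sketch. The names and numbers are invented, and it uses neither Nagios nor Zabbix APIs. Treating every filesystem as its own item lets one known problem be acknowledged without hiding the others, which a single combined check cannot do.

```python
def open_problems(usage_pct, threshold, acknowledged=()):
    """Return the filesystems at or above `threshold` percent usage.

    Filesystems listed in `acknowledged` are skipped, which models
    acknowledging a single per-filesystem problem. All names here are
    hypothetical and only serve the example.
    """
    return sorted(
        fs
        for fs, pct in usage_pct.items()
        if pct >= threshold and fs not in acknowledged
    )


if __name__ == "__main__":
    usage = {"/": 95, "/var": 97, "/home": 40}
    # Nothing acknowledged: both full filesystems are reported.
    print(open_problems(usage, threshold=90))
    # "/" acknowledged: "/var" is still reported on its own.
    print(open_problems(usage, threshold=90, acknowledged={"/"}))
```

With a single combined check there is only one state to silence, so acknowledging the "/" problem would also silence "/var".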

                • gavind
                  Member
                  • Mar 2013
                  • 59

                  #23
                  Hi Corey, any luck on this one yet? I was hoping that you found a workaround.