Random Web Scenario problems happening - need performance tuning?

  • Maxburn
    Member
    • Sep 2019
    • 54

    #1

    Off and on I'm having trouble with a LOT of Web Scenario tests failing at once. For example, all was well this week until today, when many tests started failing at the same time. (Edit: this has happened before, but until today I hadn't been able to log in and diagnose it while it was happening.) Some facts:
    1. When this happens, if I SSH in fast enough and curl the same URL the web test is using, it might take 20 seconds to come back. I don't understand why.
    2. During that time I can't reproduce the delay from other computers on the same ISP; the page loads instantly in a browser.
    3. I have a couple of the same sites in UptimeRobot and it doesn't see these problems.
    4. I also get "Zabbix http poller processes more than 75% busy" alerts at the same time.
    5. Increased StartHTTPPollers from 5 to 10 this morning, no change.
    6. About 130 web page checks at the default 1 minute interval. Too much for this server, do you think?
    7. The trigger fires when the average of the last 5 tests is greater than 0.2: avg(/servername/web.test.fail[webtestname],#5)>.2
    8. Majority of sites being monitored are in Cloudflare, and I have a WAF rule bypassing other rules for this and some other monitoring servers.
    9. HTOP looks fine, not out of memory or CPU
    10. iotop looks good, nothing slamming the disk for IO
    11. iftop looks ok too, like it's no sweat on network traffic below.
    12. Edit: while it is happening, SSH is as quick as always across the local LAN.
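
The trigger in point 7 can be checked by hand. As far as the Zabbix documentation describes it, web.test.fail returns the number of the failed step (0 when every step succeeds), so the average depends on which step failed and not only on how many runs failed. Below is a minimal Python sketch of that arithmetic. The helper name `trigger_fires` and the sample values are made up for illustration and are not taken from this server.

```python
def trigger_fires(last_values, threshold=0.2):
    """Mimic avg(..., #5) > threshold over the most recent values."""
    window = last_values[-5:]
    if not window:
        return False
    return sum(window) / len(window) > threshold


# One failure at step 1 in the last five runs: average is exactly 0.2.
print(trigger_fires([0, 0, 0, 0, 1]))
# Two failures at step 1: average is 0.4.
print(trigger_fires([0, 1, 0, 0, 1]))
# One failure at step 2: average is 0.4.
print(trigger_fires([0, 0, 0, 0, 2]))
```

Under these assumptions a single step-1 failure in five runs does not trip the trigger, while two failures, or one failure at a later step, do.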


    Server is Ubuntu 22.04.5 LTS
    Zabbix is 7.0.22 LTS and happened on previous version or two as well.
    VM has 4 CPU and 4GB of memory. MySQL db.
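
For the 20-second curl in point 1, it may help to see which phase of the request is slow when run from this server: name lookup, TCP connect, TLS handshake, or waiting for the first byte. The following is a rough sketch using only the Python standard library. The function name `time_phases` and its parameters are hypothetical, and it only looks at the first address returned by the resolver.

```python
import socket
import ssl
import time


def time_phases(host, port=443, path="/", use_tls=True, timeout=30.0):
    """Return seconds spent in DNS, TCP connect, TLS handshake and first byte."""
    timings = {}

    # Resolve the name ourselves so DNS time is measured separately.
    start = time.perf_counter()
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    timings["dns"] = time.perf_counter() - start

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        start = time.perf_counter()
        sock.connect(addr)
        timings["connect"] = time.perf_counter() - start

        if use_tls:
            start = time.perf_counter()
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
            timings["tls"] = time.perf_counter() - start

        start = time.perf_counter()
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        sock.recv(1)  # block until the first response byte arrives
        timings["first_byte"] = time.perf_counter() - start
    finally:
        sock.close()
    return timings
```

Calling `time_phases("example.com")` (the hostname is a placeholder) while the problem is happening would show whether the time goes to DNS, the connection, or the server response. curl can report similar numbers with its `-w` option, using variables such as `time_namelookup`, `time_connect`, `time_appconnect` and `time_starttransfer`.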

    [Attached screenshots: Screenshot 2026-01-28 135459.png, WebPageCheckFail.png, Screenshot 2026-01-28 132319.png]
    Last edited by Maxburn; 28-01-2026, 20:56.
  • cyber
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Dec 2006
    • 4832

    #2
    Maybe I am overreacting and am just not used to such small instances. Is it an all-in-one host? If so, add some more resources to that host. Then you can try adding some more HTTP pollers as well.

    • Maxburn
      Member
      • Sep 2019
      • 54

      #3
      Originally posted by cyber
      Maybe I am overreacting and am just not used to such small instances. Is it an all-in-one host? If so, add some more resources to that host. Then you can try adding some more HTTP pollers as well.
      Yesterday I doubled "StartHTTPPollers" from 5 to 10. Did I change the wrong setting?

      VM has 4 CPU and 4GB of memory. MySQL db. Doesn't appear to be running out of resources per htop, iotop, iftop. I'm fine throwing a little more RAM at it.

      It's "only" monitoring ~100 servers and ~130 web pages, certainly not huge.
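
A back-of-the-envelope estimate can show whether ~130 checks per minute is a lot for 10 HTTP pollers. This is a simplified model, not how Zabbix itself computes the busy percentage: it assumes one step per scenario, ignores scheduling overhead, and the 0.5 second "normal" check time is an assumed figure. The 20 second figure is the slow curl time reported in post #1. The function name `poller_busy_fraction` is made up.

```python
def poller_busy_fraction(checks, interval_s, seconds_per_check, pollers):
    """Rough share of poller time consumed, ignoring scheduling overhead."""
    demand = checks * seconds_per_check  # poller-seconds needed per interval
    capacity = pollers * interval_s      # poller-seconds available per interval
    return demand / capacity


# 130 checks a minute on 10 pollers, assuming 0.5 s per check
print(f"{poller_busy_fraction(130, 60, 0.5, 10):.0%}")
# the same load when each check takes 20 s
print(f"{poller_busy_fraction(130, 60, 20, 10):.0%}")
```

Under these assumptions the pollers are around 11% busy when checks are fast and far beyond 100% when each check takes 20 seconds, which would suggest the busy-poller alerts are a symptom of slow responses and not of too few pollers.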

      [Attached screenshot: Screenshot 2026-01-29 100940.png]

      • cyber
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Dec 2006
        • 4832

        #4
        Yes, of course you did the right thing by adding some HTTP pollers. I would experiment with a few more. You can always go back to the previous values if it does not help.
        As I said, I am not used to such a small all-in-one instance, so I might be overreacting about resources.

        • Maxburn
          Member
          • Sep 2019
          • 54

          #5
          Between increasing the HTTP pollers to 10 and cleaning up some web checks that I know are failing and won't come back, it seems to have settled down now.

          Edit: spoke too soon. This only seems to happen in the middle of the day. I'm wondering if this might be a Cloudflare issue.
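
One way to confirm the "middle of the day" pattern would be to count failure events per hour of day. The sketch below assumes the failure times have already been exported as ISO 8601 strings (for example from the problem history). The function name `failures_by_hour` is hypothetical and the sample timestamps are invented purely for illustration, not taken from this server.

```python
from collections import Counter
from datetime import datetime


def failures_by_hour(timestamps):
    """Count failure events per hour of day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


# Made-up sample timestamps, for illustration only.
sample = [
    "2026-01-28T12:05:00",
    "2026-01-28T12:40:00",
    "2026-01-28T13:15:00",
    "2026-01-29T12:20:00",
    "2026-01-29T03:10:00",
]
for hour, count in sorted(failures_by_hour(sample).items()):
    print(f"{hour:02d}:00  {count}")
```

If real data clusters in the same hours across several days, that would point toward something time-dependent outside the Zabbix server, such as load on the monitored sites or rate limiting upstream.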
          Last edited by Maxburn; Yesterday, 19:28.
