CacheSize and other Tuning
  • bensode
    Junior Member
    • May 2019
    • 8

    #1

    CacheSize and other Tuning

    I have Zabbix 4.2.4 running on a system with 8 cores and 4GB of RAM, with about 20% RAM usage on average and negligible CPU utilization. I have over 200 hosts and about 18,000 items -- I plan on growing the scope of monitoring, but it seems to be slowing down. I've done some web searching and made adjustments to settings in the conf, like increasing CacheSize, etc. If I set CacheSize above 1G the service won't start, and there's an error in the log that the memory failed to initialize. What information/log details can I provide to help increase the cache size values within Zabbix? How can I tweak the settings to reduce the Zabbix internal process busy %? It floats around 80% most of the day. I'd like to set all of the configs back to default and start over with tuning to get a better understanding.
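The "memory failed to initialize" error when CacheSize goes above 1G is often not a Zabbix limit but the kernel's shared-memory ceiling (kernel.shmmax / kernel.shmall), since the caches are allocated as shared memory. A sketch of the relevant zabbix_server.conf parameters -- the values below are illustrative starting points, not tuning recommendations:

```
# zabbix_server.conf -- cache sizes (illustrative values, not tuned advice)
CacheSize=2G            # configuration cache (host/item/trigger metadata)
HistoryCacheSize=256M   # buffer for collected values before the DB write
ValueCacheSize=512M     # values kept in memory for trigger evaluation
TrendCacheSize=64M      # hourly trend aggregation buffer
```

If the service still refuses to start with a larger CacheSize, checking `sysctl kernel.shmmax` and raising it above the largest requested cache segment is worth trying before reverting everything to defaults.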

    Bensode
  • kloczek
    Senior Member
    • Jun 2006
    • 1771

    #2
    CacheSize is the size of the memory region where all metadata about the current monitoring configuration is cached.
    The more metrics and/or triggers you have, the bigger CacheSize needs to be.
    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
    https://kloczek.wordpress.com/
    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
    My zabbix templates https://github.com/kloczek/zabbix-templates

    Comment

    • db100
      Member
      • Feb 2023
      • 61

      #3
      Is the CacheSize intended to be per host, or in total?

      I am facing a situation where a container running my Zabbix server instance (1000 hosts) is showing a total RAM usage of about 700MB, but since it was constrained to a limit of 250MB, I am assuming that most of the allocated RAM is actually cache.

      From my understanding of the CacheSize parameter, I would expect the max amount of RAM to be slightly above 250MB (including cache), but definitely not 700MB (btw, this value is ever growing...).

      A quick run of the statistics shows pretty much empty results, apart from this one:

      ```
      zabbix_server -R diaginfo=valuecache
      == value cache diagnostic information ==
      Items:2553 values:6789 mode:0 time:0.000768
      Memory:
      size: free:7843624 used:468248
      chunks: free:105 used:4668 min:32 max:7792424
      ```
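A quick way to turn that diaginfo output into a utilization percentage (a small sketch, assuming the `size: free:N used:M` line format shown above):

```python
import re

def cache_utilization(diaginfo: str) -> float:
    """Return used/(free+used) for the 'size: free:N used:M' line of
    `zabbix_server -R diaginfo` output."""
    m = re.search(r"size:\s*free:(\d+)\s+used:(\d+)", diaginfo)
    if not m:
        raise ValueError("no 'size: free:... used:...' line found")
    free, used = int(m.group(1)), int(m.group(2))
    return used / (free + used)

sample = """\
== value cache diagnostic information ==
Items:2553 values:6789 mode:0 time:0.000768
Memory:
size: free:7843624 used:468248
chunks: free:105 used:4668 min:32 max:7792424"""

print(f"value cache utilization: {cache_utilization(sample):.1%}")  # ~5.6%
```

So for the output above the value cache itself is only a few percent used, which matters for the next exchange in this thread.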

      And the server health dashboard shows 91% usage of the configuration cache, but I am not sure what the total absolute value is:

      [screenshot: server health dashboard, configuration cache at 91%]
      So my questions here are basically 3:
      • how can I impose a hard limit on the total allocated memory of the Zabbix server (container)?
      • is there a way to periodically clear the cache (housekeeping seems to do nothing here)?
      • is it possible to limit the memory usage on the DB side as well? I am using postgres and I can see that the DB reaches a high level of memory usage (1+GB), which sinks back to about 100MB exactly every hour. I am not sure why, but this does not happen at the same time as housekeeping.
      INFO:
      I am running Zabbix 6.4 and using all default server conf parameters (I have only reduced the number of started processes to "1" for all of them: pollers, trappers, etc.). I am using postgres/timescale/patroni as a backend.
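On the first question, a hard memory cap on the container is normally imposed through the pod spec rather than through Zabbix itself. A minimal sketch, assuming a plain Kubernetes deployment (the names, image tag, and values are placeholders):

```
# pod spec fragment -- names and values are placeholders
containers:
  - name: zabbix-server
    image: zabbix/zabbix-server-pgsql   # tag omitted; pick your version
    resources:
      requests:
        memory: "512Mi"
      limits:
        memory: "1Gi"   # cgroup hard limit; the kernel OOM-kills on breach
```

Note that with such a limit, page cache still counts toward the cgroup's usage but is reclaimed under pressure, which is one reason reported usage can sit well above the working set without the container being killed.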

      Comment

      • cyber
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Dec 2006
        • 4807

        #4
        Cache size is a total value, not per host... it is for the server to hold config data... If you have default settings it's ~32M, I think...? If it's already 91% filled, you should increase that number...

        Don't mix up different caches... you ask about the value cache (zabbix_server -R diaginfo=valuecache) and then complain about the config cache being 91%... No, your value cache says on that same pic 6.4%...

        You do have limits set in the server config... all those ValueCacheSize, CacheSize etc. values there... Your server will crash if it tries to allocate more than allowed...

        These caches are needed to perform all the tasks. Clearing them would be harmful. For example, the value cache holds values to calculate your triggers... it is faster than doing selects from the DB each time. DB queries are done only if you don't have all the needed values in cache. The "value cache misses" parameter measures it...
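Those cache statistics can also be tracked continuously as Zabbix internal items instead of running diaginfo by hand; a few of the standard keys (as documented for recent Zabbix versions):

```
# configuration cache, free space in percent (alert when it gets low)
zabbix[rcache,buffer,pfree]
# value cache, free space in percent
zabbix[vcache.buffer,pfree]
# value cache effectiveness: hits vs misses
zabbix[vcache.cache,hits]
zabbix[vcache.cache,misses]
```

A steadily rising misses counter is the usual sign that ValueCacheSize is too small, well before anything crashes.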

        Comment


        • db100
          db100 commented
          Editing a comment
          thanks for the clarification, however I think I am still missing something. Basically the 3 problems I am facing are:

          1. high value of config cache usage
          2. high value of database memory usage
          3. high value of memory usage of zabbix-server (according to the kubernetes metrics server)

          on problem number 1: is it possible to "tell" Zabbix not to allocate cache for some hosts? Or maybe to simply allocate less cache per host, so that I don't necessarily have to increase the overall CacheSize (config cache) value?

          on problem number 2:
          I would assume here that, since I am not having any trouble with the value cache (no cache misses, 6.4% utilization), the Zabbix server finds all the needed data inside the cache and does not need to query the DB, so basically most of the queries are just INSERTs. I am currently processing something like 30 values/second, so is it normal that DB usage jumps up to 1GB+ of memory? And this is also true for the replica (I am using patroni)... so I am wondering if the problem here is the streaming replication itself, or rather some misconfiguration on the Zabbix server.

          on problem number 3:
          after reading your answer I understand that the *CacheSize configurations are sort of "hard boundaries" and the Zabbix server won't come up if those values are exceeded, right? Then how come the kubernetes metrics server measures 700+MB memory usage even though the cache configurations are at their default values (which are quite low)? Is there any other "hidden cache" that I am not aware of?
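On the DB side (problem 2), most of a postgres instance's memory footprint is bounded by a handful of settings; a sketch of the knobs usually involved -- the values are illustrative, not tuning advice:

```
# postgresql.conf -- main memory-related settings (illustrative values)
shared_buffers = 256MB         # shared page cache; the largest fixed consumer
work_mem = 4MB                 # per sort/hash operation, per backend -- multiplies
maintenance_work_mem = 64MB    # VACUUM, CREATE INDEX
max_connections = 50           # each backend adds its own overhead on top
```

The per-backend nature of work_mem means reported usage can swing with query activity even when the configured values look small.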
      • db100
        Member
        • Feb 2023
        • 61

        #5
        > So my questions here are basically 3:
        > • how can I impose a hard limit on the total allocated memory of the Zabbix server (container)?
        > • is there a way to periodically clear the cache (housekeeping seems to do nothing here)?
        > • is it possible to limit the memory usage on the DB side as well? I am using postgres and I can see that the DB reaches a high level of memory usage (1+GB), which sinks back to about 100MB exactly every hour. I am not sure why, but this does not happen at the same time as housekeeping.
        As a reference to point number 3 above, this is the kind of periodic memory oscillation in postgres I am referring to. The drops happen very sharply every hour on both replica and master, but the replica seems to follow the same pattern with some lag, maybe about 20 minutes.

        Has anyone ever seen something like that?



        [screenshot: postgres memory usage chart showing sharp hourly drops]

        Comment

        • cyber
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional
          • Dec 2006
          • 4807

          #6
          I am trying to keep myself away from anything "container", it is usually only headache...
          But...
          1. high value of config cache usage
          There's nothing to do but increase the value... if you set it to 64M instead of 32M, your trigger most probably goes away... There is no option to tell Zabbix not to keep the config of some hosts... What would be the point? All of it is needed to perform all the duties; there is no "allocating less memory per host"... You just cannot have half of the config... At some point you will have more hosts and items, and then the config simply does not fit into memory (cache) and the server won't start up, or crashes... But the server also has to work with those values, so the server's memory usage is not limited to just the caches summarized... Caches are there to speed up processing, as accessing memory is faster than doing queries to the DB...
          2. high value of database memory usage
          I am not a DBA, but if you have just 1+GB memory usage... it's basically nothing... IMHO your whole 4GB of memory is too little for running all of this... Or maybe I am just spoiled... :P

          3. high value of memory usage of zabbix-server (according to the kubernetes metrics server)
          Again, container stuff, I am not going there, I don't know anything about it... But all of that, 4GB is too little to run it all, I think.

          Memory spikes in the DB every hour... Just guessing... something with replication? But I am no DBA...

          Comment


          • db100
            db100 commented
            Editing a comment
            > 1. high value of config cache usage

            Alright, I got this point. Check.

            > I am not a DBA, but If you have just 1+G memory usage... its basically nothing..

            Well, that depends on the data ingestion frequency anyway (IMO). Thing is, I would agree with you in general, but from the chart I have shared it seems to me that the DB should in theory be able to operate at a lower memory level, because every hour the memory goes down to much lower levels and the system works perfectly. Let's see if anyone else out there with experience with timescale and patroni can support us.

            Also, it is worth mentioning that the number of processed values per second stays quite constant and does not correlate with the sudden drops in memory on the DBs.

            > But all of that 4G is too little amount to run all of it, I think.

            Again, I have got a couple of Zabbix server instances running in containers, and I see a quite beautiful and constant memory usage of about 130 MB... and it stays like that for many days. Only one instance (the one with more hosts than items) shows this strange behavior of slowly but ever increasing memory usage... but the container does not get killed, so it should be cache. This looks to me like a memory leak; if you wish I could provide more details, just let me know what exactly.
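One thing worth checking before calling it a leak: the kubernetes memory metric is usually the cgroup working set, which includes reclaimable page cache, not just the process RSS. A small sketch (assuming the cgroup v1 memory.stat format; the numbers are made up for illustration) of splitting the reported usage:

```python
def parse_memory_stat(stat_text: str) -> dict:
    """Parse cgroup v1 memory.stat content into {counter: bytes}."""
    fields = {}
    for line in stat_text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            fields[key] = int(value)
    return fields

# example memory.stat content -- made-up numbers for illustration
sample = """\
cache 576716800
rss 134217728
mapped_file 4194304"""

stats = parse_memory_stat(sample)
print(f"rss:   {stats['rss'] / 2**20:.0f} MiB")    # actual process memory
print(f"cache: {stats['cache'] / 2**20:.0f} MiB")  # reclaimable page cache
```

If `rss` stays flat while `cache` grows, the "ever growing" usage is filesystem cache rather than a leak.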
        • cyber
          Senior Member
            Zabbix Certified Specialist, Zabbix Certified Professional
          • Dec 2006
          • 4807

          #7
            > if you wish I could provide more details, just let me know what exactly

            No need... I have no idea what to ask for.
            Shouldn't memory leaks keep leaking until all memory is exhausted and no further leak is possible? They should not deflate on a regular basis...

            I have a gut feeling that this memory usage issue is something related to the DB: replication, vacuuming...? Something that the primary does first and the secondary replicates some time later?
            In such a small instance, I don't know if you even gain anything by using timescale... it just adds complexity.

          Comment
