What can I do about value cache fragmentation?

  • TMA
    Junior Member
    • Jun 2022
    • 3

    #1

    What can I do about value cache fragmentation?

    Cache fragmentation is becoming a real nuisance:

    Code:
    14891:20220629:113807.126 ==================================================
    14891:20220629:113807.168 === memory statistics for value cache size ===
    14891:20220629:113807.168 free chunks of size 24 bytes: 562
    14891:20220629:113807.168 free chunks of size 32 bytes: 1691
    14891:20220629:113807.168 free chunks of size 40 bytes: 899
    14891:20220629:113807.169 free chunks of size 48 bytes: 726
    14891:20220629:113807.169 free chunks of size 56 bytes: 198
    14891:20220629:113807.169 free chunks of size 64 bytes: 1
    14891:20220629:113807.169 free chunks of size 80 bytes: 484
    14891:20220629:113807.169 free chunks of size 88 bytes: 72
    14891:20220629:113807.169 free chunks of size 96 bytes: 117
    14891:20220629:113807.169 free chunks of size 104 bytes: 34
    14891:20220629:113807.170 free chunks of size 112 bytes: 313
    14891:20220629:113807.170 free chunks of size 120 bytes: 54
    14891:20220629:113807.170 free chunks of size 128 bytes: 196
    14891:20220629:113807.170 free chunks of size 136 bytes: 41
    14891:20220629:113807.170 free chunks of size 144 bytes: 51
    14891:20220629:113807.170 free chunks of size 152 bytes: 9
    14891:20220629:113807.171 free chunks of size 160 bytes: 24
    14891:20220629:113807.171 free chunks of size 168 bytes: 4
    14891:20220629:113807.171 free chunks of size 176 bytes: 23
    14891:20220629:113807.171 free chunks of size 184 bytes: 1
    14891:20220629:113807.171 free chunks of size 192 bytes: 26
    14891:20220629:113807.171 free chunks of size 200 bytes: 2
    14891:20220629:113807.171 free chunks of size 208 bytes: 21
    14891:20220629:113807.171 free chunks of size 216 bytes: 9
    14891:20220629:113807.172 free chunks of size 224 bytes: 15
    14891:20220629:113807.172 free chunks of size 232 bytes: 4
    14891:20220629:113807.172 free chunks of size 240 bytes: 2
    14891:20220629:113807.172 free chunks of size 248 bytes: 3
    14891:20220629:113807.172 free chunks of size >= 256 bytes: 133487
    14891:20220629:113807.172 min chunk size: 24 bytes
    14891:20220629:113807.173 max chunk size: 11264 bytes
    14891:20220629:113807.173 memory of total size 524048904 bytes fragmented into 801353 chunks
    14891:20220629:113807.173 of those, 294353752 bytes are in 139069 free chunks
    14891:20220629:113807.173 of those, 229695152 bytes are in 662284 used chunks
    14891:20220629:113807.173 of those, 12821632 bytes are used by allocation overhead
    14891:20220629:113807.173 ================================
    14891:20220629:113807.173 value cache is fully used: please increase ValueCacheSize configuration parameter
    Yes, I see that I should increase the ValueCacheSize parameter (again). But will that prevent about 56% of the cache (the 294353752 free bytes out of 524048904) from being lost to fragmentation?

    We're still running 5.4.9 because upgrading to 6.x seems like a lot of trouble ahead without real gain in value (for our use case). But if the cache fragmentation problem was addressed in 6.x, we would of course have a good reason to try. So, have you had similar problems that you were able to solve with 6.x? Or is increasing the cache the real solution?

    Kind regards, Thomas.
  • cyber
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Dec 2006
    • 4806

    #2
    Just increase the size. I don't think it is really a version-dependent issue; you just have too little memory assigned. I don't know what your value is right now, but double it and see if the error goes away.

    • TMA
      Junior Member
      • Jun 2022
      • 3

      #3
      As the log snippet shows, the cache size is 524 MB. Of this, only 229 MB are used. The rest (294 MB) is free, but the largest free chunk is 11264 bytes. Wouldn't you say there's something wrong by design?
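The proportions behind those figures can be checked with a few lines of Python. This is only a sketch of the arithmetic: the numbers are copied from the memory statistics log in post #1, while the variable names and the `share` helper are made up for illustration.

```python
# Figures copied from the "memory statistics for value cache size" log above.
total_bytes = 524_048_904      # "memory of total size ... bytes"
free_bytes = 294_353_752       # "... bytes are in 139069 free chunks"
used_bytes = 229_695_152       # "... bytes are in 662284 used chunks"
free_chunks = 139_069
max_chunk_bytes = 11_264       # "max chunk size: 11264 bytes"


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


free_pct = share(free_bytes, total_bytes)
used_pct = share(used_bytes, total_bytes)
avg_free_chunk = free_bytes / free_chunks

print(f"free: {free_pct}%  used: {used_pct}%")
print(f"average free chunk: {avg_free_chunk:.0f} bytes")
print(f"largest chunk reported: {max_chunk_bytes} bytes")
```

In other words, more than half of the cache is nominally free, yet it is scattered over roughly 139,000 chunks averaging about 2 KB each.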

      When I restart the server with systemctl (without increasing the cache size), the error goes away, and utilization slowly climbs to about 50% and stays there. But after some days, fragmentation of the free space causes it to complain that the "value cache is fully used", which it is not; fragmentation only makes it appear so. If I double the size, the error will of course go away at first too, but will that prevent fragmentation from growing again to the point where the cache appears to be fully used?

      • cyber
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Dec 2006
        • 4806

        #4
        I am no developer, and I have never stumbled on that kind of error. I'm just trying to think about what could be wrong. As I see it, fragmentation can occur if the server tries to occupy a larger chunk of memory for something but cannot, as there is not enough. So increasing the value can give it enough room to operate, even if the cache is never fully occupied.
        How big is your installation (hosts/items/NVPS)? Maybe it is a bit undersized? Here I run ~4k NVPS and 12k hosts, and the value cache is set to 2G. It seems to be about 10% occupied. :P I have never seen any fragmentation warnings.

        Here's one topic https://www.zabbix.com/forum/zabbix-...used-always-20
        It also mentions a part of the log that should appear just before this "memory statistics" snippet. Is there a particular item mentioned there that would take up a lot of cache (=== most used items statistics for value cache ===)?

        • TMA
          Junior Member
          • Jun 2022
          • 3

          #5
          I agree with you that oversizing the cache would make the problem less visible in the first place and, if there is an algorithm to recombine unused fragments, may help it recombine smaller free chunks into larger ones more often. It greatly depends on that cleanup algorithm though: if it can move used chunks around, it may succeed with very little oversizing (as disk defragmentation tools do). If it is only able to recombine adjacent free chunks, it may (will?) always end up with a free chunk following a used chunk after some time, with no way to recombine free chunks into larger ones.
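To illustrate the second case, here is a toy sketch in Python. It is not Zabbix's allocator; the `Arena` class, its first-fit policy, and the sizes are all assumptions chosen to show what happens when only adjacent free chunks can be merged and used chunks cannot be moved.

```python
class Arena:
    """Toy first-fit allocator that merges adjacent free chunks only."""

    def __init__(self, size):
        # Each chunk is [offset, length, is_free], kept sorted by offset.
        self.chunks = [[0, size, True]]

    def alloc(self, length):
        """Return the offset of a new used chunk, or None if nothing fits."""
        for i, (offset, chunk_len, is_free) in enumerate(self.chunks):
            if is_free and chunk_len >= length:
                self.chunks[i] = [offset, length, False]
                if chunk_len > length:
                    # Keep the unused remainder as a separate free chunk.
                    rest = [offset + length, chunk_len - length, True]
                    self.chunks.insert(i + 1, rest)
                return offset
        return None

    def free(self, offset):
        """Mark the used chunk at offset as free, then merge free neighbours."""
        for chunk in self.chunks:
            if chunk[0] == offset and not chunk[2]:
                chunk[2] = True
                break
        else:
            raise ValueError("no used chunk at that offset")
        merged = []
        for chunk in self.chunks:
            if merged and merged[-1][2] and chunk[2]:
                merged[-1][1] += chunk[1]   # adjacent free chunks: combine
            else:
                merged.append(chunk)
        self.chunks = merged

    def free_total(self):
        return sum(c[1] for c in self.chunks if c[2])

    def largest_free(self):
        return max((c[1] for c in self.chunks if c[2]), default=0)


arena = Arena(100)
offsets = [arena.alloc(10) for _ in range(10)]   # fill the arena completely
for off in offsets[::2]:                          # free every other chunk
    arena.free(off)

free_before = arena.free_total()      # half of the arena is free ...
largest_before = arena.largest_free() # ... but only in 10-byte pieces
failed = arena.alloc(20)              # so a 20-byte request cannot be served

arena.free(offsets[1])                # releasing one neighbour merges 3 chunks
retry = arena.alloc(20)

print(free_before, largest_before, failed, retry)
```

In this sketch, half of the arena is free but no request larger than 10 bytes succeeds until a used chunk between two free ones happens to be released. A larger arena only delays reaching that state; it does not change the mechanism.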

          The link you mentioned shows a smaller installation with a 32 MB cache where the fragmentation effect already shows at only 20% used chunks, which supports your point that increasing the cache may help to hide (i.e. not solve) the problem. The most used items statistics are what the name says: they list the chunks which had the most cache hits in the past (i.e. a good thing), not the size of these chunks.

          Well, because this is obviously not new, I assume the developers are aware of cache fragmentation and do not consider it a problem, because it can be solved by throwing lots of (ultimately unused) RAM at it. I was just hoping that it was among the things improved for 6.x, because (to me at least) that would be a critical to-do item if 6.x is all about being enterprise ready ...
