Agent2 fails retrieving data for ONE filesystem

  • Moebius
    Member
    • Dec 2022
    • 43

    #1

    Agent2 fails retrieving data for ONE filesystem

    Hello. Beginner with Zabbix here.
    The server is Zabbix 6.4.4 and all agents are 6.4.4 as well.

    On an Oracle Linux 8.8 host I uninstalled Agent and installed Agent2 (keeping the standard Linux template), because this VM hosts an Oracle database that I will soon have to monitor.
    After switching, the filesystems are all discovered, but the data for ONE filesystem cannot be retrieved anymore:


    [Image: immagine.png]

    Only one filesystem fails, there are no problems with the others.

    The message is:
    Preprocessing failed for: [{"fsname":"/sys","fstype":"sysfs","bytes":{"total":0,"free":0,"used":0,"pfree":0,"pused":0},"ino...
    1. Failed: cannot extract value from json by path "$.[?(@.fsname=='/BACKUP')].first()": no data matches the specified path
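
The failing step is a JSONPath lookup on the master item's output: the dependent item keeps only the array entry whose fsname is /BACKUP. A minimal Python sketch of that lookup, assuming the master item returns a JSON array of filesystem objects, shows why a missing entry yields this kind of error. It is not Zabbix's implementation; the function name `first_by_fsname` and the "/boot" sample entry are made up for illustration.

```python
import json


def first_by_fsname(master_json: str, fsname: str) -> dict:
    """Rough equivalent of $.[?(@.fsname=='<fsname>')].first() (illustrative only)."""
    entries = json.loads(master_json)
    matches = [e for e in entries if e.get("fsname") == fsname]
    if not matches:
        # Comparable to "no data matches the specified path" in the error above.
        raise LookupError(f"no data matches fsname {fsname!r}")
    return matches[0]


# Made-up, abbreviated master-item output that lacks a /BACKUP entry.
sample = '[{"fsname":"/sys","fstype":"sysfs"},{"fsname":"/boot","fstype":"xfs"}]'

print(first_by_fsname(sample, "/sys")["fstype"])
try:
    first_by_fsname(sample, "/BACKUP")
except LookupError as err:
    print("lookup failed:", err)
```

In other words, the error does not mean the JSON is malformed, only that the array contains no object for that filesystem at that poll.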

    Out of curiosity I reinstalled the Agent (not Agent2) on another port, and the data for that filesystem are immediately retrieved without a problem, even with the two agents coexisting:


    [Image: immagine.png]

    What could the cause possibly be?
  • Markku
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
    • Sep 2018
    • 1782

    #2
    The filesystem items get their data from the "Get filesystem data" master item, and you can check the data manually.

    Go to the OL8 host, and use these commands:

    /usr/sbin/zabbix_agentd -t vfs.fs.get
    /usr/sbin/zabbix_agent2 -t vfs.fs.get
    (fix the paths if they are not correct in Oracle Linux, and you may need to also give it an argument for the config file location)

    Then compare the JSON outputs of the two commands to see how their entries for the "/BACKUP" filesystem differ.

    You may want to use some JSON interpreter/prettyprinter with the output, for example "jq ." (if installed), "python3 -m json.tool", or just a text editor like VSCode that can format the document, but you will need to manually edit the output first by removing the extra non-JSON characters "vfs.fs.get [s|" from the start and the last "]" from the end.

    Markku
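
The manual clean-up described above (dropping the "vfs.fs.get [s|" prefix and the last "]") can also be scripted. This is a minimal sketch, assuming the test output has exactly the shape described in the post; the function name `extract_json` and the shortened sample line are illustrative, not real agent output.

```python
import json


def extract_json(test_output: str):
    """Strip the 'vfs.fs.get [s|' prefix and the final ']' and parse the rest as JSON."""
    text = test_output.strip()
    _, marker, rest = text.partition("[s|")
    if not marker or not rest.endswith("]"):
        raise ValueError("unexpected output format")
    return json.loads(rest[:-1])


# Shortened, made-up example of the output shape described in the post.
raw = 'vfs.fs.get [s|[{"fsname":"/BACKUP","fstype":"ext4"}]]'
print(json.dumps(extract_json(raw), indent=2))
```

The parsed result is an ordinary Python list of dicts, so it can be pretty-printed or compared without hand-editing the text.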


    • Moebius
      Member
      • Dec 2022
      • 43

      #3
      Thank you Markku. I ran both commands (had to sudo zabbix_agent2) and the output is almost identical with only the last part different.

      agentd:
      Code:
      {"fsname":"/mnt/tcposser/c/TCPOS.INFINITY","fstype":"cifs","bytes":{"total":274300137472,"free":125874831360,"used":148425306112,"pfree":45.889453,"pused":54.110547},"inodes":{"total":0,"free":0,"used":0,"pfree":100,"pused":0},"options":"rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=strict,username=Mercurio,uid=54321,forceuid,gid=54321,forcegid,addr=172.16.18.10,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1"},{"fsname":"/mnt/tcposser/c/TINEXT","fstype":"cifs","bytes":{"total":274300137472,"free":125874831360,"used":148425306112,"pfree":45.889453,"pused":54.110547},"inodes":{"total":0,"free":0,"used":0,"pfree":100,"pused":0},"options":"rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=strict,username=Mercurio,uid=54321,forceuid,gid=54321,forcegid,addr=172.16.18.10,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1"},{"fsname":"/BACKUP","fstype":"ext4","bytes":{"total":844437020672,"free":153855721472,"used":647614902272,"pfree":19.196676,"pused":80.803324},"inodes":{"total":52428800,"free":52428688,"used":112,"pfree":99.999786,"pused":0.000214},"options":"rw,seclabel,relatime"}]
      agent2:
      Code:
      {"fsname":"/run/user/1000/gvfs","fstype":"fuse.gvfsd-fuse","bytes":{"total":0,"free":0,"used":0,"pfree":0,"pused":0},"inodes":{"total":0,"free":0,"used":0,"pfree":100,"pused":0},"options":"rw,nosuid,nodev,relatime,user_id=1000,group_id=1000"}]

      • cyber
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Dec 2006
        • 4807

        #4
        Are both agents running as user "zabbix"? If you switch to that user on that host, can you see those filesystems from the command line? It looks like some kind of access permission issue...
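
One way to make that check repeatable is a small script run as the user in question. The sketch below is only a hedged illustration: it does not reproduce how either agent enumerates filesystems, the helper names `mount_points` and `can_stat` are invented here, and the sample line with /dev/sdb1 is made up (the real device name is not given in the thread).

```python
import os


def mount_points(mounts_text: str) -> list:
    """Return the mount-point column from /proc/mounts-style text.

    Note: escaped characters in mount points (such as \\040 for a space)
    are left as-is in this sketch.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.append(fields[1])
    return points


def can_stat(path: str) -> bool:
    """True if the current user can call statvfs() on the path."""
    try:
        os.statvfs(path)
        return True
    except OSError:
        return False


# Made-up sample in /proc/mounts format; on a real host, read /proc/mounts instead.
sample = (
    "sysfs /sys sysfs rw,nosuid 0 0\n"
    "/dev/sdb1 /BACKUP ext4 rw,seclabel,relatime 0 0\n"
)
print(mount_points(sample))
print(can_stat("/"))
```

Running it under the agent's account (for example with `sudo -u zabbix`) and under your own account, then comparing the results, would show whether the mount point is visible and readable for that user.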


        • Moebius
          Member
          • Dec 2022
          • 43

          #5
          Yes, both agents are running as user "zabbix":

          Code:
          [xyz@tcposdb sbin]$ sudo systemctl show zabbix-agent | grep User
          User=zabbix
          DynamicUser=no
          PrivateUsers=no
          [xyz@tcposdb sbin]$ sudo systemctl show zabbix-agent2 | grep User
          User=zabbix
          DynamicUser=no
          PrivateUsers=no


          • Markku
            Senior Member
            Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
            • Sep 2018
            • 1782

            #6
            Originally posted by Moebius
            Thank you Markku. I ran both commands (had to sudo zabbix_agent2) and the output is almost identical with only the last part different.
            I don't understand what you are saying.

            In your agentd output there is:

            Code:
            {
                "fsname": "/BACKUP",
                "fstype": "ext4",
                "bytes": {
                    "total": 844437020672,
                    "free": 153855721472,
                    "used": 647614902272,
                    "pfree": 19.196676,
                    "pused": 80.803324
                },
                "inodes": {
                    "total": 52428800,
                    "free": 52428688,
                    "used": 112,
                    "pfree": 99.999786,
                    "pused": 0.000214
                },
                "options": "rw,seclabel,relatime"
            }
            but is there a section for "fsname": "/BACKUP" in your agent2 output?

            Markku
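
Rather than reading two long JSON lines by eye, the question "which filesystems does one agent report that the other does not" can be answered with a set difference on the fsname values. A minimal sketch, assuming both outputs have already been reduced to plain JSON arrays; the function names and the heavily abbreviated sample strings are illustrative only.

```python
import json


def fsnames(master_json: str) -> set:
    """Collect the fsname of every entry in a vfs.fs.get-style JSON array."""
    return {entry["fsname"] for entry in json.loads(master_json)}


def missing_from_second(first_json: str, second_json: str) -> list:
    """Filesystems present in the first output but absent from the second."""
    return sorted(fsnames(first_json) - fsnames(second_json))


# Abbreviated stand-ins for the two outputs (only the fsname keys are kept).
agentd_out = '[{"fsname":"/"},{"fsname":"/BACKUP"}]'
agent2_out = '[{"fsname":"/"},{"fsname":"/run/user/1000/gvfs"}]'

print(missing_from_second(agentd_out, agent2_out))
```

Swapping the arguments lists the entries that only the second output contains.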


            • Moebius
              Member
              • Dec 2022
              • 43

              #7
              Also, it looks like every now and then values for that filesystem are retrieved through agent2 as well, at apparently random intervals:

              [Image: immagine.png]

              SELinux is installed on that VM. I don't know whether this is relevant.


              • Moebius
                Member
                • Dec 2022
                • 43

                #8
                Originally posted by Markku

                I don't understand what you are saying.

                In your agentd output there is:

                Code:
                {
                    "fsname": "/BACKUP",
                    "fstype": "ext4",
                    "bytes": {
                        "total": 844437020672,
                        "free": 153855721472,
                        "used": 647614902272,
                        "pfree": 19.196676,
                        "pused": 80.803324
                    },
                    "inodes": {
                        "total": 52428800,
                        "free": 52428688,
                        "used": 112,
                        "pfree": 99.999786,
                        "pused": 0.000214
                    },
                    "options": "rw,seclabel,relatime"
                }
                but is there a section for "fsname": "/BACKUP" in your agent2 output?

                Markku
                That is the problem. There is no section for that filesystem in my agent2 output.
                It looks like agent2 sees that FS at random intervals only, and doesn't see it anymore in subsequent polls.


                • Markku
                  Senior Member
                  Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
                  • Sep 2018
                  • 1782

                  #9
                  You could change the history setting of the "Linux: Get filesystems" item so that a longer history of the raw data is kept. Then you can check how the data differs between polls, whether there are other differences as well, and so on.

                  Markku


                  • Moebius
                    Member
                    • Dec 2022
                    • 43

                    #10
                    Here are the first few history values for that filesystem after setting a longer item history. Nothing seems to be wrong with them, except for the random intervals at which agent2 can see the filesystem.
                    Again, agentd has no problems at all with this.

                    [Image: immagine.png]


                    • Markku
                      Senior Member
                      Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
                      • Sep 2018
                      • 1782

                      #11
                      Note that you should be interested in the "root" master item (vfs.fs.get) because that is the item that finds all the filesystems every minute (I think, recheck the item interval for it). In your message you are showing the parsed item for /BACKUP, not the original master item.

                      Your hypothesis is that sometimes the master item (from agent2) does not contain data for /BACKUP at all, even though it has all other data. Now you need to verify that hypothesis by collecting enough samples from the master item, showing that sometimes /BACKUP is there and sometimes is not.
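
Verifying the hypothesis boils down to counting, over the collected master-item samples, how many contain an entry for /BACKUP. A minimal sketch, assuming each sample has been saved as one JSON array string; the function name `count_presence` and the three tiny samples are made up for illustration.

```python
import json


def count_presence(samples, fsname):
    """Return (polls containing fsname, total polls) for a list of JSON array strings."""
    hits = 0
    for raw in samples:
        if any(entry.get("fsname") == fsname for entry in json.loads(raw)):
            hits += 1
    return hits, len(samples)


# Made-up, abbreviated samples: /BACKUP appears in only one of three polls.
samples = [
    '[{"fsname":"/"},{"fsname":"/BACKUP"}]',
    '[{"fsname":"/"}]',
    '[{"fsname":"/"}]',
]

hits, total = count_presence(samples, "/BACKUP")
print(f"/BACKUP present in {hits} of {total} polls")
```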

                      If you then conclude this is a bug in agent 2, you can open an issue in https://support.zabbix.com/ with your evidence.

                      Markku
                      Last edited by Markku; 01-08-2023, 11:16.


                      • Moebius
                        Member
                        • Dec 2022
                        • 43

                        #12
                        Markku, thank you for your patience.

                        I collected the data you described in a file. That filesystem ("/BACKUP") is found in only 18 of 1400 polls.

                        Can I conclude that there is a bug in agent 2 then?


                        • Markku
                          Senior Member
                          Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
                          • Sep 2018
                          • 1782

                          #13
                          Sounds strange but if you don't have other ideas why one piece of software gets all filesystems all the time and the other doesn't, maybe it is a bug then.

                          In your 1400 polls, was all other data always there and identical, just that one section for /BACKUP was missing in 1382 cases?

                          Markku
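
To answer whether only /BACKUP comes and goes, the same samples can be tallied per filesystem, reporting every fsname that is absent from at least one poll. A minimal sketch under the same assumption (one JSON array string per poll); `intermittent_filesystems` and the sample data are illustrative names and values, not taken from the real polls.

```python
import json
from collections import Counter


def intermittent_filesystems(samples):
    """Map fsname -> number of polls it appeared in, for names missing from some poll."""
    seen = Counter()
    for raw in samples:
        # Use a set so a filesystem is counted at most once per poll.
        seen.update({entry["fsname"] for entry in json.loads(raw)})
    total = len(samples)
    return {name: count for name, count in seen.items() if count < total}


# Made-up, abbreviated samples: "/" is always present, the others are not.
samples = [
    '[{"fsname":"/"},{"fsname":"/BACKUP"},{"fsname":"/mnt/share"}]',
    '[{"fsname":"/"},{"fsname":"/mnt/share"}]',
    '[{"fsname":"/"}]',
]

for name, count in sorted(intermittent_filesystems(samples).items()):
    print(f"{name}: {count} of {len(samples)} polls")
```

Filesystems that appear in every poll are left out, so the result lists only the entries worth investigating.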


                          • Moebius
                            Member
                            • Dec 2022
                            • 43

                            #14
                            Almost. Two more sections appear randomly, but way more frequently, and refer to two remote cifs file systems mounted here. This is an extract:

                            [Image: immagine.png]
