Ad Widget

Collapse

Zabbix Server: a lot of query failed in zabbix_server.log

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • sscheib
    Junior Member
    • Dec 2019
    • 11

    #1

    Zabbix Server: a lot of query failed in zabbix_server.log

    Hello community,

    within the logfile of the Zabbix server, I can see a lot of
    Code:
    query failed: [1062] Duplicate entry 'XXX' for key 'XXX'
    messages.
    For today there were already 13696 failed queries:
    Code:
    root@zabbix:/home/steffen# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log | wc -l
    13696
    I tried to filter them, to show only unique items:
    Code:
    root@zabbix:/var/log/zabbix# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log | awk '{print $12}' | sed "s/'//g" | sort | uniq | wc -l  
    61
    root@zabbix:/home/steffen#
    This issue seems to only have started recently (according to the unique entries and assuming I am filtering them correctly):
    Code:
    root@zabbix:/var/log/zabbix# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log | awk '{print $12}' | sed "s/'//g" | sort | uniq | wc -l  
    61
    root@zabbix:/var/log/zabbix# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log.1 | awk '{print $12}' | sed "s/'//g" | sort | uniq | wc -l
    89
    root@zabbix:/var/log/zabbix# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log.2 | awk '{print $12}' | sed "s/'//g" | sort | uniq | wc -l 
    71
    root@zabbix:/var/log/zabbix# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log.3 | awk '{print $12}' | sed "s/'//g" | sort | uniq | wc -l 
    1
    root@zabbix:/var/log/zabbix# grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log.4 | awk '{print $12}' | sed "s/'//g" | sort | uniq | wc -l 
    1
    root@zabbix:/var/log/zabbix#
    How can I solve these duplicate entries? Is there some "tool/script" to see what actually is the issue, or do I need to look up every query manually and see where the error comes from?

    Thanks!
    Last edited by sscheib; 11-01-2020, 16:19.
  • sscheib
    Junior Member
    • Dec 2019
    • 11

    #2
    Okay, I quickly wrote a little script to give the IDs some names - this might lead to the issue here (I don't know if it helps). I took the database schematics from here: https://zabbix.org/wiki/Database_Schemas
    Following the script:

    Code:
    root@zabbix:/home/steffen# cat /root/scripts/analyze_zabbix_log.sh      
    #!/bin/bash
    while read -r line; do
      IFS="-" read -ra ids <<< "${line}"
      count="${ids[0]}"
      applicationId="${ids[1]}"
      itemId="${ids[2]}"
    
      output="$(mysql zabbix -se "select applications.name as appname,
                               items.name as itemname, 
                               hosts.name as hostname 
                        from items 
                        inner join items_applications on 
                            items.itemid = items_applications.itemid 
                        inner join hosts on 
                            items.hostid = hosts.hostid 
                        inner join applications on 
                            items_applications.applicationid = applications.applicationid 
                        where items.itemid = ${itemId} and applications.applicationid = ${applicationId};" | awk -F '\t' 'BEGIN { OFS="?"}; {$1=$1; print $0}')"
      IFS="?" read -ra mysqlOutput <<< "${output}"
      applicationName="${mysqlOutput[0]}"
      itemName="${mysqlOutput[1]}"
      hostName="${mysqlOutput[2]}"
      echo "---------------------------------------------------"
      echo "Hostname        : ${hostName}"
      echo "Application name: ${applicationName} (ID: ${applicationId})"
      echo "Item name       : ${itemName} (ID: ${itemId})"
      echo "# of occurences : ${count}"
    
    done < <(grep 'query failed.*Duplicate entry' /var/log/zabbix/zabbix_server.log | awk '{print $12}' | sed "s/'//g" | sort | uniq -c | 
             grep -vE '^[[:space:]]+[1][[:space:]]' | sed -E 's/^[[:space:]]+//' | sed -E 's/[[:space:]]/-/')
    root@zabbix:/home/steffen#
    Quick explanation of what this script essentially does:
    - Search for query failed.*Duplicate entry in /var/log/zabbix/zabbix_server.log
    - Print only the IDs of the matching line (Jan 12 15:46:27 zabbix.lan zabbix_server[959]: [Z3005] query failed: [1062] Duplicate entry '1549-31997' for key 'items_applications_1' [insert into items_applications (itemappid,applicationid,itemid) values (56852,1549,31997),(56853,1548,31996),(56854,1548, 32002),(56855,1550,32004),(56856,1550,31998),(5685 7,1548,32005),(56858,1550,32007),(56859,1549,32006 ),(56860,1549,32000),(56861,1548,31999),(56862,154 9,32003),(56863,1550,32001);#012]
    - Remove '' from the line (sed "s/'//g")
    - Sort the entries (sort)
    - print only uniques with the number of occurrences prefixed (uniq -c)
    - Remove entries with only 1 occurrence (grep -vE '^[[:space:]]+[1][[:space:]]')
    - Remove prefixed spaces (sed -E 's/^[[:space:]]+//')
    - And finally replace the space between numOfOccurrences applicationId-ItemId with a hyphen (to have an IFS for bash)
    - Next we iterate over all these "cleaned" entries, extract the number of occurrences, the application ID and the item ID from the line, with which we build the sql statement
    - From the sql output with replace tab with ? as field seperator and assign the output again to variables

    So nothing too fancy - code is rubbish, I know, but for this purpose its enough.
    When running this script I get the following output:
    Code:
    root@zabbix:/home/steffen# bash /root/scripts/analyze_zabbix_log.sh 
    ---------------------------------------------------
    Hostname        : emby.lan
    Application name: Zabbix raw items (ID: 1333)
    Item name       : sdc: Disk write time (rate) (ID: 34087)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : emby.lan
    Application name: SMART monitoring (smartctl) (ID: 1367)
    Item name       : /dev/sdb: Device model (ID: 30641)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : openwrt.home
    Application name: Interface eth0.1 (ID: 1497)
    Item name       : Interface eth0.1: Bits received (ID: 31621)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : openwrt.home
    Application name: Disk sda (ID: 1510)
    Item name       : sda: Disk utilization (ID: 31711)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : openwrt.home
    Application name: Filesystem /tmp/log (ID: 1512)
    Item name       : /tmp/log: Free inodes in % (ID: 31752)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : openwrt.lan
    Application name: Interface eth1 (ID: 1533)
    Item name       : Interface eth1: Bits sent (ID: 31876)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : openwrt.lan
    Application name: Disk sda (ID: 1547)
    Item name       : sda: Disk write request avg waiting time (w_await) (ID: 31969)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : openwrt.lan
    Application name: Filesystem /tmp/log (ID: 1549)
    Item name       : /tmp/log: Free inodes in % (ID: 31997)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : openwrt.dmz
    Application name: Zabbix raw items (ID: 1561)
    Item name       : sda: Disk write time (rate) (ID: 32258)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : openwrt.dmz
    Application name: Interface eth1 (ID: 1569)
    Item name       : Interface eth1: Operational status (ID: 32142)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : openwrt.dmz
    Application name: Filesystem /tmp/log (ID: 1586)
    Item name       : /tmp/log: Space utilization (ID: 32263)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : rsyslog.lan
    Application name: Interface eth0 (ID: 1651)
    Item name       : Interface eth0: Inbound packets discarded (ID: 32891)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : rsyslog.lan
    Application name: Disk mmcblk0p1 (ID: 1654)
    Item name       : mmcblk0p1: Disk average queue size (avgqu-sz) (ID: 32908)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : rsyslog.lan
    Application name: Filesystem / (ID: 1657)
    Item name       : /: Space utilization (ID: 32940)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : access-point.lan
    Application name: Zabbix raw items (ID: 1778)
    Item name       : mtdblock15: Disk read time (rate) (ID: 33928)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : access-point.lan
    Application name: Interface wlan0 (ID: 1780)
    Item name       : Interface wlan0: Outbound packets discarded (ID: 33823)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : emby.lan
    Application name: Interface eth0 (ID: 1831)
    Item name       : Interface eth0: Bits received (ID: 34109)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : emby.lan
    Application name: Filesystem /media (ID: 1839)
    Item name       : /media: Used space (ID: 34152)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : zabbix.lan
    Application name: Interface wlan0 (ID: 1858)
    Item name       : Interface wlan0: Interface type (ID: 34309)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : zabbix.lan
    Application name: Disk mmcblk0p2 (ID: 1861)
    Item name       : mmcblk0p2: Disk write request avg waiting time (w_await) (ID: 34327)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : zabbix.lan
    Application name: Filesystem / (ID: 1862)
    Item name       : /: Total space (ID: 34338)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : pihole.lan
    Application name: Interface wlan0 (ID: 1864)
    Item name       : Interface wlan0: Interface type (ID: 34356)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : pihole.lan
    Application name: Disk mmcblk0p1 (ID: 1867)
    Item name       : mmcblk0p1: Disk write request avg waiting time (w_await) (ID: 34374)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : pihole.lan
    Application name: Filesystem / (ID: 1869)
    Item name       : /: Total space (ID: 34384)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : smarthome-monitoring.lan
    Application name: Interface eth0 (ID: 1899)
    Item name       : Interface eth0: Interface type (ID: 34671)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : smarthome-monitoring.lan
    Application name: Disk mmcblk0p1 (ID: 1901)
    Item name       : mmcblk0p1: Disk read request avg waiting time (r_await) (ID: 34676)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : smarthome-monitoring.lan
    Application name: Filesystem / (ID: 1903)
    Item name       : /: Used space (ID: 34699)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : access-point.lan
    Application name: Wifi (ID: 1912)
    Item name       : Mode of Toadette (6)(30:23:03:6E:1A:54) (ID: 35176)
    # of occurences : 3
    ---------------------------------------------------
    Hostname        : emby.lan
    Application name: systemd services (ID: 1976)
    Item name       : Service state $1 (ID: 36062)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : smarthome-monitoring.lan
    Application name: systemd services (ID: 1977)
    Item name       : Service state $1 (ID: 36137)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : pihole.lan
    Application name: systemd services (ID: 1978)
    Item name       : Used RSS memory $1 (ID: 36185)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : rsyslog.lan
    Application name: systemd services (ID: 1979)
    Item name       : CPU utilization $1 (ID: 36270)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : zabbix.lan
    Application name: systemd services (ID: 1983)
    Item name       : Used RSS memory $1 (ID: 36371)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : office-printer.lan
    Application name: Printer Cover (ID: 2001)
    Item name       : Cover $1 (ID: 36532)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : office-printer.lan
    Application name: Printer Input (ID: 2003)
    Item name       : Input $1 (ID: 36547)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : office-printer.lan
    Application name: Printer Media Path (ID: 2004)
    Item name       : Media Path Type of supplies $1 (ID: 36571)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : office-printer.lan
    Application name: Printer Output (ID: 2005)
    Item name       : Output Status $1 (ID: 36539)
    # of occurences : 2
    ---------------------------------------------------
    Hostname        : office-printer.lan
    Application name: Printer Supplies (ID: 2006)
    Item name       : Level of supplies $1 (ID: 36604)
    # of occurences : 2
    root@zabbix:/home/steffen#
    I took a closer look at all of them and it turns out every single item, is an item, which was discovered via low level discovery. Not a single item of the failed ones was created manually.
    As there are a variety of LLDs involved in these item creations (selfwritten, community written, even from Zabbix itself (see the disk related stuff)), I guess I can rule out some issue with the LLDs themselves - I wouldn't be the only person to spot this.
    My best guess for now would be (and keep in mind, I have actually not really an idea of Zabbix, I am new to it), that the Zabbix server on ARM has an issue here.
    I found a very similar bug, which was reported 2015: https://support.zabbix.com/browse/ZBX-10194
    Apparently this issue is not resolved (or re-appeared again) , as I am running Zabbix on a Raspbian 10.2 (Buster) with the latest updates installed as well as with the latest Zabbix server and agents available for Raspbian.

    However, before I am going to add a comment on the bug with a reference to this thread, I wanted to make sure, it is actually the same issue I am facing.
    Has somebody running Zabbix on a Raspbian and can run the above script and share the output? Unfortunately all of my Raspberries are in use at the moment and I don't think, that two reports from the same person would make much of a difference.

    Thank you very much!

    Comment

    • dimir
      Zabbix developer
      • Apr 2011
      • 1080

      #3
      In recent versions of Zabbix server the way history syncer is dumping data to history tables was improved and these kind of errors should not be possible. Which version is your Zabbix server? I assume it's 3.x?

      Comment

      • sscheib
        Junior Member
        • Dec 2019
        • 11

        #4
        Originally posted by dimir
        In recent versions of Zabbix server the way history syncer is dumping data to history tables was improved and these kind of errors should not be possible. Which version is your Zabbix server? I assume it's 3.x?
        Sorry for the late reply.
        In the meanwhile I can confirm with 100% certainty, that it is actually an issue with Zabbix and ARM. Tested it along with a friend on 6 different raspberries (with fresh installations). All of them show the very same behaviour - however, this only affects discovered items (from our observations at least). Furthermore due to this issue, low level discovery will only work once (initially so to say) - even if LLD discovers new items, it will lead to one of the errors above, although while debugging you can see, that new items are in fact discovered and send to the Zabbix server, but which will fail to add/create the item. Unlinking and clearing the template which discovers new items and re-add it again, will also only work once.
        The amount of errors are actually (I think) produced due the LLD beeing called. From my observations the error is occuring in any case; meaning it will happen if new items are discovered and also if no new items are discovered.

        Please, feel free to try it yourself.
        1: Install Raspbian
        2: Install Zabbix Server
        3: Add a host and assign any template with LDD (we tested it with both Zabbix agent and Zabbix agent active)
        4: Let LLD run initially
        5: Trigger LLD manually again or wait for LLD to be triggered
        6: Monitor the logfile yourself or use the provided script above

        About the Zabbix version: All was tested with clients beeing on 4.0 to 4.4 and the server beeing on 4.4.

        Thanks!

        Comment

        • Markku
          Senior Member
          Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
          • Sep 2018
          • 1781

          #5
          See https://support.zabbix.com/browse/ZBX-16997: if that is the same problem, it will be fixed in the next versions (next week was the estimation in another ticket).

          Markku

          Comment

          • sscheib
            Junior Member
            • Dec 2019
            • 11

            #6
            Originally posted by Markku
            See https://support.zabbix.com/browse/ZBX-16997: if that is the same problem, it will be fixed in the next versions (next week was the estimation in another ticket).

            Markku
            Thanks for posting that; It actually seems like the same issue, as raspberrys (up and until 3B+ as far as I recall) are running on 32bit, so ...

            Comment

            • sscheib
              Junior Member
              • Dec 2019
              • 11

              #7
              I can confirm, that with the release of Zabbix 4.4.5 the issue described here is fixed - LLD is working again as expected.
              Thanks a lot to all involved guys!

              Comment

              Working...