Active agents stopped reporting after upgrading to 1.6

  • bbrendon
    Senior Member
    • Sep 2005
    • 870

    #31
    Do not run 1.6.0. Run either 1.4.6 or a beta of 1.6.1.
    Unofficial Zabbix Expert
    Blog, Corporate Site

    • YvesM
      Junior Member
      • Jul 2008
      • 14

      #32
      Hi,

      Where can I find a beta for 1.6.1?
      Thanks in advance.

      • bbrendon
        Senior Member
        • Sep 2005
        • 870

        #33
        http://www.zabbix.com/developers.php

        • gjtje
          Junior Member
          • Apr 2006
          • 5

          #34
          The pre-1.6.1 beta didn't work either until I set the NodeID value in the config to 0!

          I had this set to 1 since it's not a standalone configuration, but I guess the first server should always be 0.

          • Jason
            Senior Member
            • Nov 2007
            • 430

            #35
            Hmmm, I might try that. This is a distributed setup, so I thought the node ID of the central node always had to be 1 and not 0, as per the 1.4 docs...

            After upgrading the DB, would we need to reapply the command telling the DB that its node ID is 1? I'll experiment later this week.

            Jason

            • byronsmith
              Junior Member
              • Dec 2006
              • 23

              #36
              Hi,

              I was having the same issue. I thought it might be caused by firewall issues across the WAN, so I set up another computer on the same LAN with an agent that only did active checks. Although I could telnet to the server on port 10051, I could not get any active checks working.

              However, after changing the following value from 10051 to 10052 on the server, and making the same change on the agent, everything sprang into life! Hope this helps you!

              # Listen port for trapper. Default port number is 10051. This parameter
              # must be between 1024 and 32767

              ListenPort=10052
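
The telnet test described above can be scripted. This is only a minimal sketch: `can_connect` is a hypothetical helper name, and the host you pass in is your own server. As noted in this post, a successful TCP connection only shows that the port is reachable; it does not prove that active checks are being processed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is the scripted equivalent of "telnet host port": it checks
    reachability only, not whether the server accepts agent data.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For example, calling `can_connect(server, 10051)` and `can_connect(server, 10052)` from the agent machine shows which of the two trapper ports is reachable from there.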
              Last edited by byronsmith; 05-11-2008, 16:50.

              • Jason
                Senior Member
                • Nov 2007
                • 430

                #37
                I've tried both NodeID=0 and NodeID=1. No difference. Passive agents and SNMP checks seem to report in OK, but active agents do not, and the mysql process maxes out.

                My ListenPort is still 10051 on both agent and server. It should work on this port.

                • byronsmith
                  Junior Member
                  • Dec 2006
                  • 23

                  #38
                  Hi Jason,

                  I thought the same thing: both my agent and server were on 10051 and it was not working. After changing to 10052 on both sides, the issue no longer exists for me and active checks work perfectly.

                  Both the server and agent are 1.6.1.

                  • Jason
                    Senior Member
                    • Nov 2007
                    • 430

                    #39
                    Hmm...

                    I tried that and initially it looked promising, but then the CPU time of mysql went mental and agents stopped reporting in. The processlist from the zabbix database is below, with some sample errors from the server log below that.

                    I'm going to leave it running as it is for a while to see if it just needs to settle down before the agents report in, but it looks like it's still not working.

                    +--------+-------------+-----------+---------+---------+------+----------------+------------------------------------------------------------------------------------------------+
                    | Id | User | Host | db | Command | Time | State | Info |
                    +--------+-------------+-----------+---------+---------+------+----------------+------------------------------------------------------------------------------------------------+
                    | 741617 | zabbix_user | localhost | zabbix2 | Sleep | 173 | | |
                    | 741619 | zabbix_user | localhost | zabbix2 | Sleep | 569 | | |
                    | 741621 | zabbix_user | localhost | zabbix2 | Sleep | 111 | | |
                    | 741622 | zabbix_user | localhost | zabbix2 | Sleep | 361 | | |
                    | 741623 | zabbix_user | localhost | zabbix2 | Sleep | 46 | | |
                    | 741626 | zabbix_user | localhost | zabbix2 | Sleep | 465 | | |
                    | 741640 | zabbix_user | localhost | zabbix2 | Sleep | 65 | | |
                    | 741659 | zabbix_user | localhost | zabbix2 | Sleep | 55 | | |
                    | 741684 | zabbix_user | localhost | zabbix2 | Sleep | 238 | | |
                    | 741698 | zabbix_user | localhost | zabbix2 | Sleep | 299 | | |
                    | 741710 | zabbix_user | localhost | zabbix2 | Sleep | 236 | | |
                    | 741734 | zabbix_user | localhost | zabbix2 | Sleep | 571 | | |
                    | 741749 | zabbix_user | localhost | zabbix2 | Sleep | 1550 | | |
                    | 746208 | zabbix_user | localhost | zabbix2 | Sleep | 0 | | |
                    | 746209 | zabbix_user | localhost | zabbix2 | Sleep | 2 | | |
                    | 746210 | zabbix_user | localhost | zabbix2 | Sleep | 2 | | |
                    | 746211 | zabbix_user | localhost | zabbix2 | Query | 91 | Updating | update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id' |
                    | 746212 | zabbix_user | localhost | zabbix2 | Query | 91 | Updating | update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id' |
                    | 746213 | zabbix_user | localhost | zabbix2 | Query | 92 | Updating | update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id' |
                    | 746214 | zabbix_user | localhost | zabbix2 | Query | 92 | Updating | update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id' |
                    | 746215 | zabbix_user | localhost | zabbix2 | Query | 4 | Sorting result | select value from history_log where itemid=100100000023283 order by id desc limit 1 |
                    | 746216 | zabbix_user | localhost | zabbix2 | Sleep | 1 | | |
                    | 746217 | zabbix_user | localhost | zabbix2 | Sleep | 15 | | |
                    | 746218 | zabbix_user | localhost | zabbix2 | Sleep | 14 | | |
                    | 746222 | zabbix_user | localhost | zabbix2 | Sleep | 2 | | |
                    | 746223 | zabbix_user | localhost | zabbix2 | Sleep | 5 | | |
                    | 746224 | zabbix_user | localhost | zabbix2 | Sleep | 2 | | |
                    | 746225 | zabbix_user | localhost | zabbix2 | Sleep | 2 | | |
                    | 746228 | zabbix_user | localhost | zabbix2 | Sleep | 44 | | |
                    | 746523 | root | localhost | | Query | 0 | | show processlist |
                    +--------+-------------+-----------+---------+---------+------+----------------+------------------------------------------------------------------------------------------------+
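
To see at a glance which statements are piling up in a listing like the one above, the rows can be summarised with a few lines of Python. This is a rough sketch, not part of Zabbix or MySQL: `busy_queries` is a made-up name, and it assumes the eight-column `show processlist` layout shown here and that no query text contains a `|` character.

```python
from collections import Counter


def busy_queries(processlist_text):
    """Count the Info column of every row whose Command is not 'Sleep'."""
    counts = Counter()
    for line in processlist_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 8 or cells[0] == "Id":
            continue  # skip the header row and anything malformed
        command, info = cells[4], cells[7]
        if command != "Sleep" and info:
            counts[info] += 1
    return counts
```

Run against the listing above, this would report the `update ids ...` statement four times, which matches the four connections stuck in the "Updating" state.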


                    28302:20081107:165755 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28303:20081107:165755 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28300:20081107:165756 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28301:20081107:165756 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28301:20081107:165957 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28299:20081107:170020 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28302:20081107:170021 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
                    28303:20081107:170026 Query failed: [update ids set nextid=nextid+1 where nodeid=1 and table_name='history_log' and field_name='id'] Lock wait timeout exceeded; try restarting transaction [1205]
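
The same kind of summary works for the server log. The sketch below assumes the `PID:YYYYMMDD:HHMMSS Query failed: [query] message [code]` shape of the lines above; the regular expression and the `failed_queries` name are mine, not anything shipped with Zabbix.

```python
import re
from collections import Counter

# Matches lines such as:
#   28302:20081107:165755 Query failed: [update ...] Lock wait timeout exceeded; ... [1205]
LOG_LINE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6})\s+"
    r"Query failed: \[(?P<query>.*)\] (?P<error>.*) \[(?P<code>\d+)\]\s*$"
)


def failed_queries(log_text):
    """Count failed queries, keyed by (MySQL error code, query text)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            counts[(int(match.group("code")), match.group("query"))] += 1
    return counts
```

On the excerpt above, every failure falls under a single key: error 1205 on the `update ids ...` statement for `history_log`.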
                    Last edited by Jason; 07-11-2008, 18:59.

                    • Jason
                      Senior Member
                      • Nov 2007
                      • 430

                      #40
                      I've finally managed to get something working on 1.6, using the pre-1.6.1 beta...

                      I get the impression it's something to do with my historical data: if I create a fresh install now and manually add each host, it does appear to work OK. (I've exported and imported hosts and templates from the old server.)

                      I've even moved the NodeID to 1 and it's still bringing data in, but with 6 hosts sending data it's maxing out mysql at 100% and I'm seeing the following errors in my zabbix_server log. I'd like to work out what's going on there... It's possible it could be a template with something corrupt. Is there any way I can work out what it is from the info here?

                      18192:20081114:120714 Expression [({100100000014606}=1&{100100000014607})=1|({200200000014607}=1&{100100000014606}=0)] cannot be evaluated [Unable to get function value: No function for functionid [200200000014607]]
                      18193:20081114:120714 Expression [({100100000014606}=1&{100100000014607})=1|({200200000014607}=1&{100100000014606}=0)] cannot be evaluated [Unable to get function value: No function for functionid [200200000014607]]
                      18193:20081114:120715 Expression [({100100000014606}=1&{100100000014607})=1|({200200000014607}=1&{100100000014606}=0)] cannot be evaluated [Unable to get function value: No function for functionid [200200000014607]]
                      18194:20081114:120715 Expression [({100100000014606}=1&{100100000014607})=1|({200200000014607}=1&{100100000014606}=0)] cannot be evaluated [Unable to get function value: No function for functionid [200200000014607]]
                      18193:20081114:120715 Expression [({100100000014606}=1&{100100000014607})=1|({200200000014607}=1&{100100000014606}=0)] cannot be evaluated [Unable to get function value: No function for functionid [200200000014607]]
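
One thing that stands out in these errors is that the missing function ID starts with 2002 while every other ID starts with 1001. The sketch below pulls the IDs out of an expression and groups them by leading digit. Treating the part above 10^14 as a node prefix is my assumption about how a distributed setup numbers its IDs, not something confirmed in this thread, and both function names are made up for illustration.

```python
import re

# Assumption: in a distributed setup the digits above 10**14 identify the node
# an ID was created on. This is a guess based on the IDs seen in the log.
NODE_FACTOR = 10**14


def function_ids(expression):
    """Return every {functionid} referenced in a trigger expression, in order."""
    return [int(fid) for fid in re.findall(r"\{(\d+)\}", expression)]


def ids_by_prefix(expression):
    """Group the referenced function IDs by their assumed node prefix."""
    groups = {}
    for fid in function_ids(expression):
        groups.setdefault(fid // NODE_FACTOR, []).append(fid)
    return groups
```

For the expression in the log, this puts 200200000014607 in a group of its own under prefix 2, so a trigger or template that still references an ID numbered for a different node would be one place to look.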


                      I'd like to get the historical data across as well if possible, but if it's a template issue then that might fix it.

                      • Jason
                        Senior Member
                        • Nov 2007
                        • 430

                        #41
                        More information...

                        It seems that it's the Windows machines that are causing the issue. I have 6-7 Linux machines running active agents on 1.6.1 and all is fine. When I try to add a Windows machine, the mysql process usage rockets to 100% and everything seems to stop.

                        J

                        • Jason
                          Senior Member
                          • Nov 2007
                          • 430

                          #42
                          I've tried again, and with all the Windows Event Log monitoring turned off, 1.6.1 is suddenly working perfectly. Once I enable monitoring of the System or Application log (with EventLog(System) and EventLog(Application)) and add more than a few hosts, the mysql process seems to lock up.
