
Zabbix crashing

  • webadmin
    Junior Member
    • Jan 2017
    • 10

    #1

    Zabbix crashing

    Hello

    Zabbix server is always crashing. The zabbix_server and zabbix_agent processes are always running, and the dashboard does not display any error at the service level, but the "last 20 issues" list shows a long delay (sometimes hours) in problem triggering and notification.
    Can you please help?
    Kindly note that Maria DB logs do not show any errors.

    StartPollers=1000
    StartPollersUnreachable=333
    StartTrappers=1000
    CacheSize=8000M
    CacheUpdateFrequency=60
    HistoryCacheSize=2G
    HistoryIndexCacheSize=2G
    TrendCacheSize=2G
    ValueCacheSize=12G

    Number of hosts: 525
    Number of enabled items: 110250
    Number of enabled triggers: 48222

    Kindly note that every time this happens we have to terminate the zabbix_server process and start it again. The problem occurs every 30 minutes, and sometimes after 8 hours.
    Last edited by webadmin; 31-01-2017, 14:30.
  • batchenr
    Senior Member
    • Sep 2016
    • 440

    #2
    1. What resources does the Zabbix server have (memory, CPU, etc.)?
    2. Can you add the zabbix_server logs? If you don't see much there, please
    set DebugLevel=4 in /etc/zabbix/zabbix_server.conf and restart.

    By the way, I have a server with about half as many hosts (250) and I don't use that many pollers:
    StartPollers=30
    StartPollersUnreachable=10
    StartTrappers=30
    CacheSize=1G
    HistoryCacheSize=512M
    HistoryIndexCacheSize=512M

    All the rest are default values.
    I think you have overloaded Zabbix; try changing the settings.
    Last edited by batchenr; 30-01-2017, 16:09.

    • webadmin
      Junior Member
      • Jan 2017
      • 10

      #3
      System Information
      Manufacturer: HP
      Product Name: ProLiant DL380p Gen8
      Version: Not Specified

      Memory: 16 GB
      CPU: 6 cores

      We cannot keep the log at debug level, to avoid disk space issues, but we were able to collect the attached truncated logs.
      Last edited by webadmin; 31-01-2017, 14:31.

      • kloczek
        Senior Member
        • Jun 2006
        • 1771

        #4
        Just check zabbix server logs.
        http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
        https://kloczek.wordpress.com/
        zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
        My zabbix templates https://github.com/kloczek/zabbix-templates

        • batchenr
          Senior Member
          • Sep 2016
          • 440

          #5
          Did you try to reduce the Zabbix resources?
          StartPollers=1000
          StartPollersUnreachable=333
          StartTrappers=1000
          CacheSize=8000M
          CacheUpdateFrequency=60
          HistoryCacheSize=2G
          HistoryIndexCacheSize=2G
          TrendCacheSize=2G
          ValueCacheSize=12G

          And second,
          run this against your Zabbix database in MySQL:

          select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37631;
          Post the output here.

          And please look over all your items and disable all unsupported items.
          Most importantly, reduce those pollers, restart Zabbix, and check whether it helps.
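
As a rough sanity check on the settings under discussion, the sketch below (Python, not part of the original thread) adds up the cache sizes from the first post and compares the total with the 16 GB of RAM reported in post #3. The helper name `parse_size` is made up for illustration. The comparison ignores the memory used by the 2333 configured poller and trapper processes and by MariaDB, and how much of each cache is actually touched depends on load, so treat the result as an indication only.

```python
# Cache settings copied from the first post (zabbix_server.conf).
CONFIGURED = {
    "CacheSize": "8000M",
    "HistoryCacheSize": "2G",
    "HistoryIndexCacheSize": "2G",
    "TrendCacheSize": "2G",
    "ValueCacheSize": "12G",
}

# Binary multipliers for the K/M/G suffixes used in the config values.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value):
    """Convert a size such as '8000M' or '2G' to bytes (hypothetical helper)."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)  # no suffix: plain number of bytes


def total_cache_bytes(settings):
    """Sum all configured cache sizes in bytes."""
    return sum(parse_size(v) for v in settings.values())


RAM_BYTES = 16 * 1024**3  # "Memory: 16 GB" from post #3

total = total_cache_bytes(CONFIGURED)
print(f"configured caches: {total / 1024**3:.1f} GiB")
print(f"physical memory:   {RAM_BYTES / 1024**3:.1f} GiB")
print("caches exceed RAM" if total > RAM_BYTES else "caches fit in RAM")
```

With the values from the first post the configured caches come to about 25.8 GiB, which is more than the machine has, so reducing them as suggested above looks like a reasonable first step.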

          • webadmin
            Junior Member
            • Jan 2017
            • 10

            #6
            We have reduced the Zabbix resources, but the problem persists.
            We have also disabled all unsupported items and then restarted.
            Below is the output of the requested query:

            MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37631;
            +--------+---------+-------+----------+---------+-------+----------+
            | hostid | key_    | state | evaltype | formula | error | lifetime |
            +--------+---------+-------+----------+---------+-------+----------+
            |  10380 | ifDescr |     0 |        0 |         |       | 7        |
            +--------+---------+-------+----------+---------+-------+----------+
            1 row in set (0.03 sec)

            • batchenr
              Senior Member
              • Sep 2016
              • 440

              #7
              I wonder why the formula and lifetime columns are empty.
              They are not supposed to be empty.
              Can you run it like this:

              select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid like '%376%';
              or dump the whole table:
              select hostid,key_,state,evaltype,formula,error,lifetime from items;
              and see whether all rows look the same, or only the rows for the items the log is complaining about.

              After the restart and the changes, can you post fresh logs here
              so we can see whether anything changed?
              And if you can, go to a host that is causing problems and post its Zabbix agent log;
              maybe it will give us more information about the Zabbix server problem.

              • webadmin
                Junior Member
                • Jan 2017
                • 10

                #8
                MySQL query output (truncated):

                ..
                | hostid | key_                                            | state | evaltype | formula | error | lifetime |
                +--------+-------------------------------------------------+-------+----------+---------+-------+----------+
                |  10106 | icmppingsec                                     |     0 |        0 | 1       |       | 30       |
                |  10133 | ifOperStatus[{#SNMPVALUE}]                      |     0 |        0 | 1       |       | 30       |
                |  10155 | sysUpTime                                       |     0 |        0 | 0.01    |       | 30       |
                |  10177 | ciscoEnvMonTemperatureStatusDescr[{#SNMPVALUE}] |     0 |        0 | 1       |       | 30       |
                |  10198 | ifOperStatus[{#SNMPVALUE}]                      |     0 |        0 | 1       |       | 30       |
                |  10220 | sysUpTime                                       |     0 |        0 | 0.01    |

                ...


                zabbix_agent.log does not show any errors, as the excerpt below shows:

                4798:20170127:083923.571 Starting Zabbix Agent [Zabbix server]. Zabbix 3.2.1 (revision 62890).
                4798:20170127:083923.572 **** Enabled features ****
                4798:20170127:083923.572 IPv6 support: YES
                4798:20170127:083923.572 TLS support: YES
                4798:20170127:083923.572 **************************
                4798:20170127:083923.572 using configuration file: /usr/local/etc/zabbix_agentd.conf
                4798:20170127:083923.572 agent #0 started [main process]
                4799:20170127:083923.573 agent #1 started [collector]
                4800:20170127:083923.573 agent #2 started [listener #1]
                4801:20170127:083923.573 agent #3 started [listener #2]
                4802:20170127:083923.573 agent #4 started [listener #3]
                4803:20170127:083923.574 agent #5 started [active checks #

                • batchenr
                  Senior Member
                  • Sep 2016
                  • 440

                  #9
                  Did you make any changes or upgrade Zabbix lately?
                  I can see that these items now have a formula and a lifetime.

                  Check these queries from your log:
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=42457;
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=44958;
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=30428;
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37427;
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37426;
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37428;
                  select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=43060;
                  If formula and lifetime are empty for all of them, then I think that is the issue.
                  Go to these items and check their configuration; maybe try selecting them and doing a mass update.
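
The seven lookups above can also be issued as a single statement, which makes the rows easier to compare side by side. A minimal sketch in Python that builds such a query from the item IDs listed in this post; the function name `build_items_query` is hypothetical, `itemid` is added to the selected columns so each row can be identified, and the script only prints the SQL so it can be pasted into the mysql client.

```python
# Item IDs taken from the queries listed in the post above.
ITEM_IDS = [42457, 44958, 30428, 37427, 37426, 37428, 43060]

# Same columns as in the thread, plus itemid to tell the rows apart.
COLUMNS = ["itemid", "hostid", "key_", "state", "evaltype",
           "formula", "error", "lifetime"]


def build_items_query(item_ids):
    """Return one SELECT covering all given item IDs (hypothetical helper)."""
    # int() rejects anything that is not a plain integer ID.
    ids = ", ".join(str(int(i)) for i in item_ids)
    cols = ",".join(COLUMNS)
    return f"select {cols} from items where itemid in ({ids}) order by itemid;"


print(build_items_query(ITEM_IDS))
```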

                  • webadmin
                    Junior Member
                    • Jan 2017
                    • 10

                    #10
                    No, we didn't perform any upgrade.
                    We just restart the zabbix_server process whenever the issue occurs.
                    The problem persists.

                    • batchenr
                      Senior Member
                      • Sep 2016
                      • 440

                      #11
                      I asked for a few things in my previous comment; please read it and report back.

                      • webadmin
                        Junior Member
                        • Jan 2017
                        • 10

                        #12
                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=42457;
                        +--------+--------------------------------+-------+----------+---------+-------+----------+
                        | hostid | key_                           | state | evaltype | formula | error | lifetime |
                        +--------+--------------------------------+-------+----------+---------+-------+----------+
                        |  10483 | read.eigrp.wanip.pl[{HOST.IP}] |     0 |        0 |         |       | 30       |
                        +--------+--------------------------------+-------+----------+---------+-------+----------+
                        1 row in set (0.00 sec)

                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=44958;
                        +--------+---------+-------+----------+---------+-------+----------+
                        | hostid | key_    | state | evaltype | formula | error | lifetime |
                        +--------+---------+-------+----------+---------+-------+----------+
                        |  10537 | ifDescr |     0 |        0 |         |       | 7        |
                        +--------+---------+-------+----------+---------+-------+----------+
                        1 row in set (0.04 sec)

                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=30428;
                        +--------+---------+-------+----------+---------+-------+----------+
                        | hostid | key_    | state | evaltype | formula | error | lifetime |
                        +--------+---------+-------+----------+---------+-------+----------+
                        |  10221 | ifDescr |     0 |        0 |         |       | 7        |
                        +--------+---------+-------+----------+---------+-------+----------+
                        1 row in set (0.05 sec)

                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37427;
                        +--------+------------------------------+-------+----------+---------+-------+----------+
                        | hostid | key_                         | state | evaltype | formula | error | lifetime |
                        +--------+------------------------------+-------+----------+---------+-------+----------+
                        |  10376 | ciscoEnvMonSupplyStatusDescr |     0 |        0 |         |       | 7        |
                        +--------+------------------------------+-------+----------+---------+-------+----------+
                        1 row in set (0.06 sec)

                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37426;
                        +--------+---------------------------+-------+----------+---------+-------+----------+
                        | hostid | key_                      | state | evaltype | formula | error | lifetime |
                        +--------+---------------------------+-------+----------+---------+-------+----------+
                        |  10376 | ciscoEnvMonFanStatusDescr |     0 |        0 |         |       | 7        |
                        +--------+---------------------------+-------+----------+---------+-------+----------+
                        1 row in set (0.03 sec)

                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=37428;
                        +--------+-----------------------------------+-------+----------+---------+-------+----------+
                        | hostid | key_                              | state | evaltype | formula | error | lifetime |
                        +--------+-----------------------------------+-------+----------+---------+-------+----------+
                        |  10376 | ciscoEnvMonTemperatureStatusDescr |     0 |        0 |         |       | 31       |
                        +--------+-----------------------------------+-------+----------+---------+-------+----------+
                        1 row in set (0.00 sec)

                        MariaDB [zabbix]> select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=43060;
                        +--------+--------------------------------+-------+----------+---------+-------+----------+
                        | hostid | key_                           | state | evaltype | formula | error | lifetime |
                        +--------+--------------------------------+-------+----------+---------+-------+----------+
                        |  10496 | read.eigrp.wanip.pl[{HOST.IP}] |     0 |        0 |         |       | 30       |
                        +--------+--------------------------------+-------+----------+---------+-------+----------+
                        1 row in set (0.09 sec)



                        Most of the queries return a lifetime value, while the formula column is empty.
                        Do you suggest updating the Zabbix server? We are currently on Zabbix 3.2.1.

                        • batchenr
                          Senior Member
                          • Sep 2016
                          • 440

                          #13
                          Can you try running
                          mysqlcheck --all-databases --auto-repair -o
                          Did these errors only just start happening, or is this a new install?

                          • webadmin
                            Junior Member
                            • Jan 2017
                            • 10

                            #14
                            It is not a new installation, but the problem occurs frequently: every 30 minutes, and sometimes every 8 hours.
                            We will try to run the provided mysqlcheck command and will get back to you.
