Zabbix crashing 30 minutes after reboot

  • ssantivvera13
    Junior Member
    • Jul 2024
    • 4

    #1

    Zabbix crashing 30 minutes after reboot

    Hello! I have an issue with Zabbix. I implemented it one month ago, and a few days ago the web interface started failing with a pop-up like this.

    [Image: image.png]

    So I read some topics and made some changes on the server. The VM runs Ubuntu 24.04 with MySQL as the database, and has 2 CPUs, 8 GB RAM, and 200 GB of storage. I changed memory_limit from 128M to 64M, and I gave read permission to other users. Thanks for the help!


    This is the log of the zabbix server.

    927:20240716:144842.870 Starting Zabbix Server. Zabbix 7.0.0 (revision 49955f1fb5c).
    927:20240716:144842.911 ****** Enabled features ******
    927:20240716:144842.911 SNMP monitoring: YES
    927:20240716:144842.911 IPMI monitoring: YES
    927:20240716:144842.911 Web monitoring: YES
    927:20240716:144842.911 VMware monitoring: YES
    927:20240716:144842.911 SMTP authentication: YES
    927:20240716:144842.911 ODBC: YES
    927:20240716:144842.911 SSH support: YES
    927:20240716:144842.911 IPv6 support: YES
    927:20240716:144842.911 TLS support: YES
    927:20240716:144842.911 ******************************
    927:20240716:144842.911 using configuration file: /etc/zabbix/zabbix_server.conf
    927:20240716:144842.936 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144842.936 database is down: reconnecting in 10 seconds
    927:20240716:144852.936 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144852.937 database is down: reconnecting in 10 seconds
    927:20240716:144902.525 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144902.803 database is down: reconnecting in 10 seconds
    927:20240716:144912.804 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144912.804 database is down: reconnecting in 10 seconds
    927:20240716:144922.804 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144922.804 database is down: reconnecting in 10 seconds
    927:20240716:144932.804 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144932.897 database is down: reconnecting in 10 seconds
    927:20240716:144942.897 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144942.897 database is down: reconnecting in 10 seconds
    927:20240716:144952.897 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
    927:20240716:144952.898 database is down: reconnecting in 10 seconds
    927:20240716:145002.930 database connection re-established
    927:20240716:145003.281 current database version (mandatory/optional): 07000000/07000000
    927:20240716:145003.282 required mandatory version: 07000000
    993:20240716:145003.348 starting HA manager
    993:20240716:145003.418 HA manager started in active mode
    927:20240716:145003.586 server #0 started [main process]
    994:20240716:145003.587 server #1 started [service manager #1]
    995:20240716:145003.588 server #2 started [configuration syncer #1]
    998:20240716:145013.238 server #3 started [alert manager #1]
    999:20240716:145013.239 server #4 started [alerter #1]
    1000:20240716:145013.240 server #5 started [alerter #2]
    1001:20240716:145013.240 server #6 started [alerter #3]
    1002:20240716:145013.241 server #7 started [preprocessing manager #1]
    1003:20240716:145013.243 server #8 started [lld manager #1]
    1004:20240716:145013.522 server #9 started [lld worker #1]
    1002:20240716:145013.523 [2] thread started [preprocessing worker #2]
    1002:20240716:145013.523 [3] thread started [preprocessing worker #3]
    1005:20240716:145013.523 server #10 started [lld worker #2]
    1013:20240716:145013.523 server #15 started [discovery manager #1]
    1014:20240716:145013.525 server #16 started [history syncer #1]
    1009:20240716:145013.569 server #11 started [housekeeper #1]
    1010:20240716:145013.569 server #12 started [timer #1]
    1011:20240716:145013.569 server #13 started [http poller #1]
    1015:20240716:145013.570 server #17 started [history syncer #2]
    1016:20240716:145013.571 server #18 started [history syncer #3]
    1017:20240716:145013.571 server #19 started [history syncer #4]
    1012:20240716:145013.572 server #14 started [browser poller #1]
    1018:20240716:145013.574 server #20 started [escalator #1]
    1019:20240716:145013.576 server #21 started [proxy poller #1]
    1020:20240716:145013.578 server #22 started [self-monitoring #1]
    1022:20240716:145013.578 server #24 started [poller #1]
    1023:20240716:145013.580 server #25 started [poller #2]
    1024:20240716:145013.582 server #26 started [poller #3]
    1025:20240716:145013.584 server #27 started [poller #4]
    1026:20240716:145013.585 server #28 started [poller #5]
    1027:20240716:145013.594 server #29 started [unreachable poller #1]
    1028:20240716:145013.595 server #30 started [trapper #1]
    1029:20240716:145013.596 server #31 started [trapper #2]
    1030:20240716:145013.599 server #32 started [trapper #3]
    1031:20240716:145013.602 server #33 started [trapper #4]
    1032:20240716:145013.603 server #34 started [trapper #5]
    1033:20240716:145013.607 server #35 started [icmp pinger #1]
    1034:20240716:145013.607 server #36 started [alert syncer #1]
    1035:20240716:145013.609 server #37 started [history poller #1]
    1036:20240716:145013.612 server #38 started [history poller #2]
    1037:20240716:145013.614 server #39 started [history poller #3]
    1038:20240716:145013.618 server #40 started [history poller #4]
    1039:20240716:145013.625 server #41 started [history poller #5]
    1040:20240716:145013.632 server #42 started [availability manager #1]
    1021:20240716:145013.643 server #23 started [task manager #1]
    1041:20240716:145013.643 server #43 started [trigger housekeeper #1]
    1042:20240716:145013.644 server #44 started [odbc poller #1]
    1046:20240716:145013.646 server #48 started [configuration syncer worker #1]
    1047:20240716:145013.650 server #49 started [internal poller #1]
    1048:20240716:145013.653 server #50 started [proxy group manager #1]
    1043:20240716:145013.679 server #45 started [http agent poller #1]
    1044:20240716:145013.683 server #46 started [agent poller #1]
    1043:20240716:145013.684 thread started
    1045:20240716:145013.684 server #47 started [snmp poller #1]
    1044:20240716:145013.684 thread started
    1045:20240716:145013.686 thread started
    1002:20240716:145013.691 [1] thread started [preprocessing worker #1]
    1013:20240716:145013.692 for a discovery process with 5 workers, the user limit of 1024 file descriptors is insufficient. The maximum number of concurrent checks per worker has been reduced to 122
    1013:20240716:145014.169 thread started [discovery worker #1]
    1013:20240716:145014.169 thread started [discovery worker #2]
    1013:20240716:145014.169 thread started [discovery worker #5]
    1013:20240716:145014.170 thread started [discovery worker #3]
    1013:20240716:145014.170 thread started [discovery worker #4]
    1014:20240716:145015.385 [Z3005] query failed: [1114] The table 'history_uint' is full [insert into history_uint (itemid,clock,ns,value) values (49486,1721140306,312399910,8587563008),(49488,1721140374,328269796,48969050),(49490,1721140525,388281599,8589467648),(49491,1721140609,405188082,7817629696),(49493,1721140699,438221466,2),(49543,1721140303,309117752,0),(49544,1721140304,312207079,0),(49545,1721140305,312618234,0),(49546,1721140306,312851992,0),(49547,1721140307,313460882,0),(49548,1721141232,521095682,0),(495
    1017:20240716:145214.813 slow query: 117.908008 sec, "commit;"
    1004:20240716:145214.938 slow query: 120.495003 sec, "commit;"
    1016:20240716:145214.938 slow query: 91.004687 sec, "commit;"
    993:20240716:145214.938 slow query: 117.362653 sec, "commit;"
    1040:20240716:145214.940 slow query: 114.804353 sec, "commit;"
    1014:20240716:145214.950 slow query: 115.778432 sec, "commit;"
    1015:20240716:145214.951 slow query: 117.201797 sec, "commit;"
    1014:20240716:145220.200 [Z3005] query failed: [1114] The table 'history_uint' is full [insert into history_uint (itemid,clock,ns,value) values (42228,1721141448,686990849,0),(42232,1721141452,687067672,1721141452),(48939,1721141439,610572417,1885814788),(49059,1721141439,192234945,821
    1004:20240716:160532.691 fping failed: no output
    1004:20240716:160632.703 fping failed: no output
    1004:20240716:160732.715 fping failed: no output
    1004:20240716:160832.726 fping failed: no output
    1004:20240716:160932.738 fping failed: no output
    1004:20240716:161032.750 fping failed: no output
    1004:20240716:161132.762 fping failed: no output
    1004:20240716:161232.774 fping failed: no output
    1004:20240716:161332.786 fping failed: no output
    1004:20240716:161432.798 fping failed: no output
    1004:20240716:161532.811 fping failed: no output
    1004:20240716:161632.823 fping failed: no output
    1004:20240716:161732.835 fping failed: no output
    1004:20240716:161832.848 fping failed: no output
    1004:20240716:161932.860 fping failed: no output
    1004:20240716:162032.872 fping failed: no output
    1004:20240716:162132.883 fping failed: no output
    1004:20240716:162232.895 fping failed: no output
    1004:20240716:162332.907 fping failed: no output
    1004:20240716:162432.919 fping failed: no output
    1004:20240716:162532.930 fping failed: no output
    1004:20240716:162632.943 fping failed: no output
    1004:20240716:162732.955 fping failed: no output
    1004:20240716:162832.966 fping failed: no output
    1004:20240716:162932.975 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163032.984 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163132.992 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163232.001 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163332.010 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163432.019 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163532.028 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163632.036 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163732.043 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device
    1004:20240716:163832.052 Cannot execute "/usr/bin/fping -c1 -t50 -i0": Cannot write address into temporary file: [28] No space left on device

    zabbix_server.conf

    Code:
    LogFile=/var/log/zabbix/zabbix_server.log
    LogFileSize=0
    PidFile=/run/zabbix/zabbix_server.pid
    SocketDir=/run/zabbix
    DBName=
    DBUser=
    DBPassword=
    SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
    Timeout=4
    FpingLocation=/usr/bin/fping
    Fping6Location=/usr/bin/fping6
    LogSlowQueries=3000
    StatsAllowedIP=127.0.0.1
    EnableGlobalScripts=0


    Last edited by ssantivvera13; 18-07-2024, 19:27.
  • Markku
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
    • Sep 2018
    • 1781

    #2
    Please use

    sudo grep -v -E "^(#|$)" /etc/zabbix/zabbix_server.conf

    and edit your post to remove that unnecessarily long, unreadable config file listing

    Markku
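
For readers unfamiliar with the filter above, here is a minimal sketch of what `grep -v -E "^(#|$)"` does. It runs against a small made-up sample file, not the poster's real configuration; the temp file and its contents are assumptions for illustration only.

```shell
# Build a small sample config in a temp file (hypothetical content).
sample_conf="$(mktemp)"
cat > "$sample_conf" <<'EOF'
# This is a comment line

LogFile=/var/log/zabbix/zabbix_server.log
# Timeout=3
Timeout=4

EOF

# -v inverts the match and -E enables extended regular expressions.
# ^(#|$) matches lines that start with '#' or are completely empty,
# so only the active settings are left.
active_settings="$(grep -v -E '^(#|$)' "$sample_conf")"
printf '%s\n' "$active_settings"

rm -f "$sample_conf"
```

Only the two active settings are printed. Note that a comment indented with leading spaces would not be filtered by this pattern, since it only matches `#` in the first column.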

    • Markku
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
      • Sep 2018
      • 1781

      #3
      According to the log, your disk is full. What does "df -h" show?

      Markku
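
As a rough sketch of the disk check suggested above, the following narrows `df -h` to the locations most likely involved in the logged errors ("table is full" and "No space left on device"). The paths are assumptions based on common Ubuntu package defaults, and `fs_use_percent` is a hypothetical helper written for this example, not a standard tool.

```shell
# Show usage for the filesystems that typically hold the MySQL data,
# the Zabbix log and temporary files. Paths are assumed defaults;
# adjust them to match your system.
for path in /var/lib/mysql /var/log/zabbix /tmp; do
    if [ -e "$path" ]; then
        df -h "$path"
    fi
done

# Hypothetical helper: print the use percentage (digits only) of the
# filesystem that contains the given path. -P selects the POSIX output
# format, so the percentage is always the fifth column of line two.
fs_use_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

usage="$(fs_use_percent /)"
echo "root filesystem is ${usage}% full"
```

If one of these filesystems is at or near 100%, that would be consistent with both the `[1114] The table 'history_uint' is full` errors and the fping temporary-file failures in the log.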
