**mrogers-9898** · 22-11-2018, 00:45

I have got a similar scenario. I upgraded from 3.4 to 4.0 and I'm getting a lot of nodata alerts on my active agents.

I'm really confused by what I've found so far, but it looks like the agent nodata alerts are being calculated from the time on the agent, not the server. I had 4 hosts where I could not get nodata alerts working, and I found that their clocks were far enough out of alignment that their data was treated as too old for the nodata alert. Are the system times on your agents exactly aligned with your server?
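To illustrate the effect described above, here is a minimal sketch of how a skewed agent clock could make a nodata-style check fire even though data keeps arriving. This is a simplified model and not Zabbix's actual implementation; the function name `nodata_fires` and the sample numbers are assumptions made for illustration only.

```python
def nodata_fires(value_timestamps, server_now, period):
    """Simplified model: fire when no value carries a timestamp that falls
    within the last `period` seconds of the server's clock."""
    return not any(ts >= server_now - period for ts in value_timestamps)


server_now = 1_545_088_630   # server clock, Unix seconds (illustrative)
period = 600                 # a 10-minute nodata window

# Agent clock in sync: the latest value is stamped 30 seconds ago.
in_sync = [server_now - 30]

# Agent clock 15 minutes behind: the same value is stamped 930 seconds ago,
# so it already looks older than the window when it arrives.
skewed = [server_now - 30 - 900]

print(nodata_fires(in_sync, server_now, period))  # False
print(nodata_fires(skewed, server_now, period))   # True
```

Under this model, any clock offset larger than the trigger period is enough to keep the alert firing permanently.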

That aside, even with that weird time thing, I'm getting a lot of nodata alerts even on systems with well-aligned times. I've raised the debug level on the agents, and I can see they're all sending data fine. I'd have suspected a database processing issue, but all of my Zabbix performance values are near idle. Netdata on the Zabbix host itself shows the host's disks working fine, not overworked.

I'm at a loss where to turn to next as well.

**gjacko197** · 22-11-2018, 12:54

Got the same issue with a few hosts. Some had the time issue, but I've had to disable some hosts because, no matter what, their data is not processed quickly enough and they just show as down when they are not.

The majority of hosts are processed within 30 seconds, up to a maximum of 1 minute, but these problem hosts either take longer than the 10-minute trigger period I have set or are not processed at all.

All was fine before upgrading to version 4.0

**dimir** · 22-11-2018, 13:07

Since 4.0 the user is responsible for keeping agent/proxy/server time in sync. Adjusting the timestamps according to the time difference between server and proxy was removed in 4.0:

9 Upgrade notes for 4.0.0

https://www.zabbix.com/documentation/4.0/manual/installation/upgrade_notes_400#timestamp_correction

Issue where it happened: https://support.zabbix.com/browse/ZBX-12957

**semiraue** · 27-11-2018, 08:12

Confirmed! The issue is with the new Zabbix version 4.1. I downgraded to Zabbix server version 3.4 and the active checks are working perfectly now. There are no delayed hosts on the queue tab anymore and no false alerts.

dimir, it's not clear what you mean by "user is responsible for keeping agent/proxy/server time in sync". Does it mean all the Zabbix agents and the server should be in the same time zone and have the same time? Or do I have to manually create an item to get the host time and compare the host-unreachable trigger with it? Please explain.

**dimir** · 27-11-2018, 13:00

Quoting the upgrade notes of 4.0:

Timestamp correction

Zabbix server will no longer correct timestamps in cases when Zabbix proxy time differs from Zabbix server time.

Before 4.0, server/proxy adjusted the value timestamps by the difference in time between client and server (agent-proxy, agent-server, proxy-server, whatever). This was done by comparing the timestamp from the packet with the current timestamp on the receiving side. I suggest enabling DebugLevel=4 for server/proxy and checking the log for the following string:

Code:

"delta time from json"

You can increase the log level temporarily for pollers in the case of passive checks, and for trappers in the case of active checks, e.g.

Code:

zabbix_server -Rlog_level_increase=poller
...check the log file...
zabbix_server -Rlog_level_decrease=poller

I'm only guessing that this could be causing your issues.
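The pre-4.0 adjustment described above can be sketched roughly as follows. This is only an illustration of the idea (compare the packet timestamp with the receiver's clock, then shift each value by that difference), not the actual server code; the function name `corrected_clock` and the sample values are hypothetical.

```python
def corrected_clock(value_ts, packet_ts, received_at):
    """Shift a value's timestamp by the sender/receiver clock difference.

    value_ts    -- timestamp the sender attached to the value
    packet_ts   -- timestamp the sender put on the packet as a whole
    received_at -- receiver's own clock when the packet arrived
    """
    delta = received_at - packet_ts   # roughly clock offset plus transit time
    return value_ts + delta


# Sender clock is 109 seconds behind the receiver (illustrative numbers).
packet_ts = 1_545_088_520
received_at = 1_545_088_629
value_ts = 1_545_088_490             # value collected 30 s before sending

print(corrected_clock(value_ts, packet_ts, received_at))  # 1545088599
```

With the correction removed in 4.0, the value would be stored with `value_ts` as sent, so any offset in the sender's clock carries straight into the stored timestamp.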

**mrogers-9898** · 18-12-2018, 01:58

This seems to be the root of my problems. I'm getting some odd results for the "delta time from json" test.

Some deltas are massive, some are negative:

timestamp from json 1545088520 seconds and 83703658 nanosecond, delta time from json 109 seconds and 344292014 nanosecond
timestamp from json 1545080585 seconds and 97322844 nanosecond, delta time from json 8044 seconds and 356013425 nanosecond
timestamp from json 1545088584 seconds and 245522331 nanosecond, delta time from json 45 seconds and 210932387 nanosecond
timestamp from json 1545088632 seconds and 347218905 nanosecond, delta time from json -2 seconds and -890282385 nanosecond
timestamp from json 1545088579 seconds and 708274800 nanosecond, delta time from json 49 seconds and 764608651 nanosecond
timestamp from json 1545088626 seconds and 726048142 nanosecond, delta time from json 2 seconds and 779601511 nanosecond
timestamp from json 1545088665 seconds and 826560300 nanosecond, delta time from json -36 seconds and -309845226 nanosecond
timestamp from json 1545088623 seconds and 329288575 nanosecond, delta time from json 6 seconds and 321831017 nanosecond
timestamp from json 1545088818 seconds and 554978700 nanosecond, delta time from json -188 seconds and -663221994 nanosecond
timestamp from json 1545088438 seconds and 444875024 nanosecond, delta time from json 191 seconds and 608670246 nanosecond
timestamp from json 1545088629 seconds and 847397923 nanosecond, delta time from json 0 seconds and 203353093 nanosecond
timestamp from json 1545088438 seconds and 42507158 nanosecond, delta time from json 192 seconds and 20089770 nanosecond
timestamp from json 1545088632 seconds and 315308900 nanosecond, delta time from json -2 seconds and -200238105 nanosecond
timestamp from json 1545088616 seconds and 184540200 nanosecond, delta time from json 13 seconds and 949654597 nanosecond
timestamp from json 1545088535 seconds and 961306674 nanosecond, delta time from json 94 seconds and 190971346 nanosecond
timestamp from json 1545088626 seconds and 110769595 nanosecond, delta time from json 4 seconds and 68602452 nanosecond

I've checked my agents, and their times are solid, they're not out of sync with the Zabbix server. Can you suggest where I can dig next?
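A small script can make log output like the above easier to scan. The sketch below parses the "delta time from json" lines and lists the entries whose delta exceeds a chosen limit. The regular expression is written against the lines quoted in this post only; the function names and the 60-second limit are assumptions, and other Zabbix versions may log a different format.

```python
import re

# Pattern matched against the log lines quoted in this thread.
LINE_RE = re.compile(
    r"timestamp from json (\d+) seconds and (\d+) nanosecond, "
    r"delta time from json (-?\d+) seconds and (-?\d+) nanosecond"
)


def parse_deltas(lines):
    """Return (packet_timestamp_seconds, delta_seconds) for each matching line."""
    results = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip unrelated log lines
        ts_sec, _ts_ns, d_sec, d_ns = (int(g) for g in match.groups())
        results.append((ts_sec, d_sec + d_ns / 1e9))
    return results


def outliers(deltas, limit=60.0):
    """Keep only entries whose absolute delta exceeds `limit` seconds."""
    return [(ts, d) for ts, d in deltas if abs(d) > limit]


sample = [
    "timestamp from json 1545088520 seconds and 83703658 nanosecond, delta time from json 109 seconds and 344292014 nanosecond",
    "timestamp from json 1545080585 seconds and 97322844 nanosecond, delta time from json 8044 seconds and 356013425 nanosecond",
    "timestamp from json 1545088632 seconds and 347218905 nanosecond, delta time from json -2 seconds and -890282385 nanosecond",
]

for ts, delta in outliers(parse_deltas(sample)):
    print(f"{ts}: {delta:+.3f} s")
```

Adding each delta to its timestamp in the lines quoted above gives a receive time between roughly 1545088629 and 1545088630 in every case, so these entries were all logged within about a second of each other and the spread is between senders rather than over time.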

**mrogers-9898** · 18-12-2018, 02:05

More weird.

I have an item that checks the local time on the agent every 30 seconds. When I check it, the returned value itself is spot on (+/-30 sec, as expected), but the item's "Last checked time" in Zabbix is 5 minutes out.

Attached Files

**dimir** · 18-12-2018, 13:36

Simple solution: set up time synchronization (NTPD) on every host involved: server, proxies, agents.

**mrogers-9898** · 19-12-2018, 00:32

The suggestion here is that the time is out of sync? I've solved that problem - server and agents are in sync.

I must have some other, additional, issue here.

If my items are being processed slowly on my server, would that cause these kinds of troubles?

I see I have a big queue, 6k items. If I pick an agent out of the queue and check it out, nothing jumps out as a problem. The agent is sending data promptly. My server seems to not be having any trouble with the load - what causes items to get stuck in the queue if the agent is (seemingly) sending them on time?

Attached Files

**dimir** · 19-12-2018, 12:53

From your screenshot it's visible that there is an issue with the housekeeper. For example, in the first spike it hit 100% busy and stayed that way for longer than an hour. During that time you will have issues, and that's probably why there are lots of items queued. If you have lots of data, database partitioning is suggested; otherwise it's not possible to fix issues like that.
There are some articles on zabbix.org and here:

https://share.zabbix.com/cat-app/zabbix-partition

Also it was already discussed here, e.g.

https://www.zabbix.com/forum/zabbix-suggestions-and-feedback/51630-zabbix-mysql-partitioning

**mrogers-9898** · 19-12-2018, 22:28

Yep, there was definitely a big blip there. That was very likely due to me shortening a huge number of item intervals and retention periods, in order to try to reduce load and see if that helped the queue.

This 7 day graph shows a much calmer housekeeper.

I don't think my install is a particularly big one, and my hardware is in the "ok" range. My DB is 60 GB, and the disks are 15k SAS in RAID10.

Or do I have my wires crossed and it's still a housekeeper problem?

Attached Files

**dimir** · 20-12-2018, 14:09

Can you tell what types of items stay in the queue for a longer time?

**mrogers-9898** · 21-12-2018, 01:41

Hi Dimir,

They look to be all types: strings, ints. Data on services, data on disks.

Attached Files

**mrogers-9898** · 21-12-2018, 02:57

I may have had my context off there - these are all active items.

Active agent items delayed - false alerts (Host unreachable)
