Using Zabbix 4.4 with a proxy.
I'm working on a custom LLD for Nokia (formerly Alcatel-Lucent) 7750SR devices: specifically, I need to discover SAPs, which are not discovered through the IF-MIB OIDs.
I'm currently using the "External Check" approach, letting a script of mine do the SNMP work. The script returns this kind of output, which is exactly what I need:
[
  {
    "{#SVCID}":"8",
    "{#SAPDESCR}":"Link 1x10Gb locale v/o ac6-bo2 teng0/2/0/0",
    "{#SNMPINDEX}":"8.35717120.8"
  },
  {
    "{#SVCID}":"8",
    "{#SAPDESCR}":"Link 1x10Gb locale v/o sw7750-bsw2-bo2 port 1/1/4",
    "{#SNMPINDEX}":"8.35782656.8"
  }
]
And Zabbix is able to generate items and graphs from the discovery rule I created on my template, great! But on some devices I cannot make it work because the script hits the "timeout": these devices are heavily loaded, with hundreds of SAPs to discover, and the script can take up to two minutes to print them all. As I understand it, the only way around this would be to modify the source code, recompile, and raise the timeout to 2 minutes (I currently have a 20-second timeout in zabbix_proxy.conf).
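To check outside Zabbix whether a discovery script fits within the configured timeout, a small timing harness can help. This is only a sketch: the function name `run_with_timeout` is made up for illustration, the 20-second default simply mirrors the timeout value mentioned above, and it does not reproduce how Zabbix itself executes external checks.

```python
import subprocess
import time


def run_with_timeout(cmd, timeout_s=20.0):
    """Run cmd (a list of arguments) with a time limit.

    Returns (finished, elapsed_seconds, stdout). finished is False when the
    command ran longer than timeout_s and was killed.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        return True, time.monotonic() - start, result.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False, time.monotonic() - start, ""
```

It would be called with the discovery script and its arguments as a list, for example the script path followed by the node name.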
But I didn't want to mess with that, so I tried a different approach:
. use a crontab job to call a script that discovers all the SAPs of all my heavily loaded devices; the script writes the output to a local file named "<nodename>.data"
. create a second script that runs 'cat' on the "<nodename>.data" file
. use the 'cat' script as the External Check to discover SAPs and create items and graphs
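The cache-and-read steps above can be sketched as follows. This is an illustrative sketch, not the actual scripts: `write_cache`, `read_cache` and the `cache_dir` argument are hypothetical names, and the permission change assumes the cron job and the proxy may run as different users. The file is written to a temporary name and then renamed, so the reading side never sees a half-written file.

```python
import json
import os
import tempfile


def write_cache(nodename, entries, cache_dir):
    """Write the LLD entries for one node to '<nodename>.data' atomically."""
    target = os.path.join(cache_dir, nodename + ".data")
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a concurrent reader never sees partial output.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(entries, handle)
        # mkstemp creates the file readable by its owner only; widen it so a
        # different user (assumption: the proxy user) can read the cache.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return target


def read_cache(nodename, cache_dir):
    """Return the cached discovery output, as the 'cat' script would print it."""
    path = os.path.join(cache_dir, nodename + ".data")
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

The cron side would call `write_cache` once per device, and the External Check side would only print what `read_cache` returns.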
This method produces the same output as the one above, but it takes much less time to get the data: 7 seconds vs 2 minutes! So I expected it to work, but it doesn't. Or rather, it works great for devices with fewer SAPs, but it still doesn't work for those with a lot of SAPs, and I'm at a dead end now because I don't get any error message. The discovery rule is in the "Supported" state, but no items are created. Is there a limit on the amount of data an External Check script can return? I read that UserParameter has a 512 kB data limit on proxies, but I couldn't find such a limit in the documentation for External Checks.
I tried increasing the log level, but the log becomes unreadable, with too many lines scrolling by. How can I troubleshoot this further? Please help!
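One thing that can be checked without touching the Zabbix logs is the cached output itself. The sketch below is a hypothetical helper (`inspect_lld_payload` is not part of Zabbix) that reports the payload size in bytes, the number of discovered entries, and any repeated `{#SNMPINDEX}` values, and fails loudly if the text is not valid JSON. Whether size or duplicates are actually the cause here is an open question; this only rules things in or out.

```python
import json
from collections import Counter


def inspect_lld_payload(text, key_macro="{#SNMPINDEX}"):
    """Summarise an LLD JSON payload: size, entry count, duplicate keys.

    Raises ValueError (json.JSONDecodeError is a subclass) on invalid JSON.
    """
    data = json.loads(text)
    # Accept both a bare array and the older {"data": [...]} wrapper.
    entries = data["data"] if isinstance(data, dict) else data
    counts = Counter(entry.get(key_macro) for entry in entries)
    duplicates = sorted(
        key for key, n in counts.items() if n > 1 and key is not None
    )
    return {
        "bytes": len(text.encode("utf-8")),
        "entries": len(entries),
        "duplicates": duplicates,
    }
```

Running it on the "<nodename>.data" file of a device that works and of one that does not gives a quick side-by-side comparison.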