**sancho** · 19-10-2019, 20:25

Hello again Kos

First of all, thanks for your help.

I have tested the item as indicated, with the name of the subsystem.

proc.num[,,,GRPALM04] (I changed the subsystem because 03 had no jobs in the queue today).

The value it returns is 1, although this queue currently holds 6 jobs. I think it returns 1 because it is detecting the name of the subsystem to which this queue belongs.

If the item is set up with GRPALM03, it also gives me 1, although its job queue is empty.

**sancho** · 23-02-2020, 11:03

Hi Kos

I hope everything goes well.

I have edited my message to try to explain myself better.

The idea is to use a discovery rule to obtain the users that have jobs in, for example, RUN or LCKW status.

Would this be possible?

Sorry if I can't explain myself better.

**guille_pm** · 14-04-2020, 01:27

Hi Kos

I'm doing some tests with Zabbix and some internal IBM i systems, so far so good! Thanks for all your hard work.

I'm having a problem with one of the items, as400.outputqueue.size[QPRINT,QGPL]. The agent is writing the following to the log:

32:20200413:034047.315 active check "as400.outputqueue.size[QPRINT,QGPL]" is not supported: com.ibm.as400.access.AS400Exception: CPF34C4 List is too large for user space QNPSLIST.
32:20200413:035154.419 As400Metric.process() error: com.ibm.as400.access.AS400Exception: CPF34C4 List is too large for user space QNPSLIST.
32:20200413:040656.018 As400Metric.process() error: com.ibm.as400.access.AS400Exception: CPF34C4 List is too large for user space QNPSLIST.

QGPL/QPRINT currently has 45033 spooled files.

**Kos** · 14-04-2020, 11:08

Hi guille_pm,

Unfortunately, I cannot help in your situation.
Our Zabbix agent emulator simply relays the error message from the operating system when an error occurs.
I do not explicitly use any user space or IBM i API. For this item the agent uses the standard Java class SpooledFileList (a wrapper over the IBM i API): it sets two filters, by user (*ALL) and by library/queue, with setUserFilter() and setQueueFilter(), then calls openSynchronously() to get the size of the result. The error probably occurs in that last step (the openSynchronously() call), but that is out of my control :-(

Is it possible to reduce the size of this queue? In our environment we have queues with a bit more than 10000 entries, and it works without problems.

**guille_pm** · 14-04-2020, 15:57

Yes, I can probably clean it up a bit, but that won't work for production systems, where the count can go up to 500k. That's fine though, I can do a database monitor check for output queues.
By the way, I just want to share an enhancement IBM has been working on for IBM i. Starting with V7R2, they have implemented what they call IBM i Services. These are a set of SQL views and UDTFs that return system data pretty easily, without the need to call APIs. So, for the spooled files in an output queue, I can do a SELECT NUMBER_OF_FILES FROM QSYS2.OUTPUT_QUEUE_INFO WHERE OUTPUT_QUEUE_LIBRARY_NAME = 'your output queue library' AND OUTPUT_QUEUE_NAME = 'your output queue name'.
It may make some checks easier to do.
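
For anyone who wants to script that check, here is a minimal Python sketch of the same query with bound parameters. It assumes a DB-API connection that accepts `?` placeholders (an ODBC connection to the IBM i would be the real target). The column names are written as I understand the IBM i Services view, so please verify them against the IBM documentation for your release. The function names and the `demo_connection()` helper are hypothetical; the helper only builds a local SQLite stand-in so the sketch can be tried without an IBM i.

```python
import sqlite3  # used only for the local stand-in below, not for IBM i

# Same query as in the post, with bound parameters instead of literals.
# Column names are an assumption: check them against the IBM i Services docs.
QUERY = (
    "SELECT NUMBER_OF_FILES "
    "FROM QSYS2.OUTPUT_QUEUE_INFO "
    "WHERE OUTPUT_QUEUE_LIBRARY_NAME = ? AND OUTPUT_QUEUE_NAME = ?"
)


def output_queue_size(conn, library, queue):
    """Return the spooled-file count for library/queue, or None if no row matches."""
    cur = conn.cursor()
    cur.execute(QUERY, (library, queue))
    row = cur.fetchone()
    return None if row is None else int(row[0])


def demo_connection():
    """Build an in-memory SQLite stand-in that mimics the view (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS QSYS2")
    conn.execute(
        "CREATE TABLE QSYS2.OUTPUT_QUEUE_INFO ("
        "OUTPUT_QUEUE_LIBRARY_NAME TEXT, "
        "OUTPUT_QUEUE_NAME TEXT, "
        "NUMBER_OF_FILES INTEGER)"
    )
    conn.execute(
        "INSERT INTO QSYS2.OUTPUT_QUEUE_INFO VALUES ('QGPL', 'QPRINT', 45033)"
    )
    return conn
```

With the stand-in, `output_queue_size(demo_connection(), "QGPL", "QPRINT")` reads back the row inserted by the helper. Against a real system only the connection object changes.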

**guille_pm** · 14-04-2020, 16:30

Hi Kos
Another quick question. Is there any way to get the fully qualified job name, like NUMBER/USER/JOBNAME, from eventlog? I used the ITEM.LOG.SOURCE macro, but it just returns the job name.
Thanks!

**Kos** · 15-04-2020, 15:17

Hi, guille_pm!

Thank you for the information; it is probably reasonable to obtain the queue size some other way, and IBM i Services is one possible method.

Regarding the fully qualified job name from the event log: unfortunately, the direct answer is no, there is no such feature now.
However, as you wrote, you can obtain just the job name via the macro.
Additionally, you can use the "as400UserAsMessagePrefix=1" option in the agent's config file: it puts the job's user name as a prefix of every message (i.e. of its value), which can then be easily extracted with a regular expression and macro functions.
Unfortunately, it is impossible to get the job number in the current implementation; however, it would not be very hard to add this feature (using the same method as for the job's user name, i.e. as an additional prefix to the message body). There was just no such need up to now :-)

**Fabian** · 27-04-2020, 11:48

Good morning.

Thank you for this agent and for keeping this thread alive; I found it while researching an error we are having.

Our agent responds to the passive checks, but not to the active checks.

Our AS400 agent is behind a Zabbix proxy. Both server and proxy are on version 4.2.8.

We have activated the debug feature. When we start the agent, we obtain the following logs:

Code:

   34:20200422:101158.448  Agent hostname: 'CLIENT_APMTEST', System info: IBM OS/400 APMTEST V7R2M0, IBM Corporation IBM J9 VM (v1.8.0_201)
    39:20200422:101158.479 agent #1 started [collector]
    40:20200422:101158.521 agent #2 (10.129.240.13:10051) started [active checks #3]
    42:20200422:101158.545 agent #3 started[listener #1]
    44:20200422:101158.553 agent #5 started[listener #3]
    43:20200422:101158.556 agent #4 started[listener #2]
    34:20200422:101801.885 Starting Zabbix Agent v0.7.7
    34:20200422:101801.892 using configuration file: /home/ZABBIX/agentd/zabbix_agentd.conf
    34:20200422:101801.893 agent #-1 started [ZabbixAgent config]
    34:20200422:101801.893 IBM Corporation Java version "1.8.0_201"
    34:20200422:101801.894  Java(TM) SE Runtime Environment (build 8.0.5.30 - pap3280sr5fp30-20190207_01(SR5 FP30))
    34:20200422:101801.894  IBM J9 VM (build 2.9, JRE 1.8.0 OS/400 ppc-32-Bit 20190124_408237 (JIT enabled, AOT enabled)
OpenJ9   - 9c77d86
OMR      - dad8ba7
IBM      - e2996d1)
    34:20200422:101801.913  Open Source Software, JTOpen 9.4, codebase 5770-SS1 V7R3M0.00 built=20170816 @U4
    34:20200422:101801.922 ZbxMetric() constructor: metric 'log' successfully added
    34:20200422:101801.923 ZbxMetric() constructor: metric 'logrt' successfully added
    34:20200422:101801.923 ZbxMetric() constructor: metric 'eventlog' successfully added
    34:20200422:101801.925 ZbxMetric() constructor: metric 'agent.exit' successfully added
    34:20200422:101801.926 ZbxMetric() constructor: metric 'agent.hostname' successfully added
    34:20200422:101801.927 ZbxMetric() constructor: metric 'agent.ping' successfully added
    34:20200422:101801.929 ZbxMetric() constructor: metric 'agent.version' successfully added
    34:20200422:101801.938 ZbxMetric() constructor: metric 'system.hostname' successfully added
    34:20200422:101801.940 ZbxMetric() constructor: metric 'system.uname' successfully added
    34:20200422:101801.943 ZbxMetric() constructor: metric 'system.localtime' successfully added
    34:20200422:101801.944 ZbxMetric() constructor: metric 'system.cpu.num' successfully added
    34:20200422:101801.945 ZbxMetric() constructor: metric 'as400.cpu.capacity' successfully added
    34:20200422:101801.946 ZbxMetric() constructor: metric 'system.users.num' successfully added
    34:20200422:101801.947 ZbxMetric() constructor: metric 'proc.num' successfully added
    34:20200422:101801.949 ZbxMetric() constructor: metric 'as400.subsystem' successfully added
    34:20200422:101801.950 ZbxMetric() constructor: metric 'as400.outputqueue.size' successfully added
    34:20200422:101801.951 ZbxMetric() constructor: metric 'as400.services' successfully added
    34:20200422:101801.952 ZbxMetric() constructor: metric 'vfs.fs.discovery' successfully added
    34:20200422:101801.953 ZbxMetric() constructor: metric 'vfs.fs.size' successfully added
    34:20200422:101801.954 ZbxMetric() constructor: metric 'vfs.fs.state' successfully added
    34:20200422:101801.955 ZbxMetric() constructor: metric 'as400.disk.discovery' successfully added
    34:20200422:101801.956 ZbxMetric() constructor: metric 'as400.disk.size' successfully added
    34:20200422:101801.957 ZbxMetric() constructor: metric 'as400.disk.state' successfully added
    34:20200422:101801.958 ZbxMetric() constructor: metric 'as400.disk.asp' successfully added
    34:20200422:101801.959 ZbxMetric() constructor: metric 'as400.systemPool.discovery' successfully added
    34:20200422:101801.961 ZbxMetric() constructor: metric 'as400.systemPool.state' successfully added
    34:20200422:101801.962 ZbxMetric() constructor: metric 'proc.cpu.util.discovery' successfully added
    34:20200422:101801.963 ZbxMetric() constructor: metric 'proc.cpu.util' successfully added
    34:20200422:101802.011 constuctor AgentRequest(): str='system.uname', key_name='system.uname'
    34:20200422:101802.011  parameters list: null
    34:20200422:101802.012 in ZabbixAgent.process(): key_name='system.uname', full key='system.uname'
    34:20200422:101802.959 As400Metric.process() is OK for system.uname: 'IBM OS/400 APMTEST V7R2M0, IBM Corporation IBM J9 VM (v1.8.0_201)'
    34:20200422:101802.962 end of ZabbixAgent.process()
    34:20200422:101802.963  Agent hostname: 'CLIENT_APMTEST', System info: IBM OS/400 APMTEST V7R2M0, IBM Corporation IBM J9 VM (v1.8.0_201)
    34:20200422:101802.985 agent #-1 stopped [ZabbixAgent config]
    38:20200422:101802.994 agent #0 started [Cache Controller thread]
    39:20200422:101802.996 agent #1 started [collector]
    40:20200422:101803.008 agent #2 (10.129.240.13:10051) started [active checks #3]
     1:20200422:101803.009 PassiveCheck.init(): starting
    40:20200422:101803.010 in refreshActiveChecks(): host:10.129.240.13, port:10051
    42:20200422:101803.012 agent #3 started[listener #1]
    43:20200422:101803.019 agent #4 started[listener #2]
    44:20200422:101803.028 agent #5 started[listener #3]
    40:20200422:101803.036 in send() to server '10.129.240.13:10051'
    39:20200422:101803.445  Procstat.updateJobinfoList() error: com.ibm.as400.access.AS400Exception: CPF3C53 No encontrado trabajo 979274/QUSER/QZRCSRVS.
    40:20200422:101806.067 End of send()
    40:20200422:101806.067  active check configuration update from [10.129.240.13:10051] started to fail (java.net.SocketTimeoutException: connect timed out)
    40:20200422:101806.068 end of refreshActiveChecks(): false
    40:20200422:101806.068 in processActiveChecks() server:'10.129.240.13' port:10051
    40:20200422:101806.069 End of processActiveChecks()
     1:20200422:101848.585  Thread [main]: got incoming connection from 10.129.240.13
     1:20200422:101848.585  connection will be processed by thread 44[listener #3]
    44:20200422:101848.586 in PassiveCheck.checkConnection()
    44:20200422:101848.587 end of PassiveCheck.checkConnection(): true
    44:20200422:101848.587 in PassiveCheck.process(), #3
    44:20200422:101848.587 ZBXD header is OK, data length=10
    44:20200422:101848.588 PassiveCheck.process(): request is: 'agent.ping'
    44:20200422:101848.589 constuctor AgentRequest(): str='agent.ping', key_name='agent.ping'
    44:20200422:101848.589  parameters list: null
    44:20200422:101848.589 in ZabbixAgent.process(): key_name='agent.ping', full key='agent.ping'
    44:20200422:101848.590 GenericMetric.process() is OK for agent.ping
    44:20200422:101848.590 end of ZabbixAgent.process()
    44:20200422:101848.591 PassiveCheck.process(): sending result: '1'

Could you direct us as to where we could find the solution to this error? I have found some references to CPF3C53 in this thread, but not a conclusive solution.

Thanks in advance.

**guille_pm** · 06-05-2020, 17:03

About your

active check configuration update from [10.129.240.13:10051] started to fail

I saw something similar, but with one of our VIO servers. I think it means that the timeout value was reached before the check finished. Check the timeout value in your agent config file, and also the timeout value in your server config file. The server timeout should always be bigger than the agent timeout. I ended up putting my agent at 7 and server at 9.
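
As a rough illustration of those two values, and assuming the AS/400 agent honours the standard Zabbix `Timeout` parameter the same way the stock agent does, the settings would look something like this:

```ini
# zabbix_agentd.conf on the monitored host
Timeout=7

# zabbix_server.conf (or zabbix_proxy.conf if the agent reports to a proxy)
Timeout=9
```

Both values are in seconds, and the server or proxy has to be restarted for its change to take effect.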

**Kos** · 06-05-2020, 17:48

Fabian, sorry, I did not see your message in time.

The CPF3C53 message in the Collector thread can be safely ignored: it only means that some job disappeared while the job list was being processed.

More important is the message noted by guille_pm: it really means that the agent could not connect to the Zabbix Server (or Proxy, in your case).
Please check that no firewall restrictions prevent your Zabbix Proxy (10.129.240.13) from accepting TCP connections to port 10051 from your CLIENT_APMTEST host.
The log file shows that there are no problems with communication in the reverse direction (10.129.240.13 -> CLIENT_APMTEST:10050 is OK).
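
A quick way to test that path from any machine on the agent's side of the firewall is a plain TCP connect with a short timeout. This is only a sketch using the Python standard library; the function name `can_connect` is made up for the example, and the address and port are the ones from the log above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts alike.
        return False


# Example call for the case in this thread (run it from the agent's network):
#   can_connect("10.129.240.13", 10051)
```

A result of False here matches the "connect timed out" seen in the agent log, and points at the network or firewall rather than at the agent itself.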

**Fabian** · 11-05-2020, 12:50

Thanks a lot to you both. That was the problem: iptables was misconfigured on our Zabbix proxy. It works perfectly now.

**guille_pm** · 29-05-2020, 01:05

Hi! Just wanted to share something, and also ask something...

So, I've been working with a friend on a way to get the job names of jobs in MSGW. We've come up with something that works pretty well.
We recommend IBM i 7.3 for it.
For this, you first need to create a web services server on your IBM i. Open a browser and go to http (or https if you've secured it)://yourserver:2001/HTTPAdmin, then click Create a Web Services Server on the left. Follow the instructions (make sure to select an unused port), and once it is created and showing on your screen, select it and click Deploy. Select REST as the type and *SQL as the implementation, then fill in the prompts as described below.

Procedure name:	Get_Job_Status
SQL Statement:	SELECT job_name as fulljob, substr(job_name, 1, LOCATE('/',job_name)-1) as numjob, substr(job_name, LOCATE('/',job_name)+1, LOCATE('/',job_name, 8) -locate('/', job_name)-1) as user, substr(job_name, LOCATE('/', job_name,LOCATE('/',job_name)+1)+1) as jobname, subsystem as sbs FROM table(active_job_info()) where job_status = ?
HTTP request method:	GET
URI path template for method:	/{job_status}
SQL result type:	Multi-row result set
Trim mode for output fields:	Trailing
SQL state information in response:	On errors
Treat warnings as SQL Errors:	Yes
User-defined error message:
HTTP status code on SQL success:	200
HTTP status code on SQL failure:	500
HTTP header information:
Allowed input media types:	*ALL
Returned output media types:	*JSON

Input parameter mappings:

job_status	VARCHAR	*PATH_PARAM	job_status	*NONE

Once done, start your web services server, and go to http://yourhost:yourport/web/service...ix_Checks/MSGW

If you open that in Firefox, you'll see nicely formatted JSON. With that done, you only need to create a discovery rule of type HTTP agent. Make sure to create a preprocessing step of type JSONPath, and then the macros to hold the values from the returned JSON.
A trigger can then be created using, for instance, proc.num for the job: if the agent returns 1, you have your problem.
Since MSGW is a variable in the path, you could really use it for any other status, like LCKW.
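
In case it helps to see what the SUBSTR/LOCATE part of the SQL statement is doing, here is the same split of a qualified job name written as a small Python sketch. The function name is made up for the example; the dictionary keys simply mirror the column aliases used in the statement above.

```python
def split_job_name(qualified):
    """Split 'NUMBER/USER/JOBNAME' into the parts the SQL statement returns.

    Raises ValueError if the name does not contain three '/'-separated parts.
    """
    number, user, name = qualified.split("/", 2)
    return {
        "fulljob": qualified,  # job_name as returned by active_job_info()
        "numjob": number,      # text before the first '/'
        "user": user,          # text between the first and second '/'
        "jobname": name,       # text after the second '/'
    }
```

For a job such as 979274/QUSER/QZRCSRVS this yields the number, user and job name as separate values, which is what the discovery macros end up holding.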

Now to the question part, I'm trying to do some filtering on the eventlog metric. I tried using the logseverity function, but then realized that most messages come with "Unknown" severity, or even no value at all. Kos, is there any way to include the severity in the values returned?

Thanks a lot!

**Kos** · 29-05-2020, 09:06

guille_pm, first of all, many thanks, a very interesting approach!

Originally posted by guille_pm

Now to the question part, I'm trying to do some filtering on the eventlog metric. I tried using the logseverity function, but then realized that most messages come with "Unknown" severity, or even no value at all. Kos, is there any way to include the severity in the values returned?

I'd like to clarify this point.
The original purpose of this attribute in the Zabbix database was to store the severity of messages received from the Windows Event Log. The field has a numeric type in the Zabbix internal database, but for readability it is converted to text for the standard Windows Event Log grades, as described for the "logseverity()" trigger function (see the documentation). My Zabbix "agent" for AS/400 just transmits the original severity level from the AS/400 (IBM i) message queue "as is", without any transformation. Accordingly, it is also stored in the Zabbix database in its original form (as a number). However, this number most often differs from the values used for the standard Windows Event Log levels (1 - 10), so the Zabbix web interface displays it as "Unknown". The same is probably true if you use the "{ITEM.LOG.SEVERITY<N>}" macro in notification templates: it will resolve to "UNKNOWN" (or something similar). However, you can use the "{ITEM.LOG.NSEVERITY<N>}" macro (with the letter "N" before "SEVERITY"): it substitutes the original numeric value. The "logseverity()" trigger function (which returns a number) should also work correctly.

**guille_pm** · 31-05-2020, 00:02

Thanks a lot! I tested it and it works like a charm; now I get sev10 messages as warning and the rest as average.

I should have tested it before asking, sorry.

**bigredau15** · 21-08-2020, 09:09

Hi,

We've recently implemented the AS400 / IBM i Zabbix agent, and it's working fine. However, in the Zabbix frontend under Problems we've got the alert:
'ASP1 used more than 65%'

How do we modify that percentage value? When we click to modify the Trigger, the expression shows:
{LPAR1:vfs.fs.size[1,pused].min(#2)}>{$MAX_DISK_PUSED:"1"} and {LPAR1:vfs.fs.state[1].count(#2,1)}=0 and {LPAR1:vfs.fs.state[1].count(#2,2)}=0

Where do we set the percentage to be, say 80%?

Thanks for your time.

AS/400 Monitoring solutions
