AS/400 Monitoring solutions

  • Kos
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Aug 2015
    • 3404

    #151
    Hi guys,

    thank you for the feedback.

    A few comments:

    About operational data and custom intervals: all of this is Zabbix Server functionality and does not depend on Zabbix Agent features. Unfortunately, I have no control over that.
    My Zabbix Agent does not support scheduled or custom intervals in active mode (as the new Agent2 does), but in passive mode they are handled entirely by Zabbix Server.
    One additional note: custom or scheduled intervals do not cancel the "standard" interval, so the usual "best practice" is to set the "standard" interval to zero when you use another type of scheduling.

    I'm afraid I do not quite understand the problem with subsystem monitoring. Do you monitor it using the as400.subsystem[...] metric? Do you receive the correct data in the Latest data screen?
    What is the problem: incorrect data from the Agent, or the trigger firing/recovering incorrectly? Could you please provide a bit more detail and your trigger formula?

    Regarding the errors "MCH2804 not supervised for QGYOLMSG": this looks strange to me; probably there are some factors that I did not take into account.
    camiespico, hab, could you please give me more details? Where do you see these errors: in the Agent log file, in the job log, in a message queue, in the Zabbix web interface, or somewhere else?
    Is the Agent log (usually "zabbix_agentd.log") clean, or does it contain warnings/errors?
    Does restarting the Zabbix Agent help in this situation? What happens if you clone the problem item (for example, using a different number in the "maxlines" field)?

    Regarding future plans: I'm not planning any serious changes. The only plan is to publish v0.7.8, which has a very small bug fix and 2 small new features:
    • the possibility to monitor the History Log (in the same manner as a Message Queue);
    • an additional configuration parameter "as400JobAsMessagePrefix" that allows the job name (in the format NUMBER/USER/JOBNAME) to be used as a prefix for the message text.
    Unfortunately, there are some problems with the share.zabbix.com portal at the moment (Google credentials are not working, including mine), but I hope this problem will be solved soon.


    • hab
      Junior Member
      • Apr 2018
      • 5

      #152
      Originally posted by Kos
      Where do you see these errors: in the Agent log file, in the job log, in a message queue, in the Zabbix web interface, or somewhere else?
      Is the Agent log (usually "zabbix_agentd.log") clean, or does it contain warnings/errors?
      Does restarting the Zabbix Agent help in this situation? What happens if you clone the problem item (for example, using a different number in the "maxlines" field)?
      I see it under Items in the "Info" column (as the reason why the item is unsupported). I do not see that message in the job logs etc. I think the unsupported status on this partition is quite old.
      A restart of the agent does not make it supported again. Perhaps I should try an unlink+delete/re-link of the template where this is configured.
      I'm having a few days off so I will look into it some more next week.


      • Kos
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Aug 2015
        • 3404

        #153
        hab, if you are able to reproduce this problem, could you please perform the following steps:
        • stop the Zabbix Agent for a moment;
        • edit the Zabbix Agent's config file (zabbix_agentd.conf) and set the parameter "DebugLevel=4";
        • delete the Agent's old log file;
        • start the Agent;
        • wait for the problem to occur;
        • stop the Agent, restore the old config and start it again.
        Please send the collected log file(s) to my e-mail address mentioned in the README.
        Please also describe exactly which item has the problem (with its key), ideally with a screenshot of the Zabbix screen showing the error message.
        Thanks in advance.
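
The config-editing part of these steps could be scripted. This is only a minimal sketch, assuming a POSIX shell with `cp`, `grep` and `sed` is available where the agent's config file lives. The function names are hypothetical, and stopping or starting the agent is left out because it depends on how the agent was installed.

```shell
#!/bin/sh
# Set DebugLevel in a Zabbix agent config file, keeping a backup copy
# next to it as <config-file>.bak.
# Usage: set_debug_level <config-file> <level>
set_debug_level() {
    conf="$1"
    level="$2"
    cp "$conf" "$conf.bak" || return 1
    if grep -q '^[[:space:]]*DebugLevel=' "$conf"; then
        # An active DebugLevel line exists: rewrite it from the backup.
        sed "s/^[[:space:]]*DebugLevel=.*/DebugLevel=$level/" "$conf.bak" > "$conf"
    else
        # No active setting (absent or commented out): append one.
        echo "DebugLevel=$level" >> "$conf"
    fi
}

# Put back the config saved by set_debug_level.
# Usage: restore_config <config-file>
restore_config() {
    mv "$1.bak" "$1"
}
```

A typical run would be `set_debug_level zabbix_agentd.conf 4` before starting the agent, and `restore_config zabbix_agentd.conf` once the log has been collected.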


        • bobf
          Junior Member
          • Feb 2021
          • 6

          #154
          Kos Thanks for the details. I think I was trying to use active mode but have since changed to passive. Yes I was using the as400.subsystem function and I had a zero in the standard interval and 5 minutes (5m) in the custom interval. I think it may have been me using active mode that caused the issues with so many items being recorded. Anyway I've got it working now with my revised method.

          One other thing: is there an equivalent of, or a way of doing something like, Zabbix sender? I mean a sort of command-line function, so that if I monitored other areas (e.g. audit journals) I could use it to send messages (or even "reports") back up to the server. Or would I just have to send "errors" to a message queue and have your agent handle it that way?

          Bob


          • Kos
            Senior Member
            Zabbix Certified Specialist, Zabbix Certified Professional
            • Aug 2015
            • 3404

            #155
            bobf, the zabbix_sender utility is part of the Zabbix Agent for various platforms; the version for AIX was binary compatible with AS400, at least for some versions (see here for links to the compatibility matrix).
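
For reference, a typical zabbix_sender call looks like the sketch below. The options `-z` (server), `-s` (host name as configured in Zabbix), `-k` (item key) and `-o` (value) are standard zabbix_sender options, and the target item has to be of type "Zabbix trapper". The server name, host name and item key here are hypothetical, and whether a given AIX binary actually runs on a particular IBM i release is not verified.

```shell
#!/bin/sh
# Send one value to a Zabbix trapper item via zabbix_sender.
# The default server name below is a placeholder (assumption).
ZBX_SERVER="${ZBX_SERVER:-zabbix.example.com}"

# Usage: send_value <host-name-in-zabbix> <item-key> <value>
send_value() {
    host="$1"
    key="$2"
    value="$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # DRY_RUN=1 only prints the command that would be run.
        echo "zabbix_sender -z $ZBX_SERVER -s $host -k $key -o $value"
    else
        zabbix_sender -z "$ZBX_SERVER" -s "$host" -k "$key" -o "$value"
    fi
}
```

For example, a job that scans audit journals could call `send_value MYAS400 audit.journal.alerts 3` to report a count, with `audit.journal.alerts` defined as a trapper item on the host `MYAS400`.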


            • bobf
              Junior Member
              • Feb 2021
              • 6

              #156
              Kos Excellent. I'll give it a try. Thanks


              • bobf
                Junior Member
                • Feb 2021
                • 6

                #157
                Just a question about qsysopr message monitoring.
                There's a trigger that looks like
                "{Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].nodata(7)}<>1 and ({Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].regexp("^CPPEA33 ")}=1 or ({Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].logsource("^#OVERRIDE")}=0 and {Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].regexp("^(CPI2401|CPA5201|BRM1472|BRM1392|CPA7025|CPI 096E |CPI5906|CPPEA33|CPF8198|CPF9E7E|CPF9E7D|CPF1393) ")}=0) )"
                and another
                "{Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].nodata(7)}<>1 and ({Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].regexp("^BRM1392 ")}=1 or {Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].logsource(QCLNSYSLOG)}=0 and {Base iSeries Template:eventlog[QSYSOPR,,70,,,100,skip].regexp("^CPA7025 ")}=1)"

                I'm struggling to understand how they work ie what it all means. I'm guessing the first is to cover critical errors (I'd have thought error severity 90 or over would do) and the second one is something like severity 70 - 90. What is logsource(QCLNSYSLOG) ??
                I really just want to show 90+ as Critical and 70-90 as Severe. Will the above do that?

                Sorry but it's somewhat over my head.


                • Kos
                  Senior Member
                  Zabbix Certified Specialist, Zabbix Certified Professional
                  • Aug 2015
                  • 3404

                  #158
                  Originally posted by bobf
                  I'm struggling to understand how they work ie what it all means. I'm guessing the first is to cover critical errors (I'd have thought error severity 90 or over would do) and the second one is something like severity 70 - 90. What is logsource(QCLNSYSLOG) ??
                  I really just want to show 90+ as Critical and 70-90 as Severe. Will the above do that?

                  Sorry but it's somewhat over my head.
                  I recently answered a similar question by e-mail, so I'll just publish my answer here:
                  This template example probably needs to be updated: it was created back for Zabbix 3.0; we currently use v5.0 (and the Zabbix team has v5.4 at the beta2 stage and promises to publish v6 this autumn).

                  It is also worth emphasizing once more that it is not a _recommended_ template but only an _example_ to demonstrate some ideas of how the supported metrics could be used.

                  Regarding these triggers: they do not use the logseverity() trigger function at all; all of them filter only by the EventID (included in the message body as a prefix) and Source (job name) fields. So the real severity could be anything from 70 upwards, since 70 is the threshold set in the item key parameters. We use these triggers in our environment based on our own experience. I.e., we know that messages with EventID="BRM1392" are only warnings (_for us_) in spite of their high Severity, while messages with EventID="CPPEA02" or "CPI0949" are critical even though their Severity is not very high.

                  In other words, the template triggers are examples only; you can (and probably should) modify them for your own needs.
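
To illustrate the idea of classifying by EventID prefix rather than by severity, here is a rough shell sketch. It is not how Zabbix evaluates triggers; it only mirrors the `regexp("^BRM1392 ")` style of matching on the start of the message text. The function name and the sample message text are made up, and the mapping of EventIDs to importance is the site-specific one described above.

```shell
#!/bin/sh
# Classify a message by its EventID prefix (the ID is followed by a space).
# Usage: classify_message "<message text>"
classify_message() {
    case "$1" in
        "CPPEA02 "*|"CPI0949 "*) echo "critical" ;;      # critical for that site
        "BRM1392 "*)             echo "warning" ;;       # high severity, but only a warning there
        *)                       echo "unclassified" ;;  # everything else
    esac
}

classify_message "BRM1392 sample message text"
```

A real trigger would express the same decision with `regexp()` conditions on the eventlog item, adjusted to the EventIDs that matter in your own environment.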


                  • dave_t
                    Junior Member
                    • Apr 2007
                    • 28

                    #159
                    For those of you who have struggled with monitoring message queues - other than the templated "QSYSOPR" message - probably because you're not that familiar with IBM-i or too lazy to RTFM (both of which were the case for me) I can tell you that it's dead-simple, but the documentation doesn't give you a good example for someone (like me) with no IBM-i knowledge.

                    When you look at the IBM-i from a hierarchical filesystem perspective, the message queues look like this:

                    /QSYS.LIB
                        /LIBRARY.LIB
                            /MESSAGEQUEUE.MSGQ

                    In the "eventlog[]" Zabbix item in the as400 Zabbix template, it looks like "/QSYS.LIB/" is implied.

                    ...so when you set up your eventlog item in Zabbix, after your IBM-i admin tells you that the queue you want to monitor is called something like "APPLICATION/MESSAGEQUEUE", you need to provide the actual IFS path, which in this example would be:

                    eventlog["/QSYS.LIB/APPLICATION.LIB/MESSAGEQUEUE.MSGQ",,,,]

                    This worked for me, and I hope this helps anyone else out who might have been struggling with this !
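
The mapping described above can be written as a small helper. This is only a sketch of the string conversion, assuming the queue name arrives in the "LIBRARY/QUEUE" form; the function name is hypothetical, and the trailing empty parameters are copied from the example key in this post.

```shell
#!/bin/sh
# Turn a "LIBRARY/QUEUE" name into an eventlog[...] item key that uses
# the IFS path of the message queue.
# Usage: msgq_item_key LIBRARY/QUEUE
msgq_item_key() {
    case "$1" in
        */*) ;;
        *) echo "expected LIBRARY/QUEUE" >&2; return 1 ;;
    esac
    lib="${1%%/*}"     # part before the first slash
    queue="${1#*/}"    # part after the first slash
    printf 'eventlog["/QSYS.LIB/%s.LIB/%s.MSGQ",,,,]\n' "$lib" "$queue"
}

msgq_item_key "APPLICATION/MESSAGEQUEUE"
```

The last line prints the same key as the example in this post.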


                    • Kos
                      Senior Member
                      Zabbix Certified Specialist, Zabbix Certified Professional
                      • Aug 2015
                      • 3404

                      #160
                      dave_t, thank you for sharing your experience.

                      By the way, I have some news.
                      The latest version at the moment is v0.7.8 (released more than 3 months ago). Unfortunately, due to technical problems with the site "share.zabbix.com" I'm unable to update any info there (I hope that the Zabbix team will finally be able to restore my access to this site). Meanwhile, this version is available only on my Google Drive (link), as usual as the archive file "as400.zip" (417778 bytes); you are welcome to try it.
                      This project will probably be migrated to GitHub in the future.

                      The main changes:
                      • the current job name (in the format "NUMBER/USER/JOBNAME") can be added to the message text as a prefix (configurable via an additional "as400JobAsMessagePrefix" parameter in the config file);
                      • the possibility to monitor the History Log: just leave the first parameter (Message Log name) of the "eventlog[...]" metric empty (see the updated documentation for details);
                      • some attempts to catch problems with hangs during the "as400.services" checks (some debugging added, these checks moved into a separate temporary thread, etc.); this is not described in detail in the documentation.
                      --
                      Constantin


                      • combisDF
                        Junior Member
                        • Sep 2021
                        • 1

                        #161
                        Hello everyone,

                        I have managed to install and run this emulator remotely and pretty much everything seems to be working fine. Zabbix frontend is collecting the statistics and visualizing them correctly.
                        But even after reading the readme file I still have some specific questions.
                        1. Is it possible to check if a particular job exists in the system?
                        (for example if there is a job named 237425/VBOWNER/AG_INQ_SRV - job_number/job_user/job_name, how to make such item?)
                        2. How to check job status?
                        (for example if a job named AG_INQ_SRV is in status 'MSGW','LCKW' or 'HLD', how to make that item?)
                        3. Is it possible to find specific jobs in some status?
                        (for example how to make an item to find jobs in status 'LCKW')
                        4. Is it possible to count jobs in job queues?
                        5. Can the agent emulator execute commands on AS400?

                        Thanks in advance for your help!


                        • Aman Malik
                          Junior Member
                          • Sep 2021
                          • 1

                          #162
                          Hi,

                          I am trying to implement AS400 monitoring on Zabbix 5.0 LTS from this link.

                          I have placed jt400.jar and json-simple-1.1.1.jar as per the documentation, but I am still getting the error below in the Zabbix agent log when starting the agent.

                          #########################Zabbix agent logs#####################
                          24:20210928:095649.414 Starting Zabbix Agent v0.7.7
                          24:20210928:095649.474 using configuration file: /home/ZABBIX/agentd/zabbix_agentd.conf
                          24:20210928:095649.484 agent #-1 started [ZabbixAgent config]
                          24:20210928:095649.494 IBM Corporation Java version "1.7.0"
                          24:20210928:095649.494 Java(TM) SE Runtime Environment (build jvmap3270_27sr3fp50-20160720_022.7)
                          24:20210928:095649.504 IBM J9 VM (build 2.7, JRE 1.7.0 OS/400 ppc-32 jvmap3270_27sr3fp50-20160720_02 (JIT enabled, AOT enabled)
                          J9VM - R27_Java727_SR3_20160630_1516_B309914
                          JIT - tr.r13.java_20160629_120282
                          GC - R27_Java727_SR3_20160630_1516_B309914
                          J9CL - 20160630_309914)
                          24:20210928:095649.814 Open Source Software, JTOpen 9.1, codebase 5770-SS1 V7R3M0.00 built=20160705 @RF
                          24:20210928:095649.863 The library 'json-simple-<version>.jar' is not available, exiting:
                          java.lang.ClassNotFoundException: org.json.simple.JSONValue

                          ##############################################################################################################################


                          • Kos
                            Senior Member
                            Zabbix Certified Specialist, Zabbix Certified Professional
                            • Aug 2015
                            • 3404

                            #163
                            Originally posted by combisDF
                            Hello everyone,

                            I have managed to install and run this emulator remotely and pretty much everything seems to be working fine. Zabbix frontend is collecting the statistics and visualizing them correctly.
                            But even after reading the readme file I still have some specific questions.
                            1. Is it possible to check if a particular job exists in the system?
                            (for example if there is a job named 237425/VBOWNER/AG_INQ_SRV - job_number/job_user/job_name, how to make such item?)
                            2. How to check job status?
                            (for example if a job named AG_INQ_SRV is in status 'MSGW','LCKW' or 'HLD', how to make that item?)
                            3. Is it possible to find specific jobs in some status?
                            (for example how to make an item to find jobs in status 'LCKW')
                            4. Is it possible to count jobs in job queues?
                            5. Can the agent emulator execute commands on AS400?

                            Thanks in advance for your help!
                            Hi!

                            thank you for your feedback.

                            1. You can check whether a particular job exists using the "proc.num[...]" metric. It searches by job name and user name; additionally, you can filter by job state (usually "RUN") and by subsystem name (using a regular expression). It is currently impossible to search by job number (it is assigned dynamically, so you cannot know it in advance anyway).

                            2 and 3. There is no metric that returns exactly a job status. However, the "proc.num[...]" metric mentioned above can also check the job state. So, if you know the job name, the user (and possibly the subsystem), you can use, for example, the following:
                            proc.num[MYJOB,MYUSER,RUN,"^MYSUBSYSTEM$"] returns the number of jobs named MYJOB running in subsystem MYSUBSYSTEM for user MYUSER
                            (this is the typical use case)
                            proc.num[MYJOB,MYUSER,LCKW,"^MYSUBSYSTEM$"] returns the number of these jobs in the state "LCKW"
                            proc.num[,,LCKW,"^MYSUBSYSTEM$"] returns the number of jobs of any name and user in the state "LCKW" in subsystem MYSUBSYSTEM

                            4. I'm not sure I understand your question correctly, but maybe you mean "proc.num[,,"*JOBQ"]"?

                            5. Unfortunately, no: there is no such possibility :-(
                            Last edited by Kos; 28-09-2021, 14:15.
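
The item keys above follow one pattern, so they can be generated and tried out by hand. A small sketch, assuming the agent runs in passive mode and the standard `zabbix_get` utility is available on the Zabbix server or proxy; the function name is hypothetical, and the host and port placeholders are left unfilled on purpose.

```shell
#!/bin/sh
# Build a proc.num[...] item key from job name, user, state and a
# subsystem regular expression. Empty job/user arguments stay empty,
# as in the "any job" example above.
# Usage: proc_num_key <job> <user> <state> <subsystem-regexp>
proc_num_key() {
    printf 'proc.num[%s,%s,%s,"%s"]\n' "$1" "$2" "$3" "$4"
}

key=$(proc_num_key MYJOB MYUSER LCKW '^MYSUBSYSTEM$')
echo "$key"

# The key could then be queried manually, for example:
#   zabbix_get -s <agent-host> -p <agent-port> -k "$key"
```

A trigger on such an item would then compare the returned count with zero, for instance to alert when any job is in the "LCKW" state.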


                            • Kos
                              Senior Member
                              Zabbix Certified Specialist, Zabbix Certified Professional
                              • Aug 2015
                              • 3404

                              #164
                              Originally posted by Aman Malik
                              I have placed jt400.jar and json-simple-1.1.1.jar as per the documentation, but I am still getting the error below in the Zabbix agent log when starting the agent.

                              Code:
                              #########################Zabbix agent logs#####################
                              24:20210928:095649.414 Starting Zabbix Agent v0.7.7
                              24:20210928:095649.474 using configuration file: /home/ZABBIX/agentd/zabbix_agentd.conf
                              24:20210928:095649.484 agent #-1 started [ZabbixAgent config]
                              24:20210928:095649.494 IBM Corporation Java version "1.7.0"
                              24:20210928:095649.494 Java(TM) SE Runtime Environment (build jvmap3270_27sr3fp50-20160720_022.7)
                              24:20210928:095649.504 IBM J9 VM (build 2.7, JRE 1.7.0 OS/400 ppc-32 jvmap3270_27sr3fp50-20160720_02 (JIT enabled, AOT enabled)
                              J9VM - R27_Java727_SR3_20160630_1516_B309914
                              JIT - tr.r13.java_20160629_120282
                              GC - R27_Java727_SR3_20160630_1516_B309914
                              J9CL - 20160630_309914)
                              24:20210928:095649.814 Open Source Software, JTOpen 9.1, codebase 5770-SS1 V7R3M0.00 built=20160705 @RF
                              24:20210928:095649.863 The library 'json-simple-<version>.jar' is not available, exiting:
                              java.lang.ClassNotFoundException: org.json.simple.JSONValue
                              
                              ##############################################################################################################################
                              The agent starts and creates its log file, which is good. At least the JVM can be found, the process starts, and the agent is able to find and successfully parse its config file.
                              The real problem is that it could not find the json-simple library for some reason.
                              If you run this JVM directly on the System i (AS/400) itself, then you need to check the following:
                              The necessary libraries (jt400.jar and json-simple-1.1.1.jar) must also be in one of two places: either in the same directory as the ZabbixAgent.jar file or in the directory for the JRE's system libraries (${JAVA_HOME}/lib/ext/).
                              If you start it from another system, you can use the "-classpath" parameter in the JVM call, something like the following:
                              Code:
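                              #!/bin/bash
                              # The JVM ignores -classpath when -jar is given, so put all JARs on the class path and name the main class.
                              CP=~/lib/jt400.jar:~/lib/json-simple-1.1.1.jar:ZabbixAgent.jar
                              java -classpath "${CP}" as400.thread.ZabbixAgent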
                              By default, ZabbixAgent looks for the json-simple library in the file json-simple-1.1.1.jar located in the same place as ZabbixAgent.jar. If the file has another name (for example, it includes a different sub-version number), you can simply rename it to "json-simple-1.1.1.jar".



                              • Aman Malik
                                Aman Malik commented
                                Hi,

                                The file name is json-simple-1.1.1.jar and it is located in the same place as ZabbixAgent.jar, but it still shows the same error.
                                I also tried renaming the file, but got the same error.

                              • Kos
                                Kos commented
                                Aman, that is very strange to me.

                                Maybe a wrong file was downloaded?
                                Please check that the "json-simple-1.1.1.jar" file really contains the "/org/json/simple/JSONValue.class" file.
                                Please also check that the user starting the Java process has read permission for this JAR file.

                                As a troubleshooting step, you can try starting the Java process without the "-jar ZabbixAgent.jar" parameter, using the exact class name and references to the needed libraries via the "-classpath" parameter instead. The resulting command line should be something like the following (all on a single line):

                                java -classpath /QIBM/ProdData/OS400/jt400/lib/jt400.jar:/home/ZABBIX/agentd/json-simple-1.1.1.jar:/home/ZABBIX/agentd/ZabbixAgent.jar as400.thread.ZabbixAgent
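
The two checks suggested above (read permission and presence of the class inside the JAR) could be done along these lines. This is a sketch that assumes an `unzip` command is available in the shell being used; `jar tf <file>` from a JDK lists the contents in a similar way. The function name is hypothetical.

```shell
#!/bin/sh
# Return 0 if the JAR file is readable by the current user and its
# listing contains the given class file path.
# Usage: jar_has_class <jar-file> <path/inside/jar.class>
jar_has_class() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    unzip -l "$1" 2>/dev/null | grep -qF "$2"
}

if jar_has_class json-simple-1.1.1.jar org/json/simple/JSONValue.class; then
    echo "class found"
else
    echo "class missing or JAR unreadable"
fi
```

Run it as the same user that starts the Java process, from the directory that holds ZabbixAgent.jar, so that the permission check is meaningful.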
                            • phdeliege
                              Junior Member
                              • Jan 2022
                              • 1

                              #165
                              Hi everybody,

                              First of all, very good job on this agent.
                              I'm starting to test it and I found something strange concerning the size of the ASP:

                              On the system, I have 2 disks:
                              From the WRKDSKSTS:
                              Unit  Type  Size (M)  % Used
                              1     2145  76354     63,5
                              2     2145  76354     63,1


                              But on the Zabbix, I have the following:
                              Name | Last check | Last value | Change
                              Disk 1: DMP001 (2145 0050, CD-4000007) status | 2022-01-24 15:21:00 | |
                              Disk 1: DMP001 capacity | 2022-01-24 15:05:37 | 71.11 GB |
                              Disk 1: DMP001 free space | 2022-01-24 15:21:00 | 25.94 GB | -976.56 KB
                              Disk 1: DMP001 free space, % | 2022-01-24 15:21:00 | 36.4717 % | -0.001313 %
                              Disk 1: DMP001 used space | 2022-01-24 15:21:00 | 45.18 GB | +976.56 KB
                              Disk 1: DMP001 used space, % | 2022-01-24 15:21:00 | 63.5283 % | +0.001313 %
                              Disk 2: DMP012 (2145 0050, CD-4000008) status | 2022-01-24 15:21:00 | The disk unit is active (1) |
                              Disk 2: DMP012 capacity | 2022-01-24 15:05:37 | 71.11 GB |
                              Disk 2: DMP012 free space | 2022-01-24 15:21:00 | 26.23 GB | -3.81 MB
                              Disk 2: DMP012 free space, % | 2022-01-24 15:21:00 | 36.8882 % | -0.005236 %
                              Disk 2: DMP012 used space | 2022-01-24 15:21:00 | 44.88 GB | +3.81 MB
                              Disk 2: DMP012 used space, % | 2022-01-24 15:21:00 | 63.1118 % | +0.005238 %
                              And from the WRKSYSSTS, I have a size for the ASP of:

                              System ASP . . . . . . . : 152,7 G
                              % system ASP used . . . : 63,3271


                              And Zabbix returns this value:
                              Name | Last check | Last value | Change
                              ASP1 capacity | 2022-01-24 15:05:37 | 142.22 GB |
                              ASP1 free | 2022-01-24 15:26:00 | 52.16 GB | -9.54 MB
                              ASP1 free, % | 2022-01-24 15:26:00 | 36.6734 % | -0.006555 %
                              ASP1 pused, % | 2022-01-24 15:26:00 | 63.3266 % | +0.006555 %
                              ASP1 status | 2022-01-24 15:26:00 | no status (0) |
                              ASP1 used | 2022-01-24 15:26:00 | 90.06 GB | +9.54 MB
                              Do you have any idea why there is a difference: 142.22 GB vs. 152,7 G?

                              It is the same for the disks: 71.11 GB vs. 76354 M.

                              Thanks for your support.
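
One possible explanation, offered as an assumption rather than something confirmed in this thread: the IBM i screens report sizes in decimal units (millions or billions of bytes), while the Zabbix frontend scales byte values by 1024. Under that assumption the figures agree to within rounding, as this small conversion sketch shows (the function name is hypothetical):

```shell
#!/bin/sh
# Convert a size given in decimal units into binary gigabytes (2^30 bytes).
# $1 = value, $2 = bytes per unit (1000000 for "M", 1000000000 for "G").
to_gib() {
    LC_ALL=C awk -v v="$1" -v u="$2" \
        'BEGIN { printf "%.2f\n", v * u / (1024 * 1024 * 1024) }'
}

to_gib 76354 1000000        # disk unit size (M) from WRKDSKSTS
to_gib 152.7 1000000000     # system ASP size (G) from WRKSYSSTS
```

The first call gives 71.11, matching the per-disk capacity shown in Zabbix; the second gives 142.21, which differs from the 142.22 GB shown in Zabbix only by the rounding of the 152,7 G figure.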
