trigger actions doesn't seem to be running or working

  • helloguys2024
    Junior Member
    • Apr 2024
    • 10

    #1

    trigger actions doesn't seem to be running or working

    zabbix server: 6.0.30
    zabbix client: 6.0.29

    zabbix server OS: rhel 8.9
    zabbix agent OS: rhel 8.9 & rhel 9.3 servers

    Hi guys, I am still very new to Zabbix. I have a script that restarts a systemd service, sshd.service for example. I also have a trigger action against a group of hosts that is supposed to restart sshd.service if the service is not running. I purposely stopped the systemd service to see if the trigger action would restart it, but for the last 3 days it has not been restarted. Sometimes the trigger action works right away and restarts the service.

    I'm not sure why sometimes it works and other times like right now, it does nothing.

    The trigger action config is: Trigger equals "stage-servers: systemd service: service is not running"

    Thank you.
  • tim.mooney
    Senior Member
    • Dec 2012
    • 1427

    #2
    Since you've said you're new to Zabbix, it may help you to break it down into stages:
    1. Stage 1: various items defined for the host collect measurements about the state of the host.
    2. Stage 2: triggers (thresholds) apply expressions to the item values to decide if there is a problem or not
    3. Stage 3: if the trigger expression determines that there is a problem, a new PROBLEM event is generated.
    4. Stage 4: any number of actions may happen as a result of the PROBLEM event, depending upon how your actions are written.
    With those stages in mind, which stage is not doing what you expect?
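
    To make the "which stage?" question concrete, here is a minimal Python sketch of the same reasoning. It is only an illustration: the class and field names are hypothetical, and you fill in the booleans yourself from what you see in the frontend.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """What you see in the frontend for one incident (hypothetical names)."""
    item_shows_failure: bool    # Latest Data shows the "not running" value
    problem_event_exists: bool  # Monitoring->Problems lists a PROBLEM event
    action_attempted: bool      # the "Actions" column shows a number
    action_succeeded: bool      # none of the listed actions failed


def first_stage_to_debug(obs: Observations) -> str:
    """Return the earliest stage that is not behaving as expected."""
    if not obs.item_shows_failure:
        return "stage 1: item"
    if not obs.problem_event_exists:
        return "stage 2-3: trigger"
    if not obs.action_attempted:
        return "stage 4: action not attempted"
    if not obs.action_succeeded:
        return "stage 4: action attempted but failed"
    return "all stages look fine"
```

    The order of the checks matters: a later stage cannot work if an earlier one did not, so start debugging at the first stage that fails.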

    Is it that the item is not returning a value that your triggers would treat as a problem? You should be able to tell what the item is returning via the host's Latest Data.
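
    If you would rather check the item value from a script than from Latest Data, the sketch below builds an `item.get` request for the Zabbix JSON-RPC API. The URL, host name, key fragment and token are placeholders I made up, and passing the token in the `auth` field is my understanding of how 6.0 works, so check the API documentation for your version before relying on it.

```python
import json
import urllib.request


def build_item_get_request(host, key_search, auth_token, request_id=1):
    """Build a JSON-RPC body asking for the last value of matching items."""
    return {
        "jsonrpc": "2.0",
        "method": "item.get",
        "params": {
            "host": host,                     # technical host name
            "search": {"key_": key_search},   # substring match on the item key
            "output": ["name", "key_", "lastvalue", "lastclock"],
        },
        "auth": auth_token,                   # assumption: 6.0-style auth field
        "id": request_id,
    }


def send_request(api_url, body):
    """POST the request to api_jsonrpc.php and return the decoded reply."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical usage (placeholder URL, host, key fragment and token):
# body = build_item_get_request("stage-web01", "systemd", "YOUR_API_TOKEN")
# reply = send_request("https://zabbix.example.com/api_jsonrpc.php", body)
# for item in reply.get("result", []):
#     print(item["key_"], item["lastvalue"], item["lastclock"])
```

    If `lastvalue` does not change after you stop the service, or `lastclock` is old, the problem is in stage 1 and nothing downstream can fire.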

    Is the item value correct (i.e. what you expect for a problem where sshd.service is not running), but the trigger is not treating it as a PROBLEM? At version 6.0.x, looking at Monitoring->Problems should tell you whether a PROBLEM event has been generated. Depending on when the PROBLEM happened, you may need to click "History" for the "Show:" part of the filter.

    If a PROBLEM event is correctly being generated each time sshd.service is offline, then the next thing to check is whether your action is being attempted. Again in Monitoring->Problems, is there a number and a little red arrow in the "Actions" column for the problem in question? If you mouse over it, does it show you what actions have been performed as a result of this PROBLEM event? If there's no number in that column, then the issue is probably with your action, not with your trigger. If there are actions there, but some of them are in red or otherwise obviously failed, then the issue is that the actions are being attempted but not succeeding.
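
    The same check can be scripted. The sketch below builds an `event.get` request that asks for recent PROBLEM events together with their alerts, then counts how the attempts went. The host ID and token are placeholders, and the status codes (1 = sent or run, 2 = failed or agent unavailable) are my reading of the alert object in the 6.0 API documentation, so verify them against your version.

```python
def build_event_get_request(host_id, auth_token, limit=10, request_id=1):
    """Build a JSON-RPC body for recent trigger PROBLEM events on one host."""
    return {
        "jsonrpc": "2.0",
        "method": "event.get",
        "params": {
            "hostids": [host_id],
            "source": 0,                # events created by triggers
            "object": 0,                # the event object is a trigger
            "value": 1,                 # PROBLEM events only
            "select_alerts": "extend",  # include action attempts per event
            "sortfield": ["clock", "eventid"],
            "sortorder": "DESC",
            "limit": limit,
            "output": "extend",
        },
        "auth": auth_token,             # assumption: 6.0-style auth field
        "id": request_id,
    }


def summarize_alerts(event):
    """Count action attempts for one event by outcome.

    Assumed status codes: 1 = sent/run, 2 = failed; anything else is
    counted as "other" (for example, not yet sent or not run).
    """
    counts = {"ok": 0, "failed": 0, "other": 0}
    for alert in event.get("alerts", []):
        status = int(alert.get("status", 0))  # the API returns strings
        if status == 1:
            counts["ok"] += 1
        elif status == 2:
            counts["failed"] += 1
        else:
            counts["other"] += 1
    return counts
```

    An event whose counts are all zero corresponds to "no number in the Actions column": the action never matched, so look at its conditions. A non-zero "failed" count means the action ran but did not succeed, and the `error` field of each alert is the place to look next.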

    Does that help narrow down where to focus on further debugging the issue?



    • helloguys2024
      helloguys2024 commented
      hi Tim,
      Thanks for the reply. I am not exactly sure but I will try to answer the questions:

      Stage 1: various items defined for the host collect measurements about the state of the host.
      * I am not exactly sure what you are referring to, but under Administration there are several scripts such as detect OS, ping OS and my script, restart systemd.service

      Stage 2: triggers (thresholds) apply expressions to the item values to decide if there is a problem or not
      * I guess here you are asking for the trigger actions? For that I have "Trigger equals stage-server: sshd.service: service is not running" under configuration -> Trigger Actions -> Restart sshd.service -> Conditions

      Stage 3: if the trigger expression determines that there is a problem, a new PROBLEM event is generated.
      * I don't think I have anything for this

      Stage 4: any number of actions may happen as a result of the PROBLEM event, depending upon how your actions are written.
      * I believe this is what I have under configuration -> trigger actions -> Operations 1:
      under Operations: Run Script "Restart sshd.service" on host group: stage servers immediately

      I hope this is what you were asking for.

      When configuring the script, I followed this: https://www.zabbix.com/forum/zabbix-...stemd-services

      Thank you for the help and please let me know if I can add anything else to make this work on a consistent basis.
  • tim.mooney
    Senior Member
    • Dec 2012
    • 1427

    #3
    Sorry about the confusion. I actually wasn't asking anything for me. I was highlighting the various areas where something might go wrong, and then asking questions to try to help you focus on where the issue is. I was trying to help you conceptually divide the problem up.

    If you don't know what I'm referring to with "items", you probably want to review the manual, especially chapter 7, which covers many of the "stages" I outlined. In particular, items: https://www.zabbix.com/documentation...l/config/items

    The issue you're having may have nothing to do with your actions; it might be happening in an earlier stage. Until you can identify which parts are working, it's difficult to focus on what the actual problem is.
