using script and trigger actions to restart systemd service on linux

  • helloguys2024
    Junior Member
    • Apr 2024
    • 10

    #1

    using script and trigger actions to restart systemd service on linux

    hi guys,
    I know this thread is a bit old, but I'm hoping you can help, as this is the closest thread I've found to the issue I'm having.
    zabbix server: 6.0.29 (RHEL 8.9)
    zabbix_agent2: 6.0.26 (RHEL 9.3)

    I am new to Zabbix and am trying to set up salt-minion.service to restart whenever the service is failed or inactive. My script looks the same as V1ktor's, since I essentially copied his config, which can be found here: https://www.zabbix.com/forum/zabbix-...stemd-services. I replaced mysql with salt-minion.
    [Screenshot: Screenshot 2024-05-21 144340.png]

    Then I created a trigger action under Configuration -> Actions -> Trigger actions:
    [Screenshot: Screenshot 2024-05-21 144606.png]
    Then the operations:
    [Screenshot: Screenshot 2024-05-21 144832.png]

    On the client server, I made the following changes:
    1. Modified /etc/sudoers and added:
    zabbix ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart salt-minion.service
    2. Modified /etc/zabbix/zabbix_agent2.conf and added:
    AllowKey=system.run[*]

    3. Restarted the zabbix-agent service.
    4. Stopped salt-minion.service to test the script. When I look at the monitoring dashboard, nothing happens and I don't see any changes. In /var/log/zabbix_server.log on the Zabbix server, this is the only entry I see:
    6426:20240520:172142.721 Zabbix agent item "systemd.unit.get["salt-minion.service"]" on host "test-server" failed: another network error, wait for 15 seconds

    On the client server, I don't see any entry related to salt-minion in /var/log/zabbix_agent2.log.

    I feel like I'm very close but not sure where to go from here.

    Thanks for looking; any advice would be appreciated.
  • salavie
    Junior Member
    • Oct 2020
    • 29

    #2
    try this


    • cyber
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional
      • Dec 2006
      • 4807

      #3
      You are not executing the script. You have set it up under "Recovery operations", which are executed when the trigger recovers. You should run it under "Operations".

      "Recovery operations" does not mean "operations to recover the situation". It means "operations that Zabbix performs when the trigger recovers".


      • helloguys2024
        Junior Member
        • Apr 2024
        • 10

        #4
        Originally posted by cyber
        You are not executing the script. You have set it up under "Recovery operations", which are executed when the trigger recovers. You should run it under "Operations".

        "Recovery operations" does not mean "operations to recover the situation". It means "operations that Zabbix performs when the trigger recovers".
        Thanks, cyber.
        I took out the recovery and update operations, since I'm really just testing the service restart. I changed the operation as shown below, then restarted the zabbix-agent2 service on all of the stage servers. Looking at the dashboard, I still don't see any changes. Could this be related to the permission issues that salavie mentioned, even though the changes I made in /etc/sudoers should suffice?

        Thanks guys.



        • helloguys2024
          helloguys2024 commented
          hi guys,

          So this morning, I went ahead and created the override.conf file in the /etc/systemd/zabbix-agent2.service.d directory and added the entry suggested by salavie:
          [Service]
          User=root
          Group=root

          then systemctl daemon-reload
          then systemctl restart zabbix-agent2

          Looking at the dashboard, I still do not see any changes. What else could I be missing?

          Thank you
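One way to confirm that a drop-in like the one above actually took effect is to ask systemd which user the unit is configured to run as. The sketch below is only an illustration: `unit_user` and `check_unit_user` are made-up helper names, and it assumes the unit is called zabbix-agent2.service. Note that systemd reads unit drop-ins from /etc/systemd/system/<unit>.d/, so an override.conf placed in a different directory would be ignored without any error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the User= setting systemd has loaded for a unit.
# An empty result means no User= is set in the unit file or its drop-ins.
unit_user() {
    local unit="$1"
    systemctl show -p User --value "$unit"
}

# Hypothetical helper: compare the loaded User= setting with the expected one.
check_unit_user() {
    local unit="$1" expected="$2" actual
    actual="$(unit_user "$unit")"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $unit runs as $expected"
    else
        echo "MISMATCH: $unit has User='$actual', expected '$expected'"
        return 1
    fi
}

# Example (run on the agent host):
#   check_unit_user zabbix-agent2.service root
#   systemctl cat zabbix-agent2.service   # shows the unit file plus every drop-in that was loaded
```

If `systemctl cat` does not list override.conf, the drop-in was not picked up, regardless of what the file contains.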
      • cyber
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Dec 2006
        • 4807

        #5
        6426:20240520:172142.721 Zabbix agent item "systemd.unit.get["salt-minion.service"]" on host "test-server" failed: another network error, wait for 15 seconds
        You should try to find out why your check is failing. Your monitoring does not work, so the trigger does not fire and no actions are executed.

        How is it configured? As a passive check, I would assume? Maybe you have connection issues from the server to the agent?
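A quick way to test the server-to-agent path for a passive check is to request the same item key by hand with zabbix_get from the Zabbix server. This is a rough sketch, not a definitive procedure: `probe_agent_item` is a made-up wrapper name, and it assumes the agent listens on the default passive port 10050 and that the zabbix_get utility is installed on the server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: request one item key from a passive agent, much as the server would.
# Arguments: agent host, item key, optional port (assumed default: 10050).
probe_agent_item() {
    local host="$1" key="$2" port="${3:-10050}"
    if ! command -v zabbix_get >/dev/null 2>&1; then
        echo "zabbix_get not found; install it on the Zabbix server first" >&2
        return 127
    fi
    # The item value goes to stdout; the verdict goes to stderr so it does not mix with the value.
    # The exit status is only a coarse signal, so read the printed value or error text as well.
    if zabbix_get -s "$host" -p "$port" -k "$key"; then
        echo "probe OK: $key on $host:$port" >&2
    else
        echo "probe FAILED: $key on $host:$port" >&2
        return 1
    fi
}

# Example (run on the Zabbix server):
#   probe_agent_item test-server 'agent.ping'
#   probe_agent_item test-server 'systemd.unit.get["salt-minion.service"]'
```

If agent.ping answers but the systemd key does not, the network path is probably fine and the problem is more likely on the agent side.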


        • markosa
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
          • Aug 2022
          • 104

          #6
          If you put sudo /usr/bin/systemctl restart salt-minion.service in the Commands section instead of sudo systemctl restart ...., does that help? You have defined an absolute path in sudoers, so either use that path in the command or edit sudoers to match the Commands section. Also remember to check your SELinux settings, in case they are preventing something.
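To check whether the sudoers rule matches the exact command line used in the action, sudo can be asked about the policy without running anything. The sketch below is an illustration only: `sudo_rule_allows` is a made-up name, and it assumes it is run as root on the agent host, since listing another user's privileges with -U is normally limited to root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask sudo whether a user may run an exact command line.
# `sudo -l <command>` only consults the policy; it does not execute the command.
sudo_rule_allows() {
    local user="$1"; shift
    if sudo -n -l -U "$user" "$@" >/dev/null 2>&1; then
        echo "allowed: $user may run: $*"
    else
        echo "not allowed (or password required): $user -> $*"
        return 1
    fi
}

# Example (run as root on the agent host):
#   sudo_rule_allows zabbix /usr/bin/systemctl restart salt-minion.service
#   sudo_rule_allows zabbix /usr/bin/systemctl stop salt-minion.service   # should be rejected by the rule above
```

Running `visudo -c` afterwards is a cheap way to make sure the sudoers file still parses.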


          • helloguys2024
            Junior Member
            • Apr 2024
            • 10

            #7
            So it seems my original trigger action started working after I added the override.conf file suggested by salavie, but it does not work consistently. For example, I have 3 servers in my host group: sometimes it restarts all 3, sometimes only 2, and sometimes none at all. I'm not sure whether that is because I have a host in the Conditions statement: Trigger equals hostname: salt-minion.service:Service is not running. See below:
            [Screenshot: image.png]
            Then in the operations, I have the below statement.
            [Screenshot: image.png]

            For testing purposes, I created a new trigger action. In this case, I left the action conditions blank:
            [Screenshot: image.png]
            Then under Operations it looks the same as previous statement:
            [Screenshot: image.png]

            This works, but it looks like it just restarts the service arbitrarily. Any suggestions would be appreciated.

            Thank you.
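One possible way to make a broadly scoped action harmless is to have the remote command call a small guard script that restarts the unit only when it is actually failed or inactive. This is a minimal sketch under assumptions, not the method used in this thread: `should_restart` and `restart_if_down` are made-up names, the syslog tag is arbitrary, and the script relies on the sudoers rule from the first post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide from a unit's state word whether a restart is warranted.
should_restart() {
    case "$1" in
        failed|inactive) return 0 ;;
        *)               return 1 ;;
    esac
}

# Hypothetical helper: restart the unit only if it is failed or inactive.
restart_if_down() {
    local unit="$1" state
    # is-active prints the state and exits non-zero when the unit is not active,
    # so the exit status is ignored and only the printed word is used.
    state="$(systemctl is-active "$unit" 2>/dev/null || true)"
    if should_restart "$state"; then
        logger -t zabbix-remote-cmd "restarting $unit (state: $state)"
        sudo -n /usr/bin/systemctl restart "$unit"
    else
        echo "skip: $unit is '$state'"
    fi
}

# Example (as the remote command on the agent host):
#   restart_if_down salt-minion.service
```

With a guard like this, an action that fires for an unrelated trigger only prints a skip message, and each real restart leaves a line in syslog that can be matched against the problem's timestamp.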