**bagni** · 08-01-2015, 11:22

Hi,
if you have 30K items with a 5-minute polling interval, that means a theoretical 100 NVPS (new values per second: 30,000 items / 300 s), so the Zabbix server is not under much pressure.
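The load estimate above is simple arithmetic; here is a minimal sketch of it. The function name is made up for illustration, and the calculation assumes every item shares the same polling interval.

```python
def new_values_per_second(item_count: int, interval_seconds: float) -> float:
    """Average number of new values the server receives per second,
    assuming all items are polled at the same interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return item_count / interval_seconds


# 30,000 items polled every 5 minutes (300 seconds)
print(new_values_per_second(30_000, 5 * 60))  # 100.0
```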

You don't need to set up a Zabbix Agent for each process, only one for each host.
Every host would have an item for each process; you could use:

proc.mem[<name>,<user>,<mode>,<cmdline>]

proc.num[<name>,<user>,<state>,<cmdline>]

to monitor the memory usage or the presence of a process.
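As a rough illustration of how these keys are put together, the helper below assembles an item key string from its parameters. `item_key` is a hypothetical helper, not part of Zabbix; it assumes the usual key syntax where trailing empty parameters can be dropped and a parameter containing a comma or `]` is wrapped in double quotes.

```python
def item_key(name: str, *params: str) -> str:
    """Build an item key such as proc.num[cmd.exe] from its parameters."""
    values = list(params)
    # Trailing empty parameters carry no information, so drop them.
    while values and values[-1] == "":
        values.pop()
    if not values:
        return name

    def quote(param: str) -> str:
        # Quote parameters that would otherwise break the key syntax.
        if param.startswith('"') or "," in param or "]" in param:
            return '"' + param.replace('"', '\\"') + '"'
        return param

    return f"{name}[{','.join(quote(p) for p in values)}]"


print(item_key("proc.num", "cmd.exe"))
print(item_key("proc.mem", "mysqld", "mysql", "max"))
```

One key per process and per metric is what you would then enter in the item's "Key" field.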

Configuring 30K items by hand is time consuming, so you could consider using LLD (low-level discovery) with item/trigger prototypes.
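For a custom LLD rule, the discovery item has to return JSON listing the discovered entities. Below is a minimal sketch of generating that JSON for a list of process names. The macro name `{#PROCNAME}` and the function name are assumptions chosen for illustration; the `{"data": [...]}` wrapper follows the format described in the 2.2 documentation.

```python
import json


def lld_payload(process_names):
    """Return the JSON a custom discovery item would output:
    one object per process, keyed by an LLD macro."""
    # {#PROCNAME} is an illustrative macro name, not a built-in one.
    return json.dumps({"data": [{"{#PROCNAME}": name} for name in process_names]})


print(lld_payload(["cmd.exe", "sql.exe"]))
```

An item prototype in the same rule could then use a key like `proc.num[{#PROCNAME}]`, so each discovered process gets its own item.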

**timbo** · 09-01-2015, 04:59

In a typical implementation you would have each process set up as an Item under a Host (typically a physical/virtual machine), as bagni suggests. In this instance I imagine LLD would significantly automate the creation of Items (though I haven't used LLD myself yet); you will need an Agent on the server to use LLD (I believe). In this particular case I would avoid installing multiple agents on the one server.

But if your requirements stipulate that each process needs to be its own Host, you could use the Zabbix API to automate the creation of the hosts (though the API can be rather tricky).
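To give an idea of what "automate via the API" involves, here is a sketch that only builds the JSON-RPC request body for `host.create`; it does not send anything. The host name, group ID, auth token and IP address are placeholders, and the field layout is modeled on the 2.2-era API, so check it against the API reference for your version.

```python
import json


def host_create_request(host_name, group_id, auth_token, ip="127.0.0.1", request_id=1):
    """Build (but do not send) a JSON-RPC body for the host.create method."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host_name,
            # One agent-type interface; values here are placeholders.
            "interfaces": [
                {"type": 1, "main": 1, "useip": 1, "ip": ip, "dns": "", "port": "10050"}
            ],
            "groups": [{"groupid": group_id}],
        },
        "auth": auth_token,  # token obtained earlier via user.login
        "id": request_id,
    }


print(json.dumps(host_create_request("cmdexe", "2", "PLACEHOLDER_TOKEN"), indent=2))
```

The resulting body would be POSTed to the frontend's `api_jsonrpc.php` endpoint, once per process if you go with the one-host-per-process model.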

You do not need to use the Agent to send Process data to the Zabbix server. You could script something with the Zabbix Sender:

6 Sender

https://www.zabbix.com/documentation/2.2/manual/concepts/sender

The Zabbix Sender allows you to specify which Host and Item should receive which data regarding any particular process.
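As a sketch of the scripting side, the helper below formats lines for a Zabbix Sender input file, where each line is `<hostname> <key> <value>`. The function name is hypothetical, and to stay on safe ground it refuses values containing double quotes or newlines rather than guessing at escaping rules.

```python
def sender_line(host, key, value):
    """Format one '<hostname> <key> <value>' line for a zabbix_sender input file."""

    def field(text):
        text = str(text)
        if '"' in text or "\n" in text:
            raise ValueError("double quotes and newlines are not handled by this sketch")
        # Fields containing whitespace (or empty ones) are wrapped in double quotes.
        if text == "" or any(ch.isspace() for ch in text):
            return f'"{text}"'
        return text

    return f"{field(host)} {field(key)} {field(value)}"


lines = [
    sender_line("Server1", "process[cmd.exe,'up']", 1),
    sender_line("Server1", "process[cmd.exe,'uptime']", 3600),
]
print("\n".join(lines))
```

Written to a file, these lines could be submitted in one go with something like `zabbix_sender -z <server> -i <file>`; the matching items must exist as Zabbix trapper items on the host.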

Potential Steps:
Create Hosts:
- Manually via Web Frontend
- Automated via API
- Network discovery (will find machines on the network, would not create Hosts as Process names)

Create Items:
- Manually via Web Frontend
- Automated via API
- LLD (If you use Zabbix Agents)

Collect data and send to Zabbix:
- Agent installed on servers (see bagni's post)
- Script in conjunction with Zabbix Sender

Recovering the process:
- Manually, as you have said is fine
- For automated recovery you would need the Zabbix Agent installed, and you would need to enable "EnableRemoteCommands" in the Agent config file. When Zabbix Server detects a specific Process has failed it can send the Agent a command to kill/restart the Process (say by executing a script). But if you're happy manually restarting the Process, and you're happy with using Zabbix Sender, then there is no need to install the Zabbix Agent.

This is all theoretical, so please do your own research. (I also don't have access to my Zabbix install while I'm typing this)

So, maybe you can manually set up a few Hosts/Items first to test the waters:
Typical setup:
1. Create a Host: Server1
2. Create Items in that Host for each Process and Data Type you want to monitor. E.g.

Active/Passive Items (using Zabbix Agent):
Item Name | Item Key | Value
cmdexeMem | proc.mem[cmd.exe] | 23452
cmdexeNum | proc.num[cmd.exe] | 342

sqlexeMem | proc.mem[sql.exe] | 23452
sqlexeNum | proc.num[sql.exe] | 342

Zabbix Trapper Items (using Zabbix Sender):
Item Name | Item Key | Value
cmdexeUp | process[cmd.exe,'up'] | 1
cmdexeUptime | process[cmd.exe,'uptime'] | 3600
cmdexePID | process[cmd.exe,'pid'] | 954

sqlexeUp | process[sql.exe,'up'] | 1
sqlexeUptime | process[sql.exe,'uptime'] | 3600
sqlexePID | process[sql.exe,'pid'] | 954

OR

Non-Typical Setup:
1. Create a Host: cmdexe
2. Create Items within that Host for each Data Type you'd like to collect.

Zabbix Trapper Items (using Zabbix Sender):
Item Name | Item Key | Value
Up | process['up'] | 1
Uptime | process['uptime'] | 3600
PID | process['pid'] | 954

OR (simpler Item Keys)

Item Name | Item Key | Value
Up | processUp | 1
Uptime | processUptime | 3600
PID | processPID | 954

I honestly think the "typical" method is cleaner and easier to administer (i.e. create a Host for each Server, then use LLD to create an Item for each process you want monitored on that Host).

First stage done!

With the above done, hopefully you're now collecting data about the status of your processes. You should be able to eyeball these and react manually. But you'd obviously rather people be notified when there's a problem rather than stare at the Zabbix Dashboard all day.

That's when you need a Trigger to fire when an Item matches a specific criterion (e.g. the Process is down). This Trigger can fire off an "Action": the Action can be an Email, an SMS, or a Zabbix Agent "RemoteCommand" (allowing you to automate the recovery of the Process).
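For the "Process is down" case, a trigger on a `proc.num` item could compare the latest value with zero. The sketch below builds such an expression string in the 2.2-style syntax (`{host:key.function(param)}`); the helper name is made up, and newer Zabbix versions use a different expression syntax.

```python
def process_down_trigger(host: str, process: str) -> str:
    """Trigger expression that fires when no instance of the process is running."""
    return f"{{{host}:proc.num[{process}].last(0)}}=0"


print(process_down_trigger("Server1", "cmd.exe"))
# {Server1:proc.num[cmd.exe].last(0)}=0
```

With LLD, the same expression would go into a trigger prototype, with the process name replaced by the discovery macro.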

Though as you mentioned you'd be happy manually recovering these processes, you'd probably be happy with something as simple as an Email notification.

Hope this helps!

-Timbo

**Sa2015** · 13-01-2015, 03:54

Thank you very much bagni and timbo for your ideas!
Your comments sure gave me a lot of food for thought! Thank you!

Actually I already have an automated way of creating the necessary host and item information for each process on the server using the API, but the architecture is as I said in my first post -- 1 process is modeled in the Zabbix server as 1 host... And that doesn't work with the Zabbix agent unless I start one agent per process.
Also, if possible I would like to get rid of the host and item generation code that I wrote, because even though it works fine, the import process takes a lot of time. But if it's not possible, then I guess I can live with that.

So, right now I'm trying to re-design my architecture to model 1 host as 1 physical server and use LLD, but my custom LLD code doesn't seem to work well.
When an agent self-registers, it is added as a host and gets a template with the custom LLD rules I created. After several seconds, the discovery rules it inherits from its templates show as "Not supported (ZBX_NOTSUPPORTED)"... And that's where I'm stuck now.

But I think that since I already got the code for host creation, as timbo suggested I will try my way out with Zabbix sender instead of agent... Especially because I need to be able to run scripts on each process separately (sometimes even if it is not down), and that would only be possible on my current architecture (1 process = 1 host). If I change it to 1 host = 1 physical machine, I guess that my processes will show only on the trigger window when they're down...

Anyways, I'm still working on it! Thanks a lot for the ideas, they were really helpful!
And if you have new thoughts on what I wrote, I would be very happy to hear them.

Thank you so much!

Zabbix architecture for multiple process monitoring in a large environment
