configs and specification requirement for 14000 hosts

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • ykdhan
    Junior Member
    • Jul 2020
    • 14

    #1

    configs and specification requirement for 14000 hosts

    Hi,
    I am working for a company that monitors about 7000 network devices, mostly firewalls.
    My company is trying to switch its monitoring application to Zabbix, and we are planning on having 5 items to monitor for each host (device) every minute.
    However, my boss wants to add another 7000 devices to monitor which are L2 devices so that we could detect any problem in those devices as well.
    The problem is that my boss does not want to ping those L2 devices all the time, only when a problem occurs on one of the original 7000 firewall devices.
    I could use trigger dependencies, but I have heard that when two hosts are checked at different intervals, the dependent trigger does not fire immediately; it waits until the item on the later interval also reports a problem.
    My boss wants it this way in order to reduce CPU and memory usage. Would monitoring at this scale consume a lot of resources?
    Which configuration settings should I consider changing? What server specification would my company need?
    Thank you.
  • kloczek
    Senior Member
    • Jun 2006
    • 1771

    #2
    The number of hosts is irrelevant.

    What matters is the total number of metrics and the average rate at which all those metrics are sampled. That defines the average flow of data which has to pass through the server process and then be stored in the DB backend. That rate will be more or less flat.
    It comes down to a single parameter, which in Zabbix terminology is called NVPS (New Values Per Second).
    The next thing that matters is the complexity of the alarming layer, which mostly affects processing resources, because it evaluates all that data against the logical rules that trigger alarms. Before that layer sits another building block, the filtering, because some raw data sampled on monitored hosts/objects may not arrive in a form from which the exact metric values can be extracted directly.

    By now you can probably see that a Zabbix stack monitoring a single host with 14k metrics will be roughly equal to one monitoring 14k hosts with only one metric each.
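    As a rough illustration of the point above, NVPS can be estimated as the total number of items divided by the polling interval. This is only a sketch: the function name `nvps` is made up for this example, and it assumes every item is polled at the same fixed interval (one minute, as in the original question).

```python
def nvps(hosts: int, items_per_host: int, interval_seconds: float) -> float:
    """Estimate New Values Per Second for uniformly polled items."""
    return hosts * items_per_host / interval_seconds


# One host with 14k metrics produces the same load as 14k hosts with one metric each.
single_big_host = nvps(hosts=1, items_per_host=14_000, interval_seconds=60)
many_small_hosts = nvps(hosts=14_000, items_per_host=1, interval_seconds=60)

# The scenario from the question: 14000 hosts, 5 items each, polled every minute.
thread_scenario = nvps(hosts=14_000, items_per_host=5, interval_seconds=60)

print(f"single host:     {single_big_host:.1f} NVPS")
print(f"many hosts:      {many_small_hosts:.1f} NVPS")
print(f"thread scenario: {thread_scenario:.1f} NVPS")
```

    With LLD rules or mixed templates the per-host item count is not constant, so a real estimate would sum over host groups rather than multiply a single average.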

    Unless you have a kind of "monoculture", with all 14k hosts working as, for example, an HPC processing platform, that estimate may be tricky. Different groups of hosts may use different sets of templates. Even if every host uses exactly the same templates, those templates may contain LLD (Low Level Discovery) rules, so the number of metrics per host is not fixed: it varies with, for example, the number of network ports, block devices, volumes, etc. on each host, and NVPS varies with it.

    In other words, to estimate the resources needed to build a specific monitoring stack, you must supply at least a few more variables.
    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
    https://kloczek.wordpress.com/
    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
    My zabbix templates https://github.com/kloczek/zabbix-templates

    • ykdhan
      ykdhan commented
      Thanks for your reply. I already knew about NVPS and have calculated the disk space I would need, although the figure could be inaccurate.
      I understand what you said, but I just want to know the typical amount of resources it would take to handle about 14000 hosts with about 5 items each.
      I really appreciate your advice.
  • Hamardaban
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • May 2019
    • 2713

    #3
    I am not sure that Zabbix's standard methods can implement your scenario, in which data collection for one host begins only when a trigger fires on another host. You can certainly do this by writing your own scripts that call the API, but that is a separate task.
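    One possible shape for such a script is sketched below. It only builds the JSON-RPC request body for the Zabbix API method `host.update`, which can switch a host between monitored (`status` 0) and unmonitored (`status` 1). The helper name `build_host_status_request` and the host ID are hypothetical. Authentication and the HTTP call are left out because they differ between Zabbix versions, and how the script gets invoked when a firewall trigger fires is not shown here.

```python
import json


def build_host_status_request(host_id: str, monitored: bool, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 body for host.update (hypothetical helper).

    status 0 = monitored host, status 1 = unmonitored host.
    Authentication is intentionally omitted; supply it as your Zabbix version requires.
    """
    payload = {
        "jsonrpc": "2.0",
        "method": "host.update",
        "params": {"hostid": host_id, "status": 0 if monitored else 1},
        "id": request_id,
    }
    return json.dumps(payload)


# Example: enable monitoring of an L2 device (the host ID is made up).
body = build_host_status_request("10501", monitored=True)
print(body)
```

    The body would be POSTed to the frontend's `api_jsonrpc.php` endpoint. Enabling and disabling whole hosts is only one option; whether it reacts fast enough for this use case would have to be tested.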
    About resources: The amount of data stored depends on what kind of data it is (a number or text), how long to store it, and how long to store the dynamics of changes. The latest versions of Zabbix have new features that allow you to significantly reduce the amount of stored data and implement the mechanism "receive and respond frequently - store rarely".
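    To make the dependence on retention concrete, here is a back-of-the-envelope estimate of history size. The numbers are assumptions, not measurements: 90 bytes per stored numeric value is a rough figure that varies with the database engine and value type, and the 30-day retention is picked purely for illustration. The function name `history_bytes` is made up for this sketch.

```python
SECONDS_PER_DAY = 86_400


def history_bytes(items: int, interval_seconds: float, retention_days: int,
                  bytes_per_value: int = 90) -> float:
    """Rough history size: values per day * retention days * bytes per value."""
    values_per_day = items * SECONDS_PER_DAY / interval_seconds
    return values_per_day * retention_days * bytes_per_value


# 14000 hosts * 5 items, polled every 60 s, kept for an assumed 30 days.
estimate = history_bytes(items=14_000 * 5, interval_seconds=60, retention_days=30)
print(f"approx. {estimate / 1e9:.0f} GB of history")
```

    Text values, trends, and events add to this, and shorter retention or less frequent storage reduces it proportionally.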
    Hardware performance depends on many things: https://www.zabbix.com/documentation...ormance_tuning
    Last edited by Hamardaban; 12-08-2020, 21:19.

    • ykdhan
      ykdhan commented
      Thanks for your reply. I might try changing the C source code of Zabbix in order to do what I want to do.