Realizing the “sense of speed” demanded by CyberAgent with automation through API link
CyberAgent develops a broad range of Internet-based businesses, including Internet advertising, smartphone games and media such as the “Ameba” community service, which has evolved alongside trends in Internet use such as avatar services and the shift to smartphones. In the company’s main battlefield, the Internet market, time is money. In an industry where new ideas are produced and introduced every day, the only way to acquire a large number of users is to develop new services ahead of others. Conversely, any delay in introducing new services results in lost ground and heavy losses in profit.
CyberAgent quickly implements services in line with market needs by promoting automation of server construction, including monitoring. “Zabbix” is a piece of software that plays a major role in this activity.
Not only server construction, but also automation of monitoring configuration
CyberAgent’s AdTech Headquarters is a cross-departmental organization that specializes in ad technology. It provides a large number of ad services for many services and products, including those belonging to subsidiary companies, on a hybrid cloud environment that combines Amazon Web Services (AWS) with private clouds built on OpenStack.
According to Mr. Makoto Hasegawa of AdTech Headquarters, the requirement for this foundation above all else is “a sense of speed”. As he explains, “When responding to requests such as ‘We’re starting a new service, so we want you to prepare a development server as quickly as possible… can’t you do it now?’ we didn’t want to spend much time on server construction or configuration of monitoring items.” What, then, would be required in order to prepare servers quickly when necessary, without spending much time? The answer is the “automation” this company is pursuing.
Mr. Makoto Hasegawa, AdTech Headquarters
To automate both server configuration and monitoring configuration, CyberAgent uses the open source software “Chef” for all kinds of configuration work. Zabbix was selected as the tool that adds monitoring item configuration automatically as part of server construction with Chef, so that monitoring is ready for immediate use. Chef and Zabbix are linked via API so that Zabbix Agent is installed during server construction, with configurations applied as necessary depending on the type of server.
Mr. Hasegawa explains: “In order to promote automation, it is necessary to make server construction an operation that is not labor-intensive. In that regard, Zabbix’s compatibility with automation was excellent, as all of its flexible configurations can be set via API, including discovery functions, monitoring intervals and calculated items.”
As a result, it was possible to eliminate the labor-intensive process of having a human visually check monitoring items one by one and perform configuration. In addition, Mr. Hasegawa says, “When we configure by hand, mistakes and configuration oversights inevitably occur. The advantage of automation is that it has enabled us to monitor everything at the same level.”
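The Chef-to-Zabbix link described above can be pictured with a minimal sketch of the kind of request a provisioning step might send to the Zabbix JSON-RPC API. This is not CyberAgent’s actual code: the `ROLE_TEMPLATES` mapping, the host name, the IDs and the token are all hypothetical, and the request body follows the `host.create` format of the Zabbix 2.x API, where the session token is passed in the `auth` field.

```python
import json


def build_host_create_request(hostname, ip, group_id, template_ids,
                              auth_token, request_id=1):
    """Build the JSON-RPC 2.0 body for the Zabbix API method host.create."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": hostname,
            "interfaces": [{
                "type": 1,        # 1 = Zabbix agent interface
                "main": 1,        # default interface for this type
                "useip": 1,       # connect by IP rather than DNS name
                "ip": ip,
                "dns": "",
                "port": "10050",  # default Zabbix agent port
            }],
            "groups": [{"groupid": group_id}],
            "templates": [{"templateid": t} for t in template_ids],
        },
        "auth": auth_token,       # token previously returned by user.login
        "id": request_id,
    }


# Hypothetical mapping from server type to the template IDs to attach.
ROLE_TEMPLATES = {
    "web": ["10001", "10093"],
    "db": ["10001", "10170"],
}

request = build_host_create_request(
    "web-001", "10.0.0.11", "2", ROLE_TEMPLATES["web"], "example-token")
body = json.dumps(request)  # this string would be POSTed to api_jsonrpc.php
print(body)
```

Choosing the templates from the server type is what lets every server of the same kind receive an identical set of monitoring items without manual checking.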
Monitoring items are managed with templates for each server type and purpose -- “It’s helpful that it’s easy to use,” says Mr. Hasegawa. But, he adds, “It is necessary to carry out version management of the actual templates as monitoring items and changes increase during continued use. For now we export templates and manage them with Git, but I would be grateful for a structure like diff, which would immediately show any differences between templates.”
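Until such a feature exists, exported templates kept in Git can be compared as text. The sketch below uses Python’s standard `difflib` on two simplified, hypothetical XML fragments; a real Zabbix template export contains many more elements.

```python
import difflib


def diff_templates(old_xml, new_xml, old_name="old", new_name="new"):
    """Return unified-diff lines between two exported template texts."""
    old_lines = old_xml.splitlines(keepends=True)
    new_lines = new_xml.splitlines(keepends=True)
    return list(difflib.unified_diff(
        old_lines, new_lines, fromfile=old_name, tofile=new_name))


# Simplified stand-ins for two versions of an exported template.
old = "<item>\n  <key>system.cpu.load</key>\n  <delay>30</delay>\n</item>\n"
new = "<item>\n  <key>system.cpu.load</key>\n  <delay>5</delay>\n</item>\n"

changes = diff_templates(old, new, "template_v1.xml", "template_v2.xml")
print("".join(changes))
```

A line-based diff like this is sensitive to element ordering in the export, so it works best when templates are always exported the same way.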
Detailed monitoring possible in units of seconds, together with flexible action configuration
CyberAgent has always actively used open source software, and individual departments were already using Zabbix on their own for server monitoring. Zabbix 2.2 and 2.4 are used in the automatic monitoring structure that AdTech Headquarters has implemented by linking with Chef. “One of the good things about Zabbix is its comprehensive backwards compatibility,” says Mr. Hasegawa. “We also value the fact that structures produced with 2.2 can also be used as they are in 2.4.”
Mr. Hasegawa continues: “Only the parts that call the API had to be written from scratch, and basically we were able to complete tasks simply by using the officially distributed rpm package, so the installation work was straightforward. We made adjustments such as increasing the cache capacity and memory in line with the scale of items to be monitored, but we were basically able to do this quite easily.” There was even “a mountain” of documents to refer to during development.
When setting about constructing a monitoring system, the company also considered other open source monitoring tools such as Nagios, Munin and Cacti. As Mr. Hasegawa recalls, “One of the monitoring requirements from a project was monitoring every 5 seconds, but we struggled to find anything that would support that. With Zabbix, though, we were able to satisfy the demand for detailed monitoring in units of seconds.”
At present, CyberAgent’s AdTech Headquarters conducts server group monitoring with approximately 30 Zabbix Servers. Each Zabbix Server monitors an average of around 20 to 30 servers. The average number of triggers is around 3,000 and the average number of items is around 5,000, but in exceptional cases these numbers can exceed 20,000 and 60,000 respectively. An average of 100 items are monitored every second, but Mr. Hasegawa says that “There is no particular problem with the performance of Zabbix itself” even under such circumstances. Instead, he says, they put greater effort into tuning the back-end database servers.
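To put these figures in proportion, a rough back-of-the-envelope calculation helps. It assumes that “100 items monitored every second” means roughly 100 new values collected per second on an average server, which is an interpretation rather than something the figures state outright. The 5-second case is a hypothetical upper bound, not a reported configuration.

```python
def values_per_second(item_count, interval_seconds):
    """Values collected per second if every item shares one interval."""
    return item_count / interval_seconds


def average_interval(item_count, values_per_sec):
    """Mean polling interval implied by an item count and a collection rate."""
    return item_count / values_per_sec


# Average server: about 5,000 items at about 100 values per second.
avg_interval = average_interval(5000, 100)

# Hypothetical worst case: all 5,000 items polled every 5 seconds.
worst_case_rate = values_per_second(5000, 5)

print(avg_interval)     # mean interval in seconds
print(worst_case_rate)  # values per second
```

Under that assumption the average item is polled about once every 50 seconds, so the 5-second items are a minority, and polling everything at 5 seconds would multiply the write load on the database tenfold.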
Mr. Hasegawa also speaks highly of the ability to flexibly configure “actions” that are executed in response to monitoring results. When an abnormality occurs, such as a monitored item exceeding its threshold, alerts can be sent by e-mail, and it is also possible to respond flexibly by posting information to chat tools and executing commands. Mr. Hasegawa explains: “In our company we use chat tools such as Chatwork and Slack, and we send monitoring notifications to these chat tools both when problems occur and when they are resolved.”
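One way such a chat notification can be wired up is a custom alert script, which Zabbix 2.x invokes with three arguments: recipient, subject and message. The following is a minimal sketch rather than the company’s script; the webhook URL is a placeholder, and the message layout is an arbitrary choice.

```python
import json
import sys
import urllib.request

# Placeholder only; a real incoming-webhook URL is issued by Slack.
WEBHOOK_URL = "https://hooks.slack.com/services/EXAMPLE/EXAMPLE/EXAMPLE"


def build_payload(subject, message):
    """Format a Zabbix alert as a Slack incoming-webhook payload."""
    return {"text": "*{}*\n{}".format(subject, message)}


def post_to_slack(payload, url=WEBHOOK_URL):
    """POST the payload as JSON and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def main(argv):
    # Zabbix passes: recipient, subject, message.
    _send_to, subject, message = argv[1:4]
    return post_to_slack(build_payload(subject, message))


# Only send when invoked with exactly the three alert-script arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv)
```

Because the same script runs for both problem and recovery messages, the subject line set in the Zabbix action is what distinguishes the two in the chat channel.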
Another valued feature is the ability to link with “Java Management Extensions (JMX)”, which enables both in-depth and general monitoring of Java applications. “With Zabbix,” Mr. Hasegawa says, “we can fetch various Java application values via JMX. There are likely very few tools that can directly implement this, and for us it is an extremely helpful feature.”
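In Zabbix, a JMX value is addressed by an item key that names an MBean object and one of its attributes, and collection goes through the Zabbix Java gateway. The helper below is a small illustrative sketch that formats such keys; the two MBeans shown are standard JVM ones, picked only as examples.

```python
def jmx_item_key(object_name, attribute):
    """Format a Zabbix JMX item key: jmx["<object name>","<attribute>"]."""
    return 'jmx["{}","{}"]'.format(object_name, attribute)


# Standard JVM MBeans, chosen here purely for illustration.
heap_used = jmx_item_key("java.lang:type=Memory", "HeapMemoryUsage.used")
thread_count = jmx_item_key("java.lang:type=Threading", "ThreadCount")

print(heap_used)
print(thread_count)
```

Keys built this way can be placed in a template so that every Java application server receives the same JMX items.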
Zabbix is a “tool that follows the flow of Infrastructure as a Code”
Mr. Hasegawa, who uses Zabbix in this way as a link in automation, also seems to actively enjoy using it. He has worked on a “theme” prepared in Zabbix’s management interface, completing it in Ameba style with green as the key color tone.
Going forwards, AdTech Headquarters will work on producing a structure that can gather in a centralized way the Zabbix information that is scattered here and there for each project. “For example,” Mr. Hasegawa explains, “we want to be able to display in a list on the dashboard the various service and product response times aggregated by Zabbix and ascertain at-a-glance the current response time and server alert situation, etc.”
They have also developed a proprietary tool known as “Blackbird” in order to acquire various data from middleware without adding load to monitoring targets. Data can also be acquired from middleware that does not have Zabbix Agent installed, such as cloud-based components, without forming a huge number of network connections. Blackbird is officially distributed as open source software on GitHub (https://github.com/Vagrants/Blackbird).
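This article does not describe how Blackbird delivers its data. As a general illustration of agentless collection, the sketch below builds a packet for the Zabbix sender (trapper) protocol, which batches many values into one connection. The host name and metric keys are hypothetical.

```python
import json
import struct


def build_sender_packet(host, metrics):
    """Pack several metric values into one Zabbix sender protocol packet."""
    payload = {
        "request": "sender data",
        "data": [
            {"host": host, "key": key, "value": str(value)}
            for key, value in metrics.items()
        ],
    }
    body = json.dumps(payload).encode("utf-8")
    # Header: "ZBXD", protocol version 1, then body length as 8-byte little-endian.
    return b"ZBXD\x01" + struct.pack("<Q", len(body)) + body


# Hypothetical values gathered from a middleware instance.
packet = build_sender_packet("cache-001", {
    "redis.connected_clients": 42,
    "redis.used_memory": 1048576,
})
print(len(packet))
```

Sending one packet with many values, rather than one connection per item, is what keeps the number of network connections low; the receiving items must be configured as trapper items on the Zabbix side.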
Based on his experience as an infrastructure engineer who has constructed and used a great number of servers, Mr. Hasegawa says: “Within the flow of automation, future infrastructure will likely take the form of code-built ‘Infrastructure as a Code’. As more and more new services are produced, we will have to use scripts to link these with infrastructure. Therefore, the infrastructure engineers of the future will be required to have the ability to write code and to understand code.”
When seen from that perspective, Mr. Hasegawa concludes that Zabbix “is fully loaded with the API required for linking and can do practically anything. We could describe it as a monitoring tool that is in harmony with the flow of Infrastructure as a Code.”
30 Zabbix Servers
Multi-tenant monitoring support, on-premises and on AWS.
Number of monitored devices: Average 20 to 30 / Max. 200 to 250 per Zabbix server.
Number of triggers: Average 3,000 / Max. 20,000 per Zabbix server.
Number of items: Average 5,000 / Max. 60,000 per Zabbix server.