KDDI Interview Operational efficiency case studies

KDDI had been using various monitoring applications for different systems until they standardized their monitoring tasks on Zabbix. The move integrated their support systems and improved operational efficiency.

Objective

To operate monitoring solutions internally, and to standardize monitoring and improve its efficiency

Requirements

Necessary functions for operation and monitoring are fully available, without relying on specific vendors or products

Ability to improve operation/monitoring within the company

Approach

Stipulated the policy for implementing monitoring

Created and deployed standard templates for Zabbix

Outcome

A common knowledge base necessary for setting up operation/monitoring has been created, and support systems have been integrated

The installation process has been simplified by standardization

Reduced the time to build and develop a monitoring system from three weeks to 30 minutes

Towards the Implementation of Monitoring Solutions That Can be Operated Independently

As a large telecommunications carrier, KDDI provides communication services centered on 5G technology and promotes digital transformation for clients, as well as operating a diverse range of businesses, including financial and energy sectors, inside and outside Japan.

The company started considering installing a new monitoring solution, because various monitoring applications had been used for different systems at that time. There were also efficiency issues, as operational tasks were carried out manually. In addition, on-site operators were concerned that the tasks depended on individual skills.

“As the monitoring servers differed depending on the equipment, additional learning costs arose every time an operator took charge of a new piece of equipment. We also wanted to reduce the cost of implementation, development and maintenance of monitoring equipment,” says KDDI’s Mr. Taro Kamiya (Expert, Cloud Engineering Department, Engineering Division).

KDDI considered introducing a system that can improve the situation, but they could not make any change, as the vendor to which they outsourced the development was also responsible for monitoring. Another option was to modify the monitoring solution provided by the vendor so that KDDI could manage it internally; however, Mr. Kamiya explains: “It would have created an issue of responsibility demarcation point with the vendor. In addition, we wanted to resolve the situation where different monitoring systems were used depending on the system. For these reasons, we decided to consider a monitoring solution that we could operate within the company.”

Also, the monitoring content varied depending on the equipment, as monitoring items were designed or built for each piece of equipment, and the quality of monitoring was not uniform.

Because of these circumstances, Mr. Kamiya’s team started working towards standardization of monitoring. First of all, they stipulated the policy to standardize the implementation of monitoring. They documented the purposes and implementation examples of monitoring items to clarify the minimum requirements. The information was then provided to their development partners so that each department of KDDI could share the same understanding between them. Thus, the monitoring implementation policy was developed in preparation for introducing the standardization of monitoring.

Seeking the Best Monitoring Solution That Meets the Needs

Mr. Mitsuru Kawamata
Group Leader
System Asset Section
Cloud Engineering Department
Engineering Division
KDDI CORPORATION

Before installing a monitoring solution, KDDI compared and reviewed various tools. They narrowed them down to Zabbix and one other solution. Mr. Kamiya explains the reason why they finally chose Zabbix: “We had used Zabbix in the past, and a lot of our staff had some basic knowledge of it. In addition, more importantly, Zabbix is equipped with almost all functions required for monitoring operations, such as a statistics function.”

He also added, “Since our team mainly monitors logs, we needed to have a solution for that purpose, and Zabbix fit the bill. Most other solutions were designed to monitor service status, and some of them were not suited for long-term data retention or log monitoring.” Although Mr. Kamiya had been concerned about the performance of log monitoring in older versions, including Zabbix 1.8, which he had used before, he says, “When we installed Zabbix 5.0/6.0, we confirmed that the performance had been significantly improved, and determined it was safe to use.”

Mr. Kamiya also values the fact that monitoring settings such as item and trigger settings can be configured through the GUI, and that the details of escalating notifications to higher levels can be customized. “I think Zabbix’s GUI is user-friendly and intuitive, making it easy for even first-time users to set up monitoring. It is also advantageous that the API can be used to link with other systems when polling,” says Mr. Kamiya.

Integration of Monitoring Infrastructures Reduced Work Time from Three Weeks to 30 Minutes

Mr. Taro Kamiya
Expert
Cloud Engineering Department
Engineering Division
KDDI CORPORATION

Although the installation of Zabbix went smoothly, Mr. Kamiya says that tuning it was a little difficult. “In order to provide support internally, we enabled functions for basic tuning during the standardization phase, referring to examples of tuning of other systems. In particular, it was a little difficult to tune MariaDB, which we currently use, since some parts of it may not function properly when it becomes large,” says Mr. Kamiya.

Despite that, the installation made a significant impact. Firstly, a common knowledge base that is necessary for setting up operation/monitoring has been created. This has not only reduced the cost but also made training for on-site operators easier. “Previously, there were a lot of things only certain staff knew, and even finding the right person to ask for help was hard work. But now, our team is acting like a Zabbix support department. Inquiries are coming in from departments we’ve never worked with before, which is challenging to handle; however, the efficiency has been improved thanks to the integration of support systems,” Mr. Kamiya states.

Furthermore, as a result of implementing a system that automatically applies the policy, “The process of building, applying templates, and setting up a monitoring system now takes approximately 30 minutes to complete. It used to take about three weeks, so this is a significant time saving,” says Mr. Kamiya. “Also, since the policy is applied to all of the hundreds of servers, without any omissions, it contributes to improving the quality. Container monitoring can also be automated by utilizing LLD systems,” he adds, referring to Zabbix’s low-level discovery (LLD).
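As a rough illustration of how template application could be automated, the sketch below builds a JSON-RPC request for the Zabbix API method `host.create`, linking a new host to a host group and a standard template. The host name, IP address, group ID, and template ID are hypothetical placeholders, and this is not KDDI's actual tooling; authentication and the HTTP call are deliberately left out.

```python
import json


def build_host_create_request(host, ip, group_id, template_id, request_id=1):
    """Build a JSON-RPC 2.0 payload for the Zabbix API method host.create.

    All argument values are supplied by the caller; nothing here is taken
    from KDDI's environment.
    """
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host,
            # One Zabbix agent interface (type 1) addressed by IP.
            "interfaces": [
                {
                    "type": 1,
                    "main": 1,
                    "useip": 1,
                    "ip": ip,
                    "dns": "",
                    "port": "10050",
                }
            ],
            "groups": [{"groupid": group_id}],
            # Linking the standard template applies its items and triggers.
            "templates": [{"templateid": template_id}],
        },
        "id": request_id,
    }


# Hypothetical values for illustration only.
payload = build_host_create_request("app-server-01", "192.0.2.10", "2", "10001")
print(json.dumps(payload, indent=2))
```

Sending such a payload to the server's `api_jsonrpc.php` endpoint with a valid API token would create the host; how the token is passed depends on the Zabbix version in use.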

Switching to internal support operation has not caused any serious issues, Mr. Kamiya says. It has actually enabled them to provide faster support, since users can now ask the person in charge within the organization, who has experience of using Zabbix. Mr. Kamiya has also set up a Q&A function in Microsoft Teams, which is used internally, where users can ask questions about Zabbix. Their target is to resolve an issue within three working days, and answers are provided together with procedure notes.

Standardization of Monitoring Items Solved the Issue of Variance Between Tasks

After the installation of Zabbix, monitoring items were also standardized. Mr. Mitsuru Kawamata, Group Leader of the System Asset Section, says: “Previously, when installing equipment, it was necessary to check whether each piece of equipment ran monitoring properly, as it was individually designed. The standardization has enabled us to omit this installation process. There is no need to check monitoring items now, so approval procedures can be simplified.”

When implementing standardization, they aimed to include a feature that enables the setting up of monitoring items in accordance with the monitoring implementation policy stipulated by KDDI. In addition, KDDI produced standard templates for Zabbix, which comply with the policy. They also made it possible to tailor monitoring, depending on the environment, by changing macros. “I also think that the ease of use of the templates is an advantage of Zabbix,” says Mr. Kawamata.
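The macro-based tailoring described above can be pictured as follows. In Zabbix, a user macro defined on a host takes precedence over the same macro defined on a linked template. The sketch mimics that precedence with plain dictionaries; the macro names and values are hypothetical and do not come from KDDI's templates.

```python
import re

# Hypothetical defaults that a standard template might define.
TEMPLATE_MACROS = {"{$CPU.UTIL.CRIT}": "90", "{$LOG.PATTERN}": "ERROR"}


def resolve_macros(template_macros, host_macros):
    """Return effective macro values: host-level values override template defaults."""
    effective = dict(template_macros)
    effective.update(host_macros)
    return effective


def expand(text, macros):
    """Replace each {$NAME} user macro in text with its effective value.

    Macros without a known value are left unchanged.
    """
    return re.sub(
        r"\{\$[A-Z0-9_.]+\}",
        lambda m: macros.get(m.group(0), m.group(0)),
        text,
    )


# A host in a busier environment raises only the CPU threshold.
effective = resolve_macros(TEMPLATE_MACROS, {"{$CPU.UTIL.CRIT}": "95"})
print(expand("CPU utilization above {$CPU.UTIL.CRIT}%", effective))
```

Only the overridden macro changes; every other setting still comes from the shared template, which is what keeps the monitoring content uniform across environments.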

Zabbix to be Installed for the Equipment for Networks and Mobile Communications

Now that KDDI has established a monitoring system for servers, they are planning to expand the use of Zabbix to other equipment, such as network and mobile communication equipment. They believe that sharing know-how and concerns helps teams solve issues together, which in turn improves monitoring quality. The number of monitored hosts is planned to increase to about 5,000 each for server and mobile communication equipment, and up to about 20,000 for network equipment.

The first task in realizing this will be to standardize baseline monitoring. This will make it possible to define KPIs for each piece of equipment to check service usage, determine abnormality based on those KPIs, and detect failures in real time. KDDI is also considering standardizing the use of machine learning with Zabbix to analyze device-based metric data such as CPU usage, determine whether abnormalities exist, and detect failures.
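To make the idea of KPI-based abnormality detection concrete, here is a minimal sketch that flags a CPU usage sample when it deviates from a baseline by more than a chosen number of standard deviations. The three-sigma threshold and the sample values are assumptions for illustration; the source does not describe the method KDDI intends to use.

```python
from statistics import mean, stdev


def is_anomalous(history, value, max_sigma=3.0):
    """Return True if value lies more than max_sigma standard deviations
    from the mean of history (a simple z-score check)."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A flat history: any different value counts as a deviation.
        return value != baseline
    return abs(value - baseline) / spread > max_sigma


# Hypothetical CPU usage samples in percent.
cpu_history = [21.0, 23.5, 22.0, 24.0, 22.5, 23.0, 21.5, 22.0]
print(is_anomalous(cpu_history, 23.0))
print(is_anomalous(cpu_history, 85.0))
```

A fixed threshold such as "CPU above 90%" treats every host the same, whereas a baseline check of this kind compares each host against its own recent behavior.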

“We will continue working to improve monitoring on Zabbix, as well as enhancing it to set the standard for equipment operation,” says Mr. Kamiya.

System Overview

Number of Zabbix Servers: more than 10 standard Zabbix servers (more than 100 including others)
Number of Zabbix Proxies: none
Redundancy: Yes. Active-Active
Number of monitored devices: approximately 10 to 1,000 per system
Number of triggers: approximately 10 to 1,000 per system
Number of items: maximum 400,000
Number of users: more than 3 per system
NVPS: 233 NVPS (14,000/min) to 300 NVPS (18,000/min)
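
The per-minute figures follow from multiplying new values per second (NVPS) by 60. A quick check of the arithmetic, noting that the 14,000/min figure in the list is rounded:

```python
def values_per_minute(nvps):
    """Convert new values per second (NVPS) to values per minute."""
    return nvps * 60


print(values_per_minute(233))  # 13980, listed as roughly 14,000/min
print(values_per_minute(300))  # 18000
```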

KDDI Corporation

KDDI, as a comprehensive communications company offering both fixed-line and mobile communications services, strives to be a leading company during changing times.

For individual customers, KDDI offers its mobile communications (au mobile phone) and fixed-line communications (broadband Internet/telephone) services under the brand name "au", helping to realize new seamless communications environments. And for business clients, KDDI provides all services in the ICT (Information and Communication Technology) realm, from FMC (Fixed Mobile Convergence) networks to data centers, applications, and security strategies, to help clients strengthen their businesses.

Head office:
Tokyo, Japan
Founded:
1984
Employees:
61,037 (March 2024)
Capital:
141,85mil.Yen
www.kddi.com
