Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/jenkins?at=release/7.0
Jenkins by HTTP
Overview
This template monitors Jenkins with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Requirements
Zabbix version: 7.0 and higher.
Tested versions
This template has been tested on:
- Jenkins 2.263.1
Configuration
Zabbix should be configured according to the instructions in the Templates out of the box section.
Setup
Metrics are collected by requests to the Metrics API. For common metrics: install the Metrics plugin and configure its parameters according to the official documentation. Do not forget to configure access to the Metrics Servlet by issuing an API key, and set the macro {$JENKINS.API.KEY} to that key.
For monitoring computers and builds: create an API token for the monitoring user according to the official documentation and set the macros {$JENKINS.USER} and {$JENKINS.API.TOKEN}. Don't forget to set the macro {$JENKINS.URL} as well.
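As a rough illustration of the two access paths described above, the following Python sketch builds the request URLs and the HTTP Basic header. The `/metrics/<key>/metrics`, `/metrics/<key>/healthcheck` and `/metrics/<key>/ping` paths are those exposed by the Metrics plugin servlet, and `/computer/api/json` belongs to the Jenkins remote API. Which requests the template itself issues is an assumption here, and the host name, key, and token are placeholders.

```python
import base64

# Placeholder values standing in for the template macros (hypothetical).
JENKINS_URL = "http://jenkins.example.com:8080"   # {$JENKINS.URL}
API_KEY = "metrics-api-key"                       # {$JENKINS.API.KEY}
USER = "zabbix"                                   # {$JENKINS.USER}
API_TOKEN = "user-api-token"                      # {$JENKINS.API.TOKEN}


def metrics_url(base: str, key: str, endpoint: str) -> str:
    """URL of a Metrics servlet endpoint; the API key is part of the path."""
    return f"{base.rstrip('/')}/metrics/{key}/{endpoint}"


def basic_auth_header(user: str, token: str) -> dict:
    """HTTP Basic header carrying the user name and the API token."""
    raw = f"{user}:{token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Common metrics: authenticated by the API key in the URL.
print(metrics_url(JENKINS_URL, API_KEY, "metrics"))
print(metrics_url(JENKINS_URL, API_KEY, "healthcheck"))
print(metrics_url(JENKINS_URL, API_KEY, "ping"))

# Computers and builds: authenticated by user name and API token.
print(f"{JENKINS_URL}/computer/api/json", basic_auth_header(USER, API_TOKEN))
```

Keeping the API key and the user token separate mirrors the two macros groups: the key only grants access to the Metrics servlet, while the token belongs to a regular Jenkins user.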
Macros used
Name | Description | Default |
---|---|---|
{$JENKINS.URL} | Jenkins URL in the format | |
{$JENKINS.API.KEY} | API key to access Metrics Servlet | |
{$JENKINS.USER} | Username for HTTP BASIC authentication | zabbix |
{$JENKINS.API.TOKEN} | API token for HTTP BASIC authentication. | |
{$JENKINS.PING.REPLY} | Expected reply to the ping. | pong |
{$JENKINS.FILE_DESCRIPTORS.MAX.WARN} | Maximum percentage of file descriptors usage alert threshold (for trigger expression). | 85 |
{$JENKINS.JOB.HEALTH.SCORE.MIN.WARN} | Minimum job's health score (for trigger expression). | 50 |
Items
Name | Description | Type | Key and additional info |
---|---|---|---|
Get service metrics | | HTTP agent | jenkins.get_metrics |
Get healthcheck | | HTTP agent | jenkins.healthcheck |
Get jobs info | | HTTP agent | jenkins.job_info |
Get computer info | | HTTP agent | jenkins.computer_info |
Disk space check message | The message will reference the first node which fails this check. There may be other nodes that fail the check, but this health check is designed to fail fast. | Dependent item | jenkins.disk_space.message |
Temporary space check message | The message will reference the first node which fails this check. There may be other nodes that fail the check, but this health check is designed to fail fast. | Dependent item | jenkins.temporary_space.message |
Plugins check message | The message of plugins health check. | Dependent item | jenkins.plugins.message |
Thread deadlock check message | The message of thread deadlock health check. | Dependent item | jenkins.thread_deadlock.message |
Disk space check | Returns FAIL if any of the Jenkins disk space monitors are reporting the disk space as less than the configured threshold. | Dependent item | jenkins.disk_space |
Plugins check | Returns FAIL if any of the Jenkins plugins failed to start. | Dependent item | jenkins.plugins |
Temporary space check | Returns FAIL if any of the Jenkins temporary space monitors are reporting the temporary space as less than the configured threshold. | Dependent item | jenkins.temporary_space |
Thread deadlock check | Returns FAIL if there are any deadlocked threads in the Jenkins master JVM. | Dependent item | jenkins.thread_deadlock |
Get gauges | Raw items for gauges metrics. | Dependent item | jenkins.gauges.raw |
Executors count | The number of executors available to Jenkins. This corresponds to the sum of all the executors of all the online nodes. | Dependent item | jenkins.executor.count |
Executors free | The number of executors available to Jenkins that are not currently in use. | Dependent item | jenkins.executor.free |
Executors in use | The number of executors available to Jenkins that are currently in use. | Dependent item | jenkins.executor.in_use |
Nodes count | The number of build nodes available to Jenkins, both online and offline. | Dependent item | jenkins.node.count |
Nodes offline | The number of build nodes available to Jenkins but currently offline. | Dependent item | jenkins.node.offline |
Nodes online | The number of build nodes available to Jenkins and currently online. | Dependent item | jenkins.node.online |
Plugins active | The number of plugins in the Jenkins instance that started successfully. | Dependent item | jenkins.plugins.active |
Plugins failed | The number of plugins in the Jenkins instance that failed to start. A value other than 0 is typically indicative of a potential issue within the Jenkins installation that will either be solved by explicitly disabling the plugin(s) or by resolving the plugin dependency issues. | Dependent item | jenkins.plugins.failed |
Plugins inactive | The number of plugins in the Jenkins instance that are not currently enabled. | Dependent item | jenkins.plugins.inactive |
Plugins with update | The number of plugins in the Jenkins instance that have a newer version reported as available in the current Jenkins update center metadata held by Jenkins. This value is not indicative of an issue with Jenkins, but high values can be used as a trigger to review the plugins with updates and check whether those updates contain fixes for issues that could be affecting your Jenkins instance. | Dependent item | jenkins.plugins.with_update |
Projects count | The number of projects. | Dependent item | jenkins.project.count |
Jobs count | The number of jobs in Jenkins. | Dependent item | jenkins.job.count.value |
Get meters | Raw items for meters metrics. | Dependent item | jenkins.meters.raw |
Job scheduled, m1 rate | The rate at which jobs are scheduled. If a job is already in the queue and an identical request for scheduling the job is received then Jenkins will coalesce the two requests. This metric gives a reasonably pure measure of the load requirements of the Jenkins master as it is unaffected by the number of executors available to the system. | Dependent item | jenkins.job.scheduled.m1.rate |
Jobs scheduled, m5 rate | The rate at which jobs are scheduled. If a job is already in the queue and an identical request for scheduling the job is received then Jenkins will coalesce the two requests. This metric gives a reasonably pure measure of the load requirements of the Jenkins master as it is unaffected by the number of executors available to the system. | Dependent item | jenkins.job.scheduled.m5.rate |
Get timers | Raw items for timers metrics. | Dependent item | jenkins.timers.raw |
Job blocked, m1 rate | The rate at which jobs in the build queue enter the blocked state. | Dependent item | jenkins.job.blocked.m1.rate |
Job blocked, m5 rate | The rate at which jobs in the build queue enter the blocked state. | Dependent item | jenkins.job.blocked.m5.rate |
Job blocked duration, p95 | The amount of time which jobs spend in the blocked state. | Dependent item | jenkins.job.blocked.duration.p95 |
Job blocked duration, median | The amount of time which jobs spend in the blocked state. | Dependent item | jenkins.job.blocked.duration.p50 |
Job building, m1 rate | The rate at which jobs are built. | Dependent item | jenkins.job.building.m1.rate |
Job building, m5 rate | The rate at which jobs are built. | Dependent item | jenkins.job.building.m5.rate |
Job building duration, p95 | The amount of time which jobs spend building. | Dependent item | jenkins.job.building.duration.p95 |
Job building duration, median | The amount of time which jobs spend building. | Dependent item | jenkins.job.building.duration.p50 |
Job buildable, m1 rate | The rate at which jobs in the build queue enter the buildable state. | Dependent item | jenkins.job.buildable.m1.rate |
Job buildable, m5 rate | The rate at which jobs in the build queue enter the buildable state. | Dependent item | jenkins.job.buildable.m5.rate |
Job buildable duration, p95 | The amount of time which jobs spend in the buildable state. | Dependent item | jenkins.job.buildable.duration.p95 |
Job buildable duration, median | The amount of time which jobs spend in the buildable state. | Dependent item | jenkins.job.buildable.duration.p50 |
Job queuing, m1 rate | The rate at which jobs are queued. | Dependent item | jenkins.job.queuing.m1.rate |
Job queuing, m5 rate | The rate at which jobs are queued. | Dependent item | jenkins.job.queuing.m5.rate |
Job queuing duration, p95 | The total time which jobs spend in the build queue. | Dependent item | jenkins.job.queuing.duration.p95 |
Job queuing duration, median | The total time which jobs spend in the build queue. | Dependent item | jenkins.job.queuing.duration.p50 |
Job total, m1 rate | The rate at which jobs are queued. | Dependent item | jenkins.job.total.m1.rate |
Job total, m5 rate | The rate at which jobs are queued. | Dependent item | jenkins.job.total.m5.rate |
Job total duration, p95 | The total time which jobs spend from entering the build queue to completing building. | Dependent item | jenkins.job.total.duration.p95 |
Job total duration, median | The total time which jobs spend from entering the build queue to completing building. | Dependent item | jenkins.job.total.duration.p50 |
Job waiting, m1 rate | The rate at which jobs enter the quiet period. | Dependent item | jenkins.job.waiting.m1.rate |
Job waiting, m5 rate | The rate at which jobs enter the quiet period. | Dependent item | jenkins.job.waiting.m5.rate |
Job waiting duration, p95 | The total amount of time that jobs spend in their quiet period. | Dependent item | jenkins.job.waiting.duration.p95 |
Job waiting duration, median | The total amount of time that jobs spend in their quiet period. | Dependent item | jenkins.job.waiting.duration.p50 |
Build queue, blocked | The number of jobs that are in the Jenkins build queue and currently in the blocked state. | Dependent item | jenkins.queue.blocked |
Build queue, size | The number of jobs that are in the Jenkins build queue. | Dependent item | jenkins.queue.size |
Build queue, buildable | The number of jobs that are in the Jenkins build queue and currently in the buildable state. | Dependent item | jenkins.queue.buildable |
Build queue, pending | The number of jobs that are in the Jenkins build queue and currently in the pending state. | Dependent item | jenkins.queue.pending |
Build queue, stuck | The number of jobs that are in the Jenkins build queue and currently in the stuck state. | Dependent item | jenkins.queue.stuck |
HTTP active requests, rate | The number of currently active requests against the Jenkins master Web UI. | Dependent item | jenkins.http.active_requests.rate |
HTTP response 400, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/400 status code. | Dependent item | jenkins.http.bad_request.rate |
HTTP response 500, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/500 status code. | Dependent item | jenkins.http.server_error.rate |
HTTP response 503, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/503 status code. | Dependent item | jenkins.http.service_unavailable.rate |
HTTP response 200, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/200 status code. | Dependent item | jenkins.http.ok.rate |
HTTP response other, rate | The rate at which the Jenkins master Web UI is responding to requests with a non-informational status code that is not in the list: HTTP/200, HTTP/201, HTTP/204, HTTP/304, HTTP/400, HTTP/403, HTTP/404, HTTP/500, or HTTP/503. | Dependent item | jenkins.http.other.rate |
HTTP response 201, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/201 status code. | Dependent item | jenkins.http.created.rate |
HTTP response 204, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/204 status code. | Dependent item | jenkins.http.no_content.rate |
HTTP response 404, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/404 status code. | Dependent item | jenkins.http.not_found.rate |
HTTP response 304, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/304 status code. | Dependent item | jenkins.http.not_modified.rate |
HTTP response 403, rate | The rate at which the Jenkins master Web UI is responding to requests with an HTTP/403 status code. | Dependent item | jenkins.http.forbidden.rate |
HTTP requests, rate | The rate at which the Jenkins master Web UI is receiving requests. | Dependent item | jenkins.http.requests.rate |
HTTP requests, p95 | The time spent generating the corresponding responses. | Dependent item | jenkins.http.requests_p95.rate |
HTTP requests, median | The time spent generating the corresponding responses. | Dependent item | jenkins.http.requests_p50.rate |
Version | Version of Jenkins server. | Dependent item | jenkins.version |
CPU Load | The system load on the Jenkins master as reported by the JVM's Operating System JMX bean. The calculation of system load is operating system dependent. Typically this is the sum of the number of processes that are currently running plus the number that are waiting to run. This is typically comparable against the number of CPU cores. | Dependent item | jenkins.system.cpu.load |
Uptime | The number of seconds since the Jenkins master JVM started. | Dependent item | jenkins.system.uptime |
File descriptor ratio | The ratio of used to total file descriptors. | Dependent item | jenkins.descriptor.ratio |
Service ping | | HTTP agent | jenkins.ping |
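The dependent items above all derive from a handful of master HTTP agent items. A minimal Python sketch of that pattern: one bulk document is parsed once, and individual values are pulled out of it by name. The sample payload follows the general Dropwizard Metrics JSON layout (`gauges`, `meters`, `timers`), but its metric names and values, and the mapping to item keys, are assumptions made for illustration rather than the template's actual preprocessing steps.

```python
import json

# Hypothetical excerpt of a bulk metrics response (values invented).
SAMPLE = json.loads("""
{
  "gauges": {
    "jenkins.executor.count.value": {"value": 4},
    "jenkins.executor.free.value": {"value": 3}
  },
  "meters": {
    "jenkins.job.scheduled": {"m1_rate": 0.2, "m5_rate": 0.1}
  },
  "timers": {
    "jenkins.job.building.duration": {"p50": 12.5, "p95": 40.0, "m1_rate": 0.05}
  }
}
""")


def extract(doc: dict, section: str, metric: str, field: str):
    """Return one field of one metric, or None if any level is absent."""
    return doc.get(section, {}).get(metric, {}).get(field)


# Assumed mapping from Zabbix item keys to (section, metric, field).
ITEMS = {
    "jenkins.executor.count": ("gauges", "jenkins.executor.count.value", "value"),
    "jenkins.job.scheduled.m1.rate": ("meters", "jenkins.job.scheduled", "m1_rate"),
    "jenkins.job.building.duration.p95": ("timers", "jenkins.job.building.duration", "p95"),
}

# One parsed document feeds every dependent value; no further requests are made.
values = {key: extract(SAMPLE, *path) for key, path in ITEMS.items()}
print(values)
```

The point of the design is that the number of HTTP requests stays constant no matter how many dependent items are defined.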
Triggers
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Jenkins: Disk space is too low | Jenkins disk space monitors are reporting the disk space as less than the configured threshold. The message will reference the first node which fails this check. | last(/Jenkins by HTTP/jenkins.disk_space)=0 and length(last(/Jenkins by HTTP/jenkins.disk_space.message))>0 | Warning | |
Jenkins: One or more Jenkins plugins failed to start | A failure is typically indicative of a potential issue within the Jenkins installation that will either be solved by explicitly disabling the failing plugin(s) or by resolving the corresponding plugin dependency issues. | last(/Jenkins by HTTP/jenkins.plugins)=0 and length(last(/Jenkins by HTTP/jenkins.plugins.message))>0 | Info | Manual close: Yes |
Jenkins: Temporary space is too low | Jenkins temporary space monitors are reporting the temporary space as less than the configured threshold. The message will reference the first node which fails this check. | last(/Jenkins by HTTP/jenkins.temporary_space)=0 and length(last(/Jenkins by HTTP/jenkins.temporary_space.message))>0 | Warning | |
Jenkins: There are deadlocked threads in Jenkins master JVM | There are deadlocked threads in the Jenkins master JVM. | last(/Jenkins by HTTP/jenkins.thread_deadlock)=0 and length(last(/Jenkins by HTTP/jenkins.thread_deadlock.message))>0 | Warning | |
Jenkins: Service has no online nodes | | last(/Jenkins by HTTP/jenkins.node.online)=0 | Average | |
Jenkins: Version has changed | The Jenkins version has changed. Acknowledge to close the problem manually. | last(/Jenkins by HTTP/jenkins.version,#1)<>last(/Jenkins by HTTP/jenkins.version,#2) and length(last(/Jenkins by HTTP/jenkins.version))>0 | Info | Manual close: Yes |
Jenkins: Host has been restarted | Uptime is less than 10 minutes. | last(/Jenkins by HTTP/jenkins.system.uptime)<10m | Info | Manual close: Yes |
Jenkins: Current number of used files is too high | | min(/Jenkins by HTTP/jenkins.descriptor.ratio,5m)>{$JENKINS.FILE_DESCRIPTORS.MAX.WARN} | Warning | |
Jenkins: Service is down | | last(/Jenkins by HTTP/jenkins.ping)=0 | Average | Manual close: Yes |
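To make the trigger semantics above more concrete, here is a small Python sketch of two of the expressions. It assumes that `min(...,5m)` takes the smallest value collected in the last five minutes and that `last(...,#1)` and `last(...,#2)` are the most recent and the previous value; the function names and sample data are hypothetical.

```python
from typing import Sequence

FILE_DESCRIPTORS_MAX_WARN = 85  # default of {$JENKINS.FILE_DESCRIPTORS.MAX.WARN}


def descriptors_too_high(window: Sequence[float],
                         threshold: float = FILE_DESCRIPTORS_MAX_WARN) -> bool:
    """Mirror of min(window) > threshold: fires only if every sample is above it."""
    return len(window) > 0 and min(window) > threshold


def version_changed(history: Sequence[str]) -> bool:
    """Mirror of last(#1) <> last(#2) and length(last()) > 0; newest value is last."""
    if len(history) < 2:
        return False
    return history[-1] != history[-2] and len(history[-1]) > 0


print(descriptors_too_high([91.0, 88.5, 86.2]))  # every sample is above the threshold
print(descriptors_too_high([91.0, 70.0, 86.2]))  # one sample dips below the threshold
print(version_changed(["2.263.1", "2.263.2"]))   # the two latest values differ
```

Using the minimum over a window rather than the last value means a single spike does not raise the problem; the usage has to stay high for the whole interval.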
LLD rule Jobs discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Jobs discovery | | HTTP agent | jenkins.jobs |
Item prototypes for Jobs discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Job [{#NAME}]: Get job | Raw data for a job. | Dependent item | jenkins.job.get[{#NAME}] |
Job [{#NAME}]: Health score | Represents the health of the project as a number between 0 and 100. Job description: {#DESCRIPTION} Job URL: {#URL} | Dependent item | jenkins.build.health[{#NAME}] |
Job [{#NAME}]: Last Build number | Details: {#URL}/lastBuild/ | Dependent item | jenkins.last_build.number[{#NAME}] |
Job [{#NAME}]: Last Build duration | Build duration (in seconds). | Dependent item | jenkins.last_build.duration[{#NAME}] |
Job [{#NAME}]: Last Build timestamp | | Dependent item | jenkins.last_build.timestamp[{#NAME}] |
Job [{#NAME}]: Last Build result | | Dependent item | jenkins.last_build.result[{#NAME}] |
Job [{#NAME}]: Last Failed Build number | Details: {#URL}/lastFailedBuild/ | Dependent item | jenkins.last_failed_build.number[{#NAME}] |
Job [{#NAME}]: Last Failed Build duration | Build duration (in seconds). | Dependent item | jenkins.last_failed_build.duration[{#NAME}] |
Job [{#NAME}]: Last Failed Build timestamp | | Dependent item | jenkins.last_failed_build.timestamp[{#NAME}] |
Job [{#NAME}]: Last Successful Build number | Details: {#URL}/lastSuccessfulBuild/ | Dependent item | jenkins.last_successful_build.number[{#NAME}] |
Job [{#NAME}]: Last Successful Build duration | Build duration (in seconds). | Dependent item | jenkins.last_successful_build.duration[{#NAME}] |
Job [{#NAME}]: Last Successful Build timestamp | | Dependent item | jenkins.last_successful_build.timestamp[{#NAME}] |
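The item prototypes above rely on the LLD macros {#NAME}, {#URL} and {#DESCRIPTION}. The Python sketch below shows one plausible way a job list could be turned into such discovery rows. The input fields (`name`, `url`, `description`) exist in the Jenkins JSON API, but the exact query and preprocessing used by the template are not shown in this document, so the mapping should be read as an assumption.

```python
import json

# Hypothetical job list, shaped like the "jobs" array of a Jenkins /api/json reply.
jobs_reply = {
    "jobs": [
        {"name": "build-app",
         "url": "http://jenkins.example.com/job/build-app/",
         "description": "Nightly build"},
        {"name": "deploy",
         "url": "http://jenkins.example.com/job/deploy/",
         "description": None},
    ]
}


def to_lld(reply: dict) -> list:
    """Convert a job list into LLD rows keyed by {#NAME}, {#URL}, {#DESCRIPTION}."""
    rows = []
    for job in reply.get("jobs", []):
        rows.append({
            "{#NAME}": job["name"],
            "{#URL}": job["url"],
            # A missing or null description becomes an empty string.
            "{#DESCRIPTION}": job.get("description") or "",
        })
    return rows


lld_rows = to_lld(jobs_reply)
print(json.dumps(lld_rows, indent=2))
```

Each row then yields one set of items from the prototypes, for example `jenkins.build.health[build-app]`.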
Trigger prototypes for Jobs discovery
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Jenkins: Job [{#NAME}]: Job is unhealthy | | last(/Jenkins by HTTP/jenkins.build.health[{#NAME}])<{$JENKINS.JOB.HEALTH.SCORE.MIN.WARN} | Warning | Manual close: Yes |
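A short Python sketch of the health check behind this trigger prototype follows. It assumes the job document carries a `healthReport` list whose entries have a 0-100 `score`, as in the Jenkins JSON API, and it takes the lowest score as the job's health. How the template itself reduces several reports to one value is not stated here, so that choice is an assumption, and the sample job is invented.

```python
JOB_HEALTH_SCORE_MIN_WARN = 50  # default of {$JENKINS.JOB.HEALTH.SCORE.MIN.WARN}

# Hypothetical job document; "healthReport" entries carry a 0-100 "score".
job = {
    "name": "build-app",
    "healthReport": [
        {"description": "Build stability: 2 out of the last 5 builds failed.", "score": 60},
        {"description": "Test Result: 12 tests failing.", "score": 40},
    ],
}


def health_score(job_doc: dict):
    """Lowest score among the job's health reports, or None if there are none."""
    scores = [report["score"] for report in job_doc.get("healthReport", [])]
    return min(scores) if scores else None


def is_unhealthy(score, threshold: int = JOB_HEALTH_SCORE_MIN_WARN) -> bool:
    """Mirror of last(health) < threshold; a missing score never fires."""
    return score is not None and score < threshold


score = health_score(job)
print(score, is_unhealthy(score))
```

Note that the comparison is strict, so a job scoring exactly the threshold value does not raise the problem.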
LLD rule Computers discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Computers discovery | | HTTP agent | jenkins.computers |
Item prototypes for Computers discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Computer [{#DISPLAY_NAME}]: Get computer | Raw data for a computer. | Dependent item | jenkins.computer.get[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Executors | The maximum number of concurrent builds that Jenkins may perform on this node. | Dependent item | jenkins.computer.numExecutors[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: State | Represents the actual online/offline state. Node description: {#DESCRIPTION} | Dependent item | jenkins.computer.state[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Offline cause reason | If the computer is offline (either temporarily or not), returns the cause as a string (without user info). Returns an empty string if the system was put offline without a given cause. | Dependent item | jenkins.computer.offline.reason[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Idle | Returns true if all the executors of this computer are idle. | Dependent item | jenkins.computer.idle[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Temporarily offline | Returns true if this node is marked temporarily offline. | Dependent item | jenkins.computer.temp_offline[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Available disk space | The available disk space of $JENKINS_HOME on the agent. | Dependent item | jenkins.computer.disk_space[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Available temp space | The available disk space of the temporary directory. Java tools and tests/builds often create files in the temporary directory, and may not function properly if there's no available space. | Dependent item | jenkins.computer.temp_space[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Response time average | The round trip network response time from the master to the agent. | Dependent item | jenkins.computer.response_time[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Available physical memory | The available physical memory of the system, in bytes. | Dependent item | jenkins.computer.available_physical_memory[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Available swap space | Available swap space, in bytes. | Dependent item | jenkins.computer.available_swap_space[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Total physical memory | Total physical memory of the system, in bytes. | Dependent item | jenkins.computer.total_physical_memory[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Total swap space | Total swap space, in bytes. | Dependent item | jenkins.computer.total_swap_space[{#DISPLAY_NAME}] |
Computer [{#DISPLAY_NAME}]: Clock difference | The clock difference between the master and nodes. | Dependent item | jenkins.computer.clock_difference[{#DISPLAY_NAME}] |
Trigger prototypes for Computers discovery
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Jenkins: Computer [{#DISPLAY_NAME}]: Node is down | Node down with reason: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} | last(/Jenkins by HTTP/jenkins.computer.state[{#DISPLAY_NAME}])=1 and length(last(/Jenkins by HTTP/jenkins.computer.offline.reason[{#DISPLAY_NAME}]))>0 | Average | Depends on: |
Jenkins: Computer [{#DISPLAY_NAME}]: Node is temporarily offline | Node is temporarily offline with reason: {{ITEM.LASTVALUE2}.regsub("(.*)",\1)} | last(/Jenkins by HTTP/jenkins.computer.temp_offline[{#DISPLAY_NAME}])=1 and length(last(/Jenkins by HTTP/jenkins.computer.offline.reason[{#DISPLAY_NAME}]))>0 | Info | Manual close: Yes |
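Both node trigger prototypes combine a numeric flag with a non-empty offline reason. The Python sketch below mirrors that logic. The field names `offline`, `temporarilyOffline` and `offlineCauseReason` come from the Jenkins computer API, while the conversion of the booleans to 1/0 and the sample node are assumptions made for illustration.

```python
def node_is_down(state: int, offline_reason: str) -> bool:
    """Mirror of state = 1 and length(offline reason) > 0."""
    return state == 1 and len(offline_reason) > 0


def node_temporarily_offline(temp_offline: int, offline_reason: str) -> bool:
    """Mirror of temp_offline = 1 and length(offline reason) > 0."""
    return temp_offline == 1 and len(offline_reason) > 0


# Hypothetical node record using field names from the Jenkins computer API.
node = {
    "displayName": "agent-1",
    "offline": True,
    "temporarilyOffline": False,
    "offlineCauseReason": "Connection was broken",
}

# Assumed preprocessing: booleans are stored as 1 (true) or 0 (false).
state = int(node["offline"])
temp_offline = int(node["temporarilyOffline"])
reason = node["offlineCauseReason"]

print(node_is_down(state, reason))
print(node_temporarily_offline(temp_offline, reason))
```

Because both expressions also require a non-empty reason, a node that is offline without any recorded cause would not raise either problem under this reading.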
Feedback
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help on the Zabbix forums.