Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/gitlab_http?at=release/7.0
GitLab by HTTP
Overview
This template is designed for monitoring GitLab with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
The template GitLab by HTTP collects metrics with an HTTP agent from the GitLab /-/metrics endpoint.
See https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html.
Requirements
Zabbix version: 7.0 and higher.
Tested versions
This template has been tested on:
- GitLab 13.5.3 EE
Configuration
Zabbix should be configured according to the instructions in the Templates out of the box section.
Setup
This template works with self-hosted GitLab instances. Internal service metrics are collected from the GitLab /-/metrics endpoint.
To access the metrics, the following two methods are available:
- Explicitly allow the monitoring instance IP address in the GitLab whitelist configuration.
- Get a token from the GitLab Admin -> Monitoring -> Health check page: http://your.gitlab.address/admin/health_check. Use this token in the macro {$GITLAB.HEALTH.TOKEN} as a variable path, like: ?token=your_token.
Remember to change the macro {$GITLAB.URL}. Also, see the Macros section for a list of macros used to set trigger values.
NOTE. Some metrics may not be collected depending on your GitLab instance version and configuration. See GitLab's documentation for further information about its metric collection.
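As a rough illustration of how the two macros fit together, the sketch below builds the metrics URL in Python. It assumes that the request URL is the value of {$GITLAB.URL}, followed by /-/metrics, followed by the contents of {$GITLAB.HEALTH.TOKEN}. The function name build_metrics_url is hypothetical and is not part of the template.

```python
def build_metrics_url(gitlab_url: str, health_token: str = "") -> str:
    """Join the base URL, the /-/metrics path, and an optional token path.

    Assumption: health_token is either empty (IP whitelist method) or a
    query string such as "?token=your_token" (token method).
    """
    # Strip a trailing slash so the result never contains "//-/metrics".
    return gitlab_url.rstrip("/") + "/-/metrics" + health_token


print(build_metrics_url("http://localhost"))
print(build_metrics_url("http://your.gitlab.address/", "?token=your_token"))
```

With the whitelist method the token macro can stay empty, which is why the second argument defaults to an empty string here.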
Macros used
Name | Description | Default |
---|---|---|
{$GITLAB.URL} | URL of a GitLab instance. | http://localhost |
{$GITLAB.HEALTH.TOKEN} | The token path for the GitLab health check. Example: ?token=your_token | |
{$GITLAB.UNICORN.UTILIZATION.MAX.WARN} | The maximum percentage of Unicorn worker utilization for a trigger expression. | 90 |
{$GITLAB.PUMA.UTILIZATION.MAX.WARN} | The maximum percentage of Puma thread utilization for a trigger expression. | 90 |
{$GITLAB.HTTP.FAIL.MAX.WARN} | The maximum number of HTTP request failures for a trigger expression. | 2 |
{$GITLAB.REDIS.FAIL.MAX.WARN} | The maximum number of Redis client exceptions for a trigger expression. | 2 |
{$GITLAB.UNICORN.QUEUE.MAX.WARN} | The maximum number of Unicorn queued requests for a trigger expression. | 1 |
{$GITLAB.PUMA.QUEUE.MAX.WARN} | The maximum number of Puma queued requests for a trigger expression. | 1 |
{$GITLAB.OPEN.FDS.MAX.WARN} | The maximum percentage of used file descriptors for a trigger expression. | 90 |
Items
Name | Description | Type | Key and additional info |
---|---|---|---|
Get instance metrics | | HTTP agent | gitlab.get_metrics |
Instance readiness check | The readiness probe checks whether the GitLab instance is ready to accept traffic via Rails Controllers. | HTTP agent | gitlab.readiness |
Application server status | Checks whether the application server is running. This probe is used to verify that Rails Controllers are not deadlocked due to multi-threading. | HTTP agent | gitlab.liveness |
Version | Version of the GitLab instance. | Dependent item | gitlab.deployments.version |
Ruby: First process start time | Minimum UNIX timestamp of Ruby process start times. | Dependent item | gitlab.ruby.process_start_time_seconds.first |
Ruby: Last process start time | Maximum UNIX timestamp of Ruby process start times. | Dependent item | gitlab.ruby.process_start_time_seconds.last |
User logins, total | Counter of how many users have logged in since GitLab was started or restarted. | Dependent item | gitlab.user_session_logins_total |
User CAPTCHA logins failed, total | Counter of failed CAPTCHA attempts during login. | Dependent item | gitlab.failed_login_captcha_total |
User CAPTCHA logins, total | Counter of successful CAPTCHA attempts during login. | Dependent item | gitlab.successful_login_captcha_total |
Upload file does not exist | Number of times an upload record could not find its file. | Dependent item | gitlab.upload_file_does_not_exist |
Pipelines: Processing events, total | Total number of pipeline processing events. | Dependent item | gitlab.pipeline.processing_events_total |
Pipelines: Created, total | Counter of pipelines created. | Dependent item | gitlab.pipeline.created_total |
Pipelines: Auto DevOps pipelines, total | Counter of completed Auto DevOps pipelines. | Dependent item | gitlab.pipeline.auto_devops_completed.total |
Pipelines: Auto DevOps pipelines, failed | Counter of completed Auto DevOps pipelines with status "failed". | Dependent item | gitlab.pipeline.auto_devops_completed_total.failed |
Pipelines: CI/CD creation duration | The sum of the time in seconds it takes to create a CI/CD pipeline. | Dependent item | gitlab.pipeline.pipeline_creation |
Pipelines: Pipelines: CI/CD creation count | The count of the time it takes to create a CI/CD pipeline. | Dependent item | gitlab.pipeline.pipeline_creation.count |
Database: Connection pool, busy | Connections to the main database in use where the owner is still alive. | Dependent item | gitlab.database.connection_pool_busy |
Database: Connection pool, current | Current connections to the main database in the pool. | Dependent item | gitlab.database.connection_pool_connections |
Database: Connection pool, dead | Connections to the main database in use where the owner is not alive. | Dependent item | gitlab.database.connection_pool_dead |
Database: Connection pool, idle | Connections to the main database not in use. | Dependent item | gitlab.database.connection_pool_idle |
Database: Connection pool, size | Total connection pool capacity for the main database. | Dependent item | gitlab.database.connection_pool_size |
Database: Connection pool, waiting | Threads currently waiting on this queue. | Dependent item | gitlab.database.connection_pool_waiting |
Redis: Client requests rate, queues | Number of Redis client requests per second. (Instance: queues) | Dependent item | gitlab.redis.client_requests.queues.rate |
Redis: Client requests rate, cache | Number of Redis client requests per second. (Instance: cache) | Dependent item | gitlab.redis.client_requests.cache.rate |
Redis: Client requests rate, shared_state | Number of Redis client requests per second. (Instance: shared_state) | Dependent item | gitlab.redis.client_requests.shared_state.rate |
Redis: Client exceptions rate, queues | Number of Redis client exceptions per second. (Instance: queues) | Dependent item | gitlab.redis.client_exceptions.queues.rate |
Redis: Client exceptions rate, cache | Number of Redis client exceptions per second. (Instance: cache) | Dependent item | gitlab.redis.client_exceptions.cache.rate |
Redis: client exceptions rate, shared_state | Number of Redis client exceptions per second. (Instance: shared_state) | Dependent item | gitlab.redis.client_exceptions.shared_state.rate |
Cache: Misses rate, total | The cache read miss count. | Dependent item | gitlab.cache.misses_total.rate |
Cache: Operations rate, total | The count of cache operations. | Dependent item | gitlab.cache.operations_total.rate |
Ruby: CPU usage per second | Average CPU time utilization in seconds. | Dependent item | gitlab.ruby.process_cpu_seconds.rate |
Ruby: Running_threads | Number of running Ruby threads. | Dependent item | gitlab.ruby.threads_running |
Ruby: File descriptors opened, avg | Average number of opened file descriptors. | Dependent item | gitlab.ruby.file_descriptors.avg |
Ruby: File descriptors opened, max | Maximum number of opened file descriptors. | Dependent item | gitlab.ruby.file_descriptors.max |
Ruby: File descriptors opened, min | Minimum number of opened file descriptors. | Dependent item | gitlab.ruby.file_descriptors.min |
Ruby: File descriptors, max | Maximum number of open file descriptors per process. | Dependent item | gitlab.ruby.process_max_fds |
Ruby: RSS memory, avg | Average RSS memory usage in bytes. | Dependent item | gitlab.ruby.process_resident_memory_bytes.avg |
Ruby: RSS memory, min | Minimum RSS memory usage in bytes. | Dependent item | gitlab.ruby.process_resident_memory_bytes.min |
Ruby: RSS memory, max | Maximum RSS memory usage in bytes. | Dependent item | gitlab.ruby.process_resident_memory_bytes.max |
HTTP requests rate, total | Number of requests received into the system. | Dependent item | gitlab.http.requests.rate |
HTTP requests rate, 5xx | Number of failed requests with HTTP code 5xx. | Dependent item | gitlab.http.requests.5xx.rate |
HTTP requests rate, 4xx | Number of failed requests with HTTP code 4xx. | Dependent item | gitlab.http.requests.4xx.rate |
Transactions per second | Transactions per second (gitlab_transaction_* metrics). | Dependent item | gitlab.transactions.rate |
Triggers
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Gitlab instance is not able to accept traffic | | last(/GitLab by HTTP/gitlab.readiness)=0 | High | Depends on: |
Liveness check was failed | The application server is not running or Rails Controllers are deadlocked. | last(/GitLab by HTTP/gitlab.liveness)=0 | High | |
Version has changed | The GitLab version has changed. Acknowledge to close the problem manually. | last(/GitLab by HTTP/gitlab.deployments.version,#1)<>last(/GitLab by HTTP/gitlab.deployments.version,#2) and length(last(/GitLab by HTTP/gitlab.deployments.version))>0 | Info | Manual close: Yes |
Too many Redis queues client exceptions | Too many Redis client exceptions during the requests to Redis instance queues. | min(/GitLab by HTTP/gitlab.redis.client_exceptions.queues.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} | Warning | |
Too many Redis cache client exceptions | Too many Redis client exceptions during the requests to Redis instance cache. | min(/GitLab by HTTP/gitlab.redis.client_exceptions.cache.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} | Warning | |
Too many Redis shared_state client exceptions | Too many Redis client exceptions during the requests to Redis instance shared_state. | min(/GitLab by HTTP/gitlab.redis.client_exceptions.shared_state.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} | Warning | |
Failed to fetch info data | Zabbix has not received any metrics data for the last 30 minutes. | nodata(/GitLab by HTTP/gitlab.ruby.threads_running,30m)=1 | Warning | Manual close: Yes Depends on: |
Current number of open files is too high | | min(/GitLab by HTTP/gitlab.ruby.file_descriptors.max,5m)/last(/GitLab by HTTP/gitlab.ruby.process_max_fds)*100>{$GITLAB.OPEN.FDS.MAX.WARN} | Warning | |
Too many HTTP requests failures | Too many requests failed on GitLab instance with 5xx HTTP code. | min(/GitLab by HTTP/gitlab.http.requests.5xx.rate,5m)>{$GITLAB.HTTP.FAIL.MAX.WARN} | Warning | |
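To make the arithmetic behind the "Current number of open files is too high" expression concrete, here is a minimal Python sketch of the same calculation. The function name open_fds_too_high and the sample numbers are hypothetical. The list of samples stands in for the values Zabbix would see in the 5-minute window, and the default threshold mirrors the default of {$GITLAB.OPEN.FDS.MAX.WARN}.

```python
def open_fds_too_high(fd_max_samples, process_max_fds, threshold_percent=90.0):
    """Mimic min(file_descriptors.max,5m) / last(process_max_fds) * 100 > threshold."""
    # Without samples or a valid limit there is nothing to evaluate.
    if not fd_max_samples or process_max_fds <= 0:
        return False
    used_percent = min(fd_max_samples) / process_max_fds * 100
    return used_percent > threshold_percent


# Every sample in the window is above 90% of the limit: the condition holds.
print(open_fds_too_high([950, 980, 990], 1024))
# One low sample pulls the minimum down: the condition does not hold.
print(open_fds_too_high([950, 400, 990], 1024))
```

Because the expression takes the minimum over the window, the condition holds only if usage stays above the threshold for the whole period, so a short spike does not raise a problem.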
LLD rule Unicorn metrics discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Unicorn metrics discovery | Discovery of Unicorn-specific metrics when Unicorn is used. | HTTP agent | gitlab.unicorn.discovery |
Item prototypes for Unicorn metrics discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Unicorn: Workers | The number of Unicorn workers. | Dependent item | gitlab.unicorn.unicorn_workers[{#SINGLETON}] |
Unicorn: Active connections | The number of active Unicorn connections. | Dependent item | gitlab.unicorn.active_connections[{#SINGLETON}] |
Unicorn: Queued connections | The number of queued Unicorn connections. | Dependent item | gitlab.unicorn.queued_connections[{#SINGLETON}] |
Trigger prototypes for Unicorn metrics discovery
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Unicorn worker utilization is too high | | min(/GitLab by HTTP/gitlab.unicorn.active_connections[{#SINGLETON}],5m)/last(/GitLab by HTTP/gitlab.unicorn.unicorn_workers[{#SINGLETON}])*100>{$GITLAB.UNICORN.UTILIZATION.MAX.WARN} | Warning | |
Unicorn is queueing requests | | min(/GitLab by HTTP/gitlab.unicorn.queued_connections[{#SINGLETON}],5m)>{$GITLAB.UNICORN.QUEUE.MAX.WARN} | Warning | |
LLD rule Puma metrics discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Puma metrics discovery | Discovery of Puma-specific metrics when Puma is used. | HTTP agent | gitlab.puma.discovery |
Item prototypes for Puma metrics discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Active connections | Number of Puma threads processing a request. | Dependent item | gitlab.puma.active_connections[{#SINGLETON}] |
Workers | Total number of Puma workers. | Dependent item | gitlab.puma.workers[{#SINGLETON}] |
Running workers | The number of booted Puma workers. | Dependent item | gitlab.puma.running_workers[{#SINGLETON}] |
Stale workers | The number of old Puma workers. | Dependent item | gitlab.puma.stale_workers[{#SINGLETON}] |
Running threads | The number of running Puma threads. | Dependent item | gitlab.puma.running[{#SINGLETON}] |
Queued connections | The number of connections in that Puma worker's "todo" set waiting for a worker thread. | Dependent item | gitlab.puma.queued_connections[{#SINGLETON}] |
Pool capacity | The number of requests the Puma worker is capable of taking right now. | Dependent item | gitlab.puma.pool_capacity[{#SINGLETON}] |
Max threads | The maximum number of Puma worker threads. | Dependent item | gitlab.puma.max_threads[{#SINGLETON}] |
Idle threads | The number of spawned Puma threads which are not processing a request. | Dependent item | gitlab.puma.idle_threads[{#SINGLETON}] |
Killer terminations, total | The number of workers terminated by PumaWorkerKiller. | Dependent item | gitlab.puma.killer_terminations_total[{#SINGLETON}] |
Trigger prototypes for Puma metrics discovery
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Puma instance thread utilization is too high | | min(/GitLab by HTTP/gitlab.puma.active_connections[{#SINGLETON}],5m)/last(/GitLab by HTTP/gitlab.puma.max_threads[{#SINGLETON}])*100>{$GITLAB.PUMA.UTILIZATION.MAX.WARN} | Warning | |
Puma is queueing requests | | min(/GitLab by HTTP/gitlab.puma.queued_connections[{#SINGLETON}],15m)>{$GITLAB.PUMA.QUEUE.MAX.WARN} | Warning | |
Feedback
Please report any issues with the template at https://support.zabbix.com.
You can also provide feedback, discuss the template, or ask for help at the Zabbix forums.