sda: Disk read/write request responses are too high

  • YaPaP
    Junior Member
    • Jun 2020
    • 11

    #1

    sda: Disk read/write request responses are too high

    Hello,

    I added 3 hosts to the system, but I keep getting this error many times:

    sda: Disk read/write request responses are too high (read > 20 ms for 15m or write > 20 ms for 15m)

    I disabled it in the template (I do not remember exactly where), but I am still getting this error from all 3 hosts.

    Can I increase the threshold values, or delete this warning completely?
  • badfiles
    Junior Member
    • Jun 2020
    • 5

    #2
    Yes, I noticed Zabbix 5.0.1 with MySQL creates insane disk write traffic (10-20 MB/s) for just 3 standard Linux hosts. The same version with PostgreSQL does no such thing.
    I hope it's a bug.


    • YaPaP
      Junior Member
      • Jun 2020
      • 11

      #3
      Actually this problem makes Zabbix unusable; every few minutes I get the same error and resolved messages.

      Can I replace MySQL with PostgreSQL, or should I install from scratch?


      • badfiles
        Junior Member
        • Jun 2020
        • 5

        #4
        Currently I am investigating the insane write traffic with MariaDB. I believe it's more likely a MariaDB or OS environment issue, but I am still not sure. My PostgreSQL installation is in quite a different environment.


        • badfiles
          Junior Member
          • Jun 2020
          • 5

          #5
          To trigger huge write traffic you only need to change innodb_log_file_size, which is not supposed to be changed naively:
          Code:
          Please note however that [URL="https://mysqldatabaseadministration.blogspot.com/2007/01/increase-innodblogfilesize-proper-way.html"]you cannot simply change[/URL] the value of this variable. You need to shutdown the server, remove the InnoDB log files, set the new value in my.cnf, start the server, then check the error logs if everything went fine.
          Zabbix (even the web GUI alone) makes a lot of commits and transactions, which trigger innodb_log writes. Somehow, an improperly resized innodb_log is written to about 10 times more than a properly created one.
          Resizing the log properly, I managed to drop disk utilization dramatically:

          [Screenshot: disk utilization graph, 26-06-2020]
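The linked procedure can be sketched roughly like this (a sketch only, for MySQL < 8.0.30 or MariaDB; the paths and service name are assumptions for a typical Debian-style install, and only the my.cnf edit below actually runs against a local example file):

```shell
# Sketch of the "proper" redo log resize quoted above. Paths, sizes, and the
# service name are assumptions -- adjust for your distro. The server steps
# are left as comments; only the my.cnf edit is demonstrated locally.

# 1. Stop the server cleanly so the redo log can be discarded safely:
#      systemctl stop mariadb
# 2. Move the old redo log files out of the way (do not delete them yet):
#      mv /var/lib/mysql/ib_logfile* /root/ib_logfile.bak/
# 3. Set the new size under [mysqld] in my.cnf:
CNF=./my.cnf.example                      # stand-in for /etc/mysql/my.cnf
printf '[mysqld]\ninnodb_log_file_size = 48M\n' > "$CNF"
sed -i 's/^innodb_log_file_size.*/innodb_log_file_size = 512M/' "$CNF"
grep innodb_log_file_size "$CNF"          # -> innodb_log_file_size = 512M
# 4. Start the server and check the error log for a clean log rebuild:
#      systemctl start mariadb
#      journalctl -u mariadb | tail
```

On MySQL 8.0.30 and later this whole dance is obsolete: innodb_redo_log_capacity replaces the variable and can be changed online.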
          Last edited by badfiles; 26-06-2020, 12:09.


          • Guest

            #6
            Originally posted by badfiles
            To which value did you set this on your setup? Which value would you recommend if a small 50-device Zabbix instance is running together with MariaDB on a Debian 10 "buster" amd64 server with 8 GiB of RAM installed?


            • jackinthebox
              Junior Member
              • Jun 2022
              • 2

              #7
              I know this is a bit of an older post, but I have been getting hundreds of emails from my storage servers that use HDDs, with the default 20 ms read/write response-time thresholds.

              After a lot of googling I still did not find an answer on how to change the values, but I finally figured it out. Below is how you can change the response-time thresholds.
              I hope this helps a lot of people like me :-)


              How to change the macro values

              The trigger -- Disk read/write request responses are too high (read > 20 ms for 15m or write > 20 ms for 15m) -- is typically part of the "Template OS xxxx" templates.
              So first you need to figure out where it came from (which template):

              Just go to Configuration -> Hosts and see which template you are using.

              In my case it was a Linux server.

              [Screenshot: Configuration -> Hosts, showing the linked template]

              The next piece of information we need is which macro has to be changed. Go to Monitoring -> Problems, click on the "Problem" text (which may look like "sdb: Disk read/write request responses are too high (read > 20 ms for 15m or write > 20 ms for 15m)"), and in the pop-up click Configuration.

              You should see something like this. Note that the two macros are {$VFS.DEV.READ.AWAIT.WARN} and {$VFS.DEV.WRITE.AWAIT.WARN}.
              Also note that you are viewing the configuration of a single server, not all servers that use the template.
              We want to change the template, not just one server.

              [Screenshot: trigger configuration showing the two macros]

              Now that you know which template it's coming from, go to Configuration -> Templates -> "Template OS Linux by Zabbix Agent" (use your actual template).


              Once the template is open, click "Macros" -> "Inherited and template macros".

              Find the two macros, {$VFS.DEV.READ.AWAIT.WARN} and {$VFS.DEV.WRITE.AWAIT.WARN}, and click Change (the screenshot shows Remove because I had already changed them).
              Note that I changed the values from 20/20 to 35/50 for read/write.

              Finally, scroll to the bottom of the page and click Update.

              The new values take effect immediately.


              [Screenshot: template macros page with the updated values]
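As an alternative to clicking through the GUI, the same macro change can be scripted against the Zabbix JSON-RPC API. This is only a sketch: usermacro.update is a real API method, but the URL, token, and hostmacroid below are placeholders you would first look up (e.g. with usermacro.get for your template).

```shell
# Sketch: raise {$VFS.DEV.READ.AWAIT.WARN} on a template via the Zabbix API.
# URL, token, and hostmacroid are placeholders (assumptions), not real values.
ZABBIX_URL="https://zabbix.example.com/api_jsonrpc.php"
PAYLOAD='{
  "jsonrpc": "2.0",
  "method": "usermacro.update",
  "params": { "hostmacroid": "123", "value": "35" },
  "auth": "YOUR_API_TOKEN",
  "id": 1
}'
# Offline sanity check that the payload names the intended method:
echo "$PAYLOAD" | grep -c '"usermacro.update"'   # -> 1
# The actual call (needs a live server and a valid token):
# curl -s -X POST "$ZABBIX_URL" -H 'Content-Type: application/json' -d "$PAYLOAD"
```

Note that newer Zabbix versions prefer an Authorization: Bearer header over the "auth" field, so check the API docs for your version.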


              • acovington7920
                Junior Member
                • Oct 2022
                • 2

                #8
                Thank you!


                • PaoZap
                  Junior Member
                  • Dec 2022
                  • 3

                  #9
                  It worked for me. Tnx


                  • touch.sreng
                    Junior Member
                    • Feb 2024
                    • 1

                    #10
                    Hi jackinthebox,

                    It works for me after changing the macro values, but I would like to ask about the values we changed:

                    {$VFS.DEV.READ.AWAIT.WARN} from 20 => 35
                    {$VFS.DEV.WRITE.AWAIT.WARN} from 20 => 50

                    Why would we change them, and are these values suitable for both SSD and HDD disk monitoring?


                    • HJ_SK
                      Junior Member
                      • Feb 2024
                      • 2

                      #11
                      Thanks for this info; on 6.4 it's a little different, but it works now.
                      But... I think it is better to do this:
                      Data collection → Templates → search for "Linux by Zabbix agent" → full clone the template →
                      name the clone e.g. "Linux RPi by Zabbix agent" → change these macros' values → save the changes.
                      Go to: Monitoring → Hosts → pick the host of your choice and click its name →
                      Configuration → Host → Templates → search for "Linux RPi by Zabbix agent" and select it.
                      Old one (Linux by Zabbix agent) → unlink and clear.
                      Click Update.
                      If you had edited the original template instead, any official Zabbix update to it would put you back in the old situation; with a clone you keep your changes.
                      On the other hand, you then have to check what they changed in their template and why.


                      • Vivek K V
                        Junior Member
                        • Jan 2024
                        • 1

                        #12
                        Hello,
                        I have been getting this error for quite a long time. I referred to this thread and changed the threshold values, but it doesn't help since I have an average 40 ms write time.

                        I changed the innodb_redo_log_capacity value from 500 MB to 2G as per badfiles' suggestion in this thread on 08-06-2022, but I can see no drop in write response time. Can someone help with what is causing the high write response time, and how we can reduce it instead of increasing the alert threshold?

                        Note: innodb_log_file_size and innodb_log_files_in_group are deprecated as of MySQL 8.0.30; they are superseded by innodb_redo_log_capacity.
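For reference, the new variable from that note is dynamic on MySQL >= 8.0.30, so the resize itself needs no restart. A sketch (the mysql client calls are commented out because they need a live server; only the byte arithmetic runs here):

```shell
# innodb_redo_log_capacity is dynamic on MySQL >= 8.0.30, so 2 GiB can be
# set online. Compute the byte value first:
CAP=$(( 2 * 1024 * 1024 * 1024 ))
echo "$CAP"   # -> 2147483648
# Apply and verify on a live server (commented out here):
# mysql -e "SET GLOBAL innodb_redo_log_capacity = $CAP;"
# mysql -e "SELECT @@innodb_redo_log_capacity;"
```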


                        • jackinthebox
                          Junior Member
                          • Jun 2022
                          • 2

                          #13
                          The suggested change only raises the alert threshold; it does not reduce the actual response time. If you are seeing 40 ms response times and only want to see the error when the time is > 75 ms, you can update accordingly. All this does is reduce how many alert emails you get.

                          After my initial changes, I increased my HDD read/write thresholds to 35/75.
                          Note that this is for an HDD RAID.

                          For MySQL-specific issues this is not the right place. Check ChatGPT :-)
                          Below is what I got from it...

                          High write response times in MySQL despite increasing the innodb_redo_log_capacity could be caused by several factors unrelated to the redo log capacity itself. Here’s how you can troubleshoot and improve performance:
                          1. Assess Disk I/O Performance
                          • Problem: Slow disk subsystems (e.g., HDDs instead of SSDs) can bottleneck write operations.
                          • Solution:
                            • Check the IOPS (Input/Output Operations Per Second) and latency of your storage system using tools like iostat or vmstat.
                            • Consider moving to faster disks (e.g., NVMe or SSD) or tuning your RAID setup if applicable.
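If iostat is not installed, the same r_await/w_await figures the Zabbix trigger watches can be computed straight from /proc/diskstats (a Linux-only sketch; the sample line at the end is synthetic so the arithmetic can be checked anywhere):

```shell
# Compute average read/write latency (await) for a device directly from
# /proc/diskstats. Fields after the device name: $4 = reads completed,
# $7 = ms spent reading, $8 = writes completed, $11 = ms spent writing.
AWAIT_AWK='$3 == dev { printf "r_await=%.1f w_await=%.1f\n", ($4?$7/$4:0), ($8?$11/$8:0) }'

# Live check on a Linux box with an sda device:
# awk -v dev=sda "$AWAIT_AWK" /proc/diskstats

# Same program on a synthetic sample: 100 reads taking 2000 ms total,
# 50 writes taking 1500 ms total -> 20 ms and 30 ms averages.
printf '8 0 sda 100 0 0 2000 50 0 0 1500\n' | awk -v dev=sda "$AWAIT_AWK"
# -> r_await=20.0 w_await=30.0
```

Note these are cumulative-since-boot averages; iostat's interval figures are what the trigger actually tracks, so use this only as a rough first look.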

                          2. Buffer Pool Tuning
                          • Problem: Insufficient buffer pool size can lead to excessive disk writes and lower performance.
                          • Solution:
                            • Ensure innodb_buffer_pool_size is large enough to hold the working set of your data.
                            • Monitor the Innodb_buffer_pool_pages_free and Innodb_buffer_pool_pages_dirty metrics to determine if the buffer pool is effectively sized.

                          3. Redo Log Flushing Behavior
                          • Problem: Overhead from frequent redo log flushes.
                          • Solution:
                            • Check and adjust innodb_flush_log_at_trx_commit. For example:
                              • 1: Flushes the log to disk on each commit (default, safest for ACID compliance but can be slower).
                              • 2: Flushes log to disk every second, reducing I/O at the cost of possible data loss.
                              • 0: Flushes log to disk only when the log buffer is full (fastest but least durable).
                            • Test with a less aggressive setting if acceptable for your workload.

                          4. Check the Write Workload
                          • Problem: The workload may involve high write contention or unoptimized queries.
                          • Solution:
                            • Use the Performance Schema or the sys schema to analyze and identify slow queries or write-heavy transactions.
                            • Optimize queries, indexes, and schema design to reduce write pressure.

                          5. File System and Mount Options
                          • Problem: Suboptimal file system settings can degrade performance.
                          • Solution:
                            • If using Linux, ensure the innodb_data_home_dir and innodb_log_group_home_dir are on a file system with appropriate mount options (noatime, nodiratime).
                            • Use a file system optimized for database workloads (e.g., XFS or ext4 with journaling disabled for database files).

                          6. Monitor and Optimize Transactions
                          • Problem: Long-running or uncommitted transactions can cause high write latency.
                          • Solution:
                            • Monitor the information_schema.INNODB_TRX table for long-running transactions.
                            • Ensure transactions are committed or rolled back promptly.

                          7. Review Configuration Parameters
                          • Problem: Other MySQL configuration parameters may need adjustment.
                          • Solution:
                            • Increase the innodb_io_capacity and innodb_io_capacity_max settings to allow for more background flushing.
                            • Set innodb_flush_neighbors to 0 for SSDs to avoid unnecessary I/O operations.
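Items 2, 3, and 7 above would land in a my.cnf fragment roughly like this (illustrative values only, not a recommendation; in particular, innodb_flush_log_at_trx_commit = 2 accepts up to about one second of committed-transaction loss on a crash in exchange for far fewer log flushes):

```ini
# /etc/mysql/conf.d/zabbix-tuning.cnf -- example values (assumptions),
# to be sized against your own RAM and storage.
[mysqld]
innodb_buffer_pool_size        = 2G     ; item 2: hold the working set
innodb_flush_log_at_trx_commit = 2      ; item 3: flush the log once per second
innodb_io_capacity             = 1000   ; item 7: allow more background flushing
innodb_io_capacity_max         = 2000
innodb_flush_neighbors         = 0      ; item 7: skip neighbor flushing on SSD
```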

                          8. Assess Hardware Limitations
                          • Problem: The server hardware may be underpowered for the workload.
                          • Solution:
                            • Monitor CPU, memory, and disk usage during high write times.
                            • Upgrade server resources if utilization is consistently maxed out.

                          9. Reduce Lock Contention
                          • Problem: High lock contention can cause delays in writes.
                          • Solution:
                            • Use SHOW ENGINE INNODB STATUS to check for lock contention.
                            • Optimize transactions to minimize the duration and scope of locks.

                          Tools for Analysis
                          • MySQL Workbench Performance Dashboard for visual insights.
                          • pt-query-digest (Percona Toolkit) to analyze slow query logs.
                          • SHOW GLOBAL STATUS LIKE 'innodb%' to monitor InnoDB metrics.

                          By systematically addressing these areas, you can identify and mitigate the causes of high write response times. If the issue persists, provide additional workload and environment details for deeper troubleshooting.

