---Problem Description---
In a virtual machine, during MySQL performance tests with sysbench, IO operations freeze and the virtual disk stops responding. The MySQL data lives on a virtual drive backed by the host's local NVMe, attached to the VM as a raw virtio-block device. The test runs smoothly for a few minutes; then IO freezes, and any attempt to read or write the virtual drive blocks indefinitely. Once the problem occurs, every read operation on the affected drive (e.g. ls, cat, etc.) also waits forever.
---Host Hardware---
CPU: AMD EPYC 7302P 16-Core Processor (32 threads)
RAM: 128 GB
OS Drive: Toshiba KXG60ZNV256G M.2 NVMe PCI-E SSD (256 GB)
Data Drive: Samsung PM983 MZQLB960HAJR-00007 U.2 (960 GB)
---Host Software---
OS: Ubuntu 22.04 LTS
Kernel: 5.15.0-27-generic
Qemu: 1:6.2+dfsg-2ubuntu6
Libvirt: 8.0.0-1ubuntu7
---VM Hardware---
vCPU: <vcpu placement='static'>8</vcpu>
CPU Mode: <cpu mode='host-passthrough' check='none' migratable='on'/>
RAM: 64 GB
OS Type: <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
OS Drive (64 GB):
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none' io='native' discard='unmap'/>
<target dev='vda' bus='virtio'/>
Block Data Drive:
<disk type="block" device="disk">
<driver name="qemu" type="raw" cache="none" io="native" discard="unmap"/>
<target dev="vdb" bus="virtio"/>
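For context, a complete data-drive definition would also carry a <source> element; a sketch is below. The host device path /dev/nvme1n1 is an assumption taken from the dd example in the notes at the end, not from the excerpt itself:

```xml
<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native" discard="unmap"/>
  <!-- host device path assumed from the dd example in the notes -->
  <source dev="/dev/nvme1n1"/>
  <target dev="vdb" bus="virtio"/>
</disk>
```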
---VM Software & Configuration---
OS: Ubuntu 22.04 LTS (minimized)
Kernel: 5.15.0-27-generic
Swap: disabled
OS Drive: /dev/vda2; file-system: ext4; mount-options: defaults; mount-point: /
Data Drive: /dev/vdb
MySQL: 8.0.28-0ubuntu4
Sysbench: 1.0.20+ds-2
---Prepare the VM---
1. Install Ubuntu 22.04 LTS (minimized) as VM OS
2. Boot the VM & log-in as root
3. apt-get install mysql-server mysql-common sysbench apparmor-utils
4. systemctl disable --now mysql.service
5. aa-complain /usr/sbin/mysqld
6. systemctl restart apparmor
---Reproduction---
1. Reboot the VM & log-in as root
2. mkdir -p /data
3. mkfs.ext4 /dev/vdb
4. mount /dev/vdb /data
5. mkdir /data/mysql
6. mkdir /var/run/mysqld
7. /usr/sbin/mysqld --no-defaults --datadir=/data/mysql --lc-messages-dir=/usr/share/mysql/english --log-error --max_connections=256 --socket=/var/run/mysqld/mysqld.sock --table_open_cache=512 --tmpdir=/var/tmp --innodb_buffer_pool_size=1024M --innodb_data_file_path=ibdata1:32M:autoextend --innodb_data_home_dir=/data/mysql --innodb_doublewrite=0 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_lock_wait_timeout=50 --innodb_log_buffer_size=16M --innodb_log_file_size=256M --innodb_log_group_home_dir=/data/mysql --innodb_max_dirty_pages_pct=80 --innodb_thread_concurrency=0 --user=root --initialize-insecure
8. /usr/sbin/mysqld --no-defaults --datadir=/data/mysql --lc-messages-dir=/usr/share/mysql/english --log-error --max_connections=256 --socket=/var/run/mysqld/mysqld.sock --table_open_cache=512 --tmpdir=/var/tmp --innodb_buffer_pool_size=1024M --innodb_data_file_path=ibdata1:32M:autoextend --innodb_data_home_dir=/data/mysql --innodb_doublewrite=0 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_lock_wait_timeout=50 --innodb_log_buffer_size=16M --innodb_log_file_size=256M --innodb_log_group_home_dir=/data/mysql --innodb_max_dirty_pages_pct=80 --innodb_thread_concurrency=0 --user=root &
9. echo 'status' | mysql -uroot # verify that the MySQL server is up
10. echo 'drop database test1m' | mysql -uroot
11. echo 'create database test1m' | mysql -uroot
12. /usr/share/sysbench/oltp_read_write.lua --threads=10 --table-size=20000000 --events=0 --time=900 --mysql-user=root --tables=10 --delete_inserts=10 --index_updates=10 --non_index_updates=10 --db-ps-mode=disable --report-interval=1 --db-driver=mysql --mysql-db=test1m --max-requests=0 --rand-seed=303 prepare
13. /usr/share/sysbench/oltp_read_write.lua --threads=6 --table-size=20000000 --events=0 --time=900 --mysql-user=root --tables=10 --delete_inserts=10 --index_updates=10 --non_index_updates=10 --db-ps-mode=disable --report-interval=1 --db-driver=mysql --mysql-db=test1m --max-requests=0 --rand-seed=303 run
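When the freeze hits, it helps to spot the stalled interval and capture guest state immediately. A minimal sketch is below; the log path is an assumption (tee the sysbench output from the run step into it), and the D-state listing simply shows the tasks blocked behind the frozen virtio-blk queue:

```shell
#!/bin/sh
# Scan a sysbench report log for the first stalled interval
# ("tps: 0.00") and, if one is found, list guest tasks stuck in
# uninterruptible sleep (D state) -- the readers/writers waiting
# on the unresponsive virtual drive.
# Log path is an assumption, not part of the report above.

log="${1:-/tmp/sysbench.log}"

first_stall() {
    # Print the first report line whose tps counter reads 0.00.
    grep -m1 'tps: 0.00' "$1"
}

if [ -f "$log" ] && first_stall "$log"; then
    echo '--- tasks in D state ---'
    ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /D/'
fi
```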
---Resulting Output---
...
[ 620s ] thds: 6 tps: 327.00 qps: 18348.00 (r/w/o: 4578.00/13116.00/654.00) lat (ms,95%): 30.81 err/s: 0.00 reconn/s: 0.00
[ 621s ] thds: 6 tps: 320.00 qps: 17930.85 (r/w/o: 4479.96/12810.89/639.99) lat (ms,95%): 39.65 err/s: 0.00 reconn/s: 0.00
[ 622s ] thds: 6 tps: 317.00 qps: 17670.96 (r/w/o: 4432.99/12603.97/634.00) lat (ms,95%): 30.81 err/s: 0.00 reconn/s: 0.00
[ 623s ] thds: 6 tps: 299.83 qps: 16896.41 (r/w/o: 4202.61/12094.14/599.66) lat (ms,95%): 25.28 err/s: 0.00 reconn/s: 0.00
[ 624s ] thds: 6 tps: 0.00 qps: 6.00 (r/w/o: 0.00/6.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 625s ] thds: 6 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 626s ] thds: 6 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 627s ] thds: 6 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
...
---Expecting to happen---
The run should keep reporting non-zero throughput until it finishes, i.e. no lines with "tps: 0.00 qps: 0.00" like the last four in the example above.
---Additional Notes---
1. This does not happen on every run; some test iterations complete successfully.
2. The same happens with larger numbers of sysbench threads (e.g. 8, 16, 24, 32) too.
3. The problem does not occur if the io policy of the data drive is changed from io="native" to io="io_uring" (verified over at least 7 hours of continuous testing).
4. While IO operations in the VM are frozen, the NVMe device still responds to requests from the host (e.g. dd if=/dev/nvme1n1 of=/dev/null bs=512 count=1 iflag=direct).
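For clarity, the workaround from note 3 is a one-attribute change in the data drive's <driver> line; a sketch (all other attributes stay as in the excerpt above):

```xml
<disk type="block" device="disk">
  <!-- io="io_uring" instead of io="native"; everything else unchanged -->
  <driver name="qemu" type="raw" cache="none" io="io_uring" discard="unmap"/>
  <target dev="vdb" bus="virtio"/>
</disk>
```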
Please find attached the libvirt XML configuration of the example VM.
Best regards,
Nikolay Tenev