There is no useful information in dmesg or syslog unfortunately.
The failures are intermittent. We see the error when launching many instances at once from the same _base image.
The image cache base directory exists and nova can write to it:
root@compute:/var/lib/nova/instances# ls -l
total 256
drwxr-xr-x 2 nova nova 4096 Aug 17 14:09 02db8511-2f20-41da-bcc2-797a9bbbe63b
... snip ...
drwxr-xr-x 2 nova nova 4096 Aug 29 17:24 bab8ddbf-c483-4462-9273-755812d84903
drwxr-xr-x 2 nova nova 4096 Sep 7 13:33 _base
drwxr-xr-x 2 nova nova 4096 Sep 9 17:10 c3251e4f-4c0e-42d8-a039-78ed9263b46c
... snip
root@compute:/var/lib/nova/instances/_base# ls -la
total 34802256
drwxr-xr-x 2 nova nova 4096 Sep 7 13:33 .
drwxr-xr-x 65 nova nova 8192 Sep 9 17:10 ..
-rw-r--r-- 1 libvirt-qemu kvm 8589934592 Sep 8 12:50 21171f1738d671d6801abab7196e4a5460c57af9
-rw-r--r-- 1 libvirt-qemu kvm 16105807872 Sep 9 09:13 3e58771f795c5e889445b424cbce395a69bbfb08
... snip
The NFS mount is:
1.2.3.4:/data on /var/lib/nova/instances type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=5.6.7.8,local_lock=none,addr=1.2.3.4)
We can reproduce it outside of nova: create a file of a certain size inside the NFS export, then run the touch operation on it in one loop while a second loop copies it to some other location.
Every now and then we see the input/output error.
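For anyone who wants to try this, the two loops can be sketched roughly as below. This is only an illustration of the procedure described above, not the exact commands we ran: the file names, the default size, and the iteration count are assumptions, and os.utime / shutil.copyfile stand in for touch and cp.

```python
import os
import shutil
import threading
from pathlib import Path


def touch_loop(src: Path, iterations: int, errors: list) -> None:
    """Repeatedly update the timestamps of src (what `touch` does on an existing file)."""
    for _ in range(iterations):
        try:
            os.utime(src, None)
        except OSError as exc:
            errors.append(("touch", exc.errno, str(exc)))


def copy_loop(src: Path, dst: Path, iterations: int, errors: list) -> None:
    """Repeatedly copy src to dst while the other loop keeps touching src."""
    for _ in range(iterations):
        try:
            shutil.copyfile(src, dst)
        except OSError as exc:
            errors.append(("copy", exc.errno, str(exc)))


def reproduce(nfs_dir, scratch_dir, size_bytes=1024 ** 3, iterations=100) -> list:
    """Run both loops concurrently and return any OSErrors that were seen.

    nfs_dir is a directory on the NFS mount, scratch_dir is anywhere else.
    size_bytes and iterations are arbitrary placeholders, not tuned values.
    """
    src = Path(nfs_dir) / "repro_test_file"   # hypothetical file name
    dst = Path(scratch_dir) / "repro_copy"    # hypothetical file name
    with open(src, "wb") as handle:
        handle.truncate(size_bytes)           # sparse file of the requested size

    errors: list = []
    workers = [
        threading.Thread(target=touch_loop, args=(src, iterations, errors)),
        threading.Thread(target=copy_loop, args=(src, dst, iterations, errors)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return errors


if __name__ == "__main__":
    # Paths are examples; point the first one at the NFS mount under test.
    for operation, err_no, message in reproduce("/var/lib/nova/instances", "/tmp"):
        print(operation, err_no, message)
```

An input/output error would show up as an entry with errno 5 (EIO). The test file is created sparse to keep setup quick; if that turns out to matter, write real data into it instead.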