I looked at the logs. Four restores were started in the cinder backup log; three of them finished and one did not. Two of the four started at almost the same time, and of those, one finished and the other did not. The unfinished one did finish restoring the data and started to restore the metadata, but at that point there were connection problems with MySQL, so the metadata restore did not finish and the restore eventually timed out.
From the cinder backup log (http://logs.openstack.org/82/202782/15/gate/gate-tempest-dsvm-neutron-full/372a95f/logs/screen-c-bak.txt.gz#_2015-08-10_21_36_37_525), we can see that the backup restore started:
2015-08-10 21:34:27.541 INFO cinder.backup.manager [req-139ab0ae-d0ed-43d5-a254-19e895c9ff95 tempest-VolumesBackupsV2Test-998958099] Restore backup started, backup: bd79c0a3-5a76-4dc3-a774-d8ef515c46b1 volume: bd776a72-66c0-43a6-8d48-39f541634226.
2015-08-10 21:34:27.825 DEBUG cinder.backup.chunkeddriver [req-0a4871ba-550a-4435-8e15-07421d4f9eef tempest-VolumesBackupsV1Test-1030415370] starting restore of backup volume_8b22ff43-58ed-4148-9824-20db1efd21d2/20150810213315/az_nova_backup_bd79c0a3-5a76-4dc3-a774-d8ef515c46b1 container: volumebackups, to volume bd776a72-66c0-43a6-8d48-39f541634226, backup: bd79c0a3-5a76-4dc3-a774-d8ef515c46b1. restore /opt/stack/new/cinder/cinder/backup/chunkeddriver.py:648
Then the data restore finished:
2015-08-10 21:36:08.861 DEBUG cinder.backup.chunkeddriver [req-3629e4d4-a3b0-4f6f-9f30-7e6140e95e78 None] v1 volume backup restore of bd79c0a3-5a76-4dc3-a774-d8ef515c46b1 finished. _restore_v1 /opt/stack/new/cinder/cinder/backup/chunkeddriver.py:634
Then it started to restore the metadata:
2015-08-10 21:36:08.863 DEBUG cinder.backup.driver [req-3629e4d4-a3b0-4f6f-9f30-7e6140e95e78 None] Restoring volume base metadata _restore_vol_base_meta /opt/stack/new/cinder/cinder/backup/driver.py:168
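For context, the restore happens in two phases: the backed-up data chunks are written onto the volume first, and then the volume's base metadata is written back through the cinder database, which is why a MySQL connection problem can break this last step. Below is a rough, stand-alone sketch of that flow; all of the names are made up here, the real code being _restore_v1() in chunkeddriver.py and _restore_vol_base_meta() in driver.py:

import io

def restore_backup(chunks, metadata, volume_file, db_update):
    # Phase 1: data restore. Stream every backed-up chunk onto the volume;
    # this is the part that logged "v1 volume backup restore of ... finished."
    for chunk in chunks:
        volume_file.write(chunk)
    # Phase 2: metadata restore ("Restoring volume base metadata"). The
    # fields go back into the volume record through the database, so a lost
    # MySQL connection stalls the restore here, before the final
    # "Restore backup finished" message is ever logged.
    db_update(metadata)

# Toy usage: an in-memory "volume" and print() standing in for the DB update.
volume = io.BytesIO()
restore_backup([b"chunk-1", b"chunk-2"], {"display_name": "restored-vol"},
               volume, db_update=print)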
If the restore had finished, we would see "Restore backup finished, backup bd79c0a3-5a76-4dc3-a774-d8ef515c46b1 restored to volume: bd776a72-66c0-43a6-8d48-39f541634226." in the log. However, that message never appeared.
Instead, there is a "Lost connection to MySQL" failure at this time:
2015-08-10 21:36:37.420 ERROR oslo_messaging.rpc.dispatcher [req-1abbfe4f-97f2-4184-b598-522a1e0e2c19 None] Exception during message handling: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
Then after a few minutes, tempest timed out the restore request and sent a delete backup request:
2015-08-10 21:39:55.018 INFO cinder.backup.manager [req-c6888349-c6f3-4a1d-842f-a34df2161aa6 tempest-VolumesBackupsV2Test-998958099] Delete backup started, backup: bd79c0a3-5a76-4dc3-a774-d8ef515c46b1.
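For context, tempest decides the restore has failed by polling the volume status until a deadline passes; here is a minimal, stand-alone sketch of that poll-with-timeout pattern (the function name, default timeout, and statuses below are illustrative, not the actual tempest waiter):

import time

def wait_for_volume_status(get_status, wanted="available",
                           timeout=300, interval=1):
    # Poll until the volume reports the wanted status or the deadline passes.
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = get_status()
        if status == wanted:
            return status
        if status == "error":
            raise RuntimeError("volume went to error state")
        time.sleep(interval)
    # On timeout the test fails and its cleanup issues the backup delete,
    # which is the "Delete backup started" request seen above at 21:39:55.
    raise RuntimeError("volume never reached %r within %ss" % (wanted, timeout))

# Example run: a fake status source that becomes "available" on the third poll.
statuses = iter(["restoring-backup", "restoring-backup", "available"])
wait_for_volume_status(lambda: next(statuses), interval=0)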
In the cinder volume log (http://logs.openstack.org/82/202782/15/gate/gate-tempest-dsvm-neutron-full/372a95f/logs/screen-c-vol.txt.gz), there is a "Connection pool is full, discarding connection" warning around the same time, which matches the "Lost connection to MySQL" error in the cinder backup log:
2015-08-10 21:36:52.474 WARNING requests.packages.urllib3.connectionpool [req-bd798c22-6983-45e4-b0e7-aa5a8e1dffe4 tempest-VolumesV2ActionsTest-1525165338] Connection pool is full, discarding connection: 127.0.0.1
According to the MySQL documentation (https://dev.mysql.com/doc/refman/5.0/en/error-lost-connection.html), a lost connection usually indicates network problems:
Usually it indicates network connectivity trouble and you should check the condition of your network if this error occurs frequently. If the error message includes “during query,” this is probably the case you are experiencing.
So to make the restore tests more stable, we could look at the suggestions on that page, or increase max_pool_size and max_overflow in cinder.conf. By default, max_pool_size is 5 and max_overflow is 10. For example:
# Changing the following options max_pool_size and max_overflow
# to allow more db connections
[database]
# Maximum number of SQL connections to keep open in a pool
# (integer value)
max_pool_size=20
# If set, use this value for max_overflow with sqlalchemy
# (integer value)
max_overflow=30
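For reference on what these options control: they are handed through to SQLAlchemy's connection pool (max_pool_size maps to SQLAlchemy's pool_size), so the service can keep up to pool_size idle connections and open up to max_overflow extra ones under load. A rough stand-alone sketch of the equivalent direct SQLAlchemy call, with a placeholder connection URL and credentials (requires pymysql installed):

from sqlalchemy import create_engine

# Placeholder URL. With pool_size=20 and max_overflow=30 the engine opens
# at most 20 + 30 = 50 connections, instead of the default 5 + 10 = 15.
engine = create_engine(
    "mysql+pymysql://cinder:secret@127.0.0.1/cinder",
    pool_size=20,      # cinder.conf [database] max_pool_size
    max_overflow=30,   # cinder.conf [database] max_overflow
    pool_timeout=30,   # seconds to wait for a free connection (SQLAlchemy default)
)

With the defaults, a burst of concurrent requests can exhaust the pool, so raising both limits gives the backup and volume services more headroom.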