Debugging this issue is also a terrible experience: nothing appears in the logs during the volume_stats update.
The service state is 'up'. You make an API call, such as attaching a volume, and the state becomes 'down'. The API gets a MessagingTimeout and nothing else.
I only understood the issue once I enabled debug logging and saw that cinder-volume was logging only "connecting to ceph (timeout=-1). _connect_to_rados" and nothing else.
The message appears every 1 to 10 seconds (I'm guessing the interval depends on the size of the volume).
I manually added some extra logging and found that cinder-volume was stuck looping inside _get_usage_info().
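
A minimal sketch of the kind of timing log that makes a stall like this visible. This is not the actual code added to cinder: log_duration, its label argument and the warn_after threshold are hypothetical names used only for illustration.

```python
import logging
import time
from contextlib import contextmanager

LOG = logging.getLogger(__name__)


@contextmanager
def log_duration(label, warn_after=5.0):
    """Log how long the wrapped block took (hypothetical helper).

    Logs at WARNING when the block ran for warn_after seconds or more,
    otherwise at DEBUG.
    """
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        level = logging.WARNING if elapsed >= warn_after else logging.DEBUG
        LOG.log(level, "%s took %.2fs", label, elapsed)
```

Wrapping each iteration of a slow loop in `with log_duration("usage info for one volume"):` would show in the log which step the time is going to, instead of only the repeated connection message.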