nova_compute mounts the NFS backend again every time a VM is launched

Bug #1783978 reported by sklgromek
This bug affects 11 people
Affects: kolla-ansible
Status: In Progress
Importance: Medium
Assigned to: Radosław Piliszek
Milestone: —

Bug Description

I use nova-compute with the cinder NFS backend, and each time I launch a new instance or restart the nova_compute container, the NFS share is mounted again even if it is already mounted.

root@compute-1:~# mount | grep nfs | wc -l
67
root@compute-1:~# docker restart nova_compute
nova_compute
root@compute-1:~# mount | grep nfs | wc -l
131
root@compute-1:~# docker restart nova_compute
nova_compute
root@compute-1:~# mount | grep nfs | wc -l
259
root@compute-1:~# docker restart nova_compute
nova_compute
root@compute-1:~# mount | grep nfs | wc -l
515

I tried to reproduce the problem on devstack, but everything looks fine there, so I think it is a problem with kolla.

I am using the stable/queens release with ubuntu-source images, but we had the same problem with centos-source images.
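
A minimal way to quantify the duplication per mount target on an affected compute host (a sketch using standard shell tools only, nothing kolla-specific assumed):

```
# count how many times each NFS mount target appears on the host
mount | grep nfs | awk '{print $3}' | sort | uniq -c | sort -rn
```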

Tags: cinder volume
Revision history for this message
Michal Nasiadka (mnasiadka) wrote :

Is it still a bug? Can you reproduce it with latest kolla-ansible/kolla code?

Changed in kolla:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for kolla because there has been no activity for 60 days.]

Changed in kolla:
status: Incomplete → Expired
Revision history for this message
Konstantinos Mouzakitis (mouza8) wrote :

Hello all! I'm facing the same issue, using the stable/stein release with centos-binary images. The cinder backends related to this are the ones that use a shared bind mount: http://paste.openstack.org/show/789538/. I've checked this with an NFS backend as well as a Quobyte one, and every time one of the nova containers is restarted the mounts double (plus one for me):

[root@node02 ~]# mount | grep nova -c
7
[root@node02 ~]# docker restart nova_compute
nova_compute
[root@node02 ~]# mount | grep nova -c
15
[root@node02 ~]# docker restart nova_compute
nova_compute
[root@node02 ~]# mount | grep nova -c
31
[root@node02 ~]#
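
For reference, one way to see which host paths are bind-mounted into the container and with what propagation mode is docker inspect (a sketch; the container name matches the one used in this report):

```
docker inspect nova_compute --format \
  '{{range .Mounts}}{{.Source}} -> {{.Destination}} ({{.Type}}, propagation: {{.Propagation}}){{"\n"}}{{end}}'
```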

Any ideas are very much appreciated!

Thanks a lot!

Changed in kolla:
status: Expired → New
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Could you actually show the mounts? (possibly replacing sensitive stuff with random unique values if any)

Changed in kolla:
status: New → Incomplete
Revision history for this message
Mariusz Karpiarz (mkarpiarz) wrote :

```
# mount | grep nfs
192.168.17.11:/kolla_nfs on /var/lib/nova/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/nova/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/docker/volumes/nova_compute/_data/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/docker/volumes/nova_compute/_data/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/docker/volumes/nova_compute/_data/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/nova/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/nova/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/docker/volumes/nova_compute/_data/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
# mount | grep -c nfs
8
# docker restart nova_compute
nova_compute
# mount | grep -c nfs
16
# mount | grep nfs
192.168.17.11:/kolla_nfs on /var/lib/nova/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/nova/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/docker/volumes/nova_compute/_data/mnt/03084b2d0f988f513a2652c556e3ad4d type nfs4 (rw,relatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.17.5,local_lock=none,addr=192.168.17.11)
192.168.17.11:/kolla_nfs on /var/lib/docker/volumes/nova_compute/_data/mnt/03084b2d...
```

Revision history for this message
Konstantinos Mouzakitis (mouza8) wrote :

Hello all. Just wanted to update you on some further investigation I did on this. So, the problem isn't the shared bind mount used by the backends mentioned above. It's the nested bind mount! /var/lib/nova is the destination of a bind mount in the nova containers, and with the above cinder backends /var/lib/nova/mnt is also added as a separate bind mount. This causes the mounts to be doubled every time the nova containers (compute, libvirt, ssh) are restarted.

A solution I've found is to change the directory where nova mounts the cinder volumes, by changing quobyte_mount_point_base or nfs_mount_point_base depending on the backend. That directory also needs to be configured as a new bind mount for the three nova containers in kolla-ansible/ansible/roles/nova/defaults/main.yml. After a reconfigure, everything works fine. Then you can just umount the old stale mounts on the host.
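
A rough sketch of that workaround for an NFS backend; the override location, the [libvirt] section and the new mount base path are assumptions to adapt to your own deployment:

```
# 1) point nova-compute at a different mount base via a nova.conf override
#    (assumption: nfs_mount_point_base lives in the [libvirt] section)
mkdir -p /etc/kolla/config/nova
cat >> /etc/kolla/config/nova/nova.conf <<'EOF'
[libvirt]
nfs_mount_point_base = /var/lib/nova-cinder-mnt
EOF

# 2) add a matching ":shared" bind mount for that directory to the volume lists
#    of nova-compute, nova-libvirt and nova-ssh in
#    kolla-ansible/ansible/roles/nova/defaults/main.yml (or override those variables)

# 3) reconfigure nova, then lazily unmount the old stale mounts on the host
kolla-ansible -i "$INVENTORY" reconfigure --tags nova
umount -l /var/lib/nova/mnt/* || true
```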

I'll be looking to patch this by adding the above logic to the nova defaults file when NFS or Quobyte is enabled.

Hope this helps anyone that is facing the mounts problem!

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for kolla because there has been no activity for 60 days.]

Changed in kolla:
status: Incomplete → Expired
Changed in kolla:
status: Expired → Confirmed
importance: Undecided → Medium
Changed in kolla:
assignee: nobody → Radosław Piliszek (yoctozepto)
Changed in kolla-ansible:
status: New → Confirmed
importance: Undecided → Medium
no longer affects: kolla
Changed in kolla-ansible:
assignee: nobody → Radosław Piliszek (yoctozepto)
Revision history for this message
Kristina Jasser (marvin01) wrote :

same problem here - will someone try to find a solution at some point?

ERDEM AĞBAHCA (erdemag)
information type: Public → Public Security
information type: Public Security → Public
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Erdem reached out to me about this issue, which is currently affecting him, and we are debugging it together so that I can propose the best fix.

Changed in kolla-ansible:
status: Confirmed → In Progress
Revision history for this message
Radosław Piliszek (yoctozepto) wrote (last edit ):

I have created a minimal reproducer: https://gist.github.com/yoctozepto/e6fdc2789297fdfdff4fd45fe64c9cb9

And found that the issue was already observed, albeit for a different reason: https://github.com/moby/moby/issues/35323

It seems Docker treats cases (2) and (4) differently: for some reason it preserves the "shared" submounts and then forcibly mounts them again on container start, which causes the exponential growth effect.
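
On an affected host the propagation flags of the duplicated submounts can be checked, e.g. with findmnt (a sketch; the columns are standard findmnt output):

```
# show target, filesystem type and propagation flag of the nova/NFS mounts
findmnt -o TARGET,FSTYPE,PROPAGATION | grep -E 'nova|nfs'
```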

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Please test this patch https://review.opendev.org/c/openstack/kolla-ansible/+/825514 on an already deployed environment (apply and redeploy). Does it allow the VMs to continue to run? Does it allow creating new VMs? Can pre-existing VMs be manipulated (rebooted, migrated, shelved and unshelved)? I need this info for the release note.
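
For anyone testing, a sketch of the checks being asked for, using the standard openstack CLI (server and volume names are placeholders):

```
# existing VM keeps working after applying the patch and redeploying
openstack server reboot demo-vm
openstack server shelve demo-vm
openstack server unshelve demo-vm
openstack server migrate demo-vm

# new VMs with attached volumes can still be created
openstack server create --image cirros --flavor m1.tiny --network demo-net demo-vm-2
openstack volume create --size 1 demo-vol
openstack server add volume demo-vm-2 demo-vol
```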

Revision history for this message
ERDEM AĞBAHCA (erdemag) wrote :

Hello. Thank you for the quick solution. This patch solves the mount problem.

To answer your questions:
1 - It only allows instances that do not have volumes attached. You need to stop instances with attached volumes to prevent data loss due to the volume mount changes, or at least to get rid of the already-filled mountpoint cap on the physical compute node.
2 - After successful redeployment of the nova-libvirt container you can manipulate pre-existing VMs.

However:
Info: if nova-libvirt is not running, you can do the steps below via the Horizon web GUI.
If you have running instances on the compute node and the nova-libvirt container cannot start due to this bug, here is what you should do to prevent data loss (a short shell sketch follows the list).

1 - Make sure to gracefully shut down the instances that have volumes attached on the compute node, over SSH or by connecting via VNC on the physical_ip:5XXX port.
2 - Check whether the instances you shut down still have processes running on the compute node. FreeBSD instances need killing even after a graceful shutdown. Make sure to kill the qemu processes corresponding to those instances.
3 - umount /var/lib/nova/mnt on the compute node. It can still report as busy, but since you have stopped all volume-attaching instances you don't have to worry about data loss and you can lazily umount with "umount -l".
4 - Apply the patch (this won't affect other running instances on the same compute node).
5 - Start the instances you previously shut down and check whether there is any problem with the volumes (there shouldn't be any).
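
A hedged shell sketch of the steps above (the qemu process name can differ between distributions):

```
# 2) after gracefully shutting down the affected instances from inside the guests,
#    check for leftover qemu processes and kill the ones belonging to those instances
pgrep -af qemu
# kill <pid-of-a-leftover-qemu-process-for-a-stopped-instance>

# 3) lazily unmount the stale cinder mount point on the compute host
umount -l /var/lib/nova/mnt

# 4) apply the patch (redeploy), then start the instances again and verify the volumes
```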

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla-ansible (master)

Change abandoned by "Radosław Piliszek <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/825514
Reason: not pursuing

Revision history for this message
Angelos Kolaitis (aggkolaitis) wrote (last edit ):

I was also affected by this issue after enabling Cinder with an NFS backend, and the proposed fix did solve the bug for me. Are there any plans to move forward with it? Is there any way to patch this bug without risking breaking existing deployments?

Revision history for this message
Antony Messerli (antonym) wrote (last edit ):

We are seeing this behavior as well, where the mounts slowly increase until the machine appears to become unstable. We are on 2023.1 using Cinder with an NFS backend.
