Juju creates extra/redundant VMs when using MAAS vmhost as backing cloud
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Expired | Undecided | Unassigned |
MAAS | Expired | Undecided | Unassigned |
Bug Description
Some kind of race condition in Juju's requests to MAAS causes duplicate VMs to be created
when using MAAS as the backing cloud with either a KVM or an LXC VM host configured.
Steps to reproduce:
------
1. Deploy MAAS 3.x
2. Add an LXC VM host/hypervisor to MAAS so that it will be used to spin up VMs
3. Bootstrap Juju 2.x/stable controller using MAAS as the backing cloud
4. Deploy multiple VMs using the LXC VM host, for example via:
$ juju add-machine -n3
Expected result:
Exactly the number of machines requested by the model or the -n argument is spun up.
Actual result:
Redundant/erroneous VMs are spun up (chief-wolf, bright-ewe, super-moose):
+------
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------
| bright-ewe | STOPPED | | | VIRTUAL-MACHINE | 0 |
+------
| chief-wolf | STOPPED | | | VIRTUAL-MACHINE | 0 |
+------
| deep-goose | RUNNING | 172.16.66.195 (eth0) | | VIRTUAL-MACHINE | 0 |
+------
| hip-goat | RUNNING | 172.16.66.219 (eth0) | | VIRTUAL-MACHINE | 0 |
+------
| steady-cat | RUNNING | 172.16.66.151 (eth0) | | VIRTUAL-MACHINE | 0 |
+------
| super-moose | STOPPED | | | VIRTUAL-MACHINE | 0 |
+------
| valued-drum | RUNNING | 172.16.66.207 (eth0) | | VIRTUAL-MACHINE | 0 |
+------
I haven't narrowed down what exactly causes this -- is Juju not getting a reply from MAAS
fast enough and sending duplicate create requests?
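To make the duplicate-request hypothesis concrete, here is a minimal Python simulation. It is only a sketch of the suspected failure mode, not how Juju or MAAS actually behave: `FakeVMHost`, `request_vm` and the `reply_lost_on` parameter are hypothetical names invented for illustration. The host creates a VM on every request, and the caller retries blindly whenever it does not see a reply.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeVMHost:
    """Hypothetical stand-in for the hypervisor; not a MAAS or LXD API."""
    vms: list = field(default_factory=list)
    _ids: count = field(default_factory=count)

    def create_vm(self) -> str:
        # The host always does the work, whether or not the caller hears back.
        name = f"vm-{next(self._ids)}"
        self.vms.append(name)
        return name


def request_vm(host, reply_lost_on=frozenset(), max_attempts=3):
    """Request one VM, retrying when the reply is 'lost' (simulated timeout)."""
    for attempt in range(max_attempts):
        name = host.create_vm()
        if attempt in reply_lost_on:
            continue  # caller never saw the reply, so it asks again
        return name
    return None


host = FakeVMHost()
# Three requests (like `-n3`); the first and third lose their first reply.
tracked = [request_vm(host, lost) for lost in ({0}, set(), {0})]
orphans = [vm for vm in host.vms if vm not in tracked]
print(f"tracked={tracked} created={len(host.vms)} orphans={orphans}")
```

If something like this is what happens, the usual remedy is to make the create request idempotent (for example by passing a caller-chosen name or token) or to reconcile against the hypervisor before retrying.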
tags: | added: kvm maas-provider |
Changed in juju: | |
status: | New → Incomplete |
Just an addendum -- it also seems that once these extraneous instances are created,
neither Juju nor MAAS is aware of the VMs in its database, but the VMs still show up in lxc list output as shown in the bug description.
It almost seems like MAAS gives LXC the go-ahead to spin up the VMs, then gives up on the request and never checks with the hypervisor whether LXC went ahead and created the instances (albeit in a stopped state).
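For anyone trying to confirm the same symptom, a rough way to spot the leftovers is to compare what the hypervisor reports against the hostnames Juju/MAAS know about. The sketch below assumes the instance list comes from `lxc list --format json` (entries carrying `name` and `type` fields) and that the known hostnames are collected separately from MAAS or Juju; `untracked_vms` is a hypothetical helper name, and the sample data simply mirrors the table in the bug description.

```python
import json
import subprocess


def untracked_vms(lxd_instances, known_hostnames):
    """Return sorted names of LXD VMs that are absent from known_hostnames."""
    known = set(known_hostnames)
    return sorted(
        inst["name"]
        for inst in lxd_instances
        if inst.get("type") == "virtual-machine" and inst["name"] not in known
    )


def lxd_instances_from_cli():
    """Read instances from the local LXD; assumes `lxc` is on PATH."""
    out = subprocess.run(
        ["lxc", "list", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Sample data taken from the `lxc list` table in the bug description.
sample = [
    {"name": n, "type": "virtual-machine"}
    for n in ("bright-ewe", "chief-wolf", "deep-goose", "hip-goat",
              "steady-cat", "super-moose", "valued-drum")
]
# Assumption: these four are the machines MAAS/Juju actually track.
known = ["deep-goose", "hip-goat", "steady-cat", "valued-drum"]
print(untracked_vms(sample, known))
```

On a real VM host, `lxd_instances_from_cli()` would replace `sample`, and `known` would be filled in from whatever MAAS or Juju reports for that host.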