We found another environment that had several machines duplicated (one in "failed deployment" state and one deployed OK); the workaround used was:
0) Switch to the model where there are duplicated machines
$ juju switch SOME_MODEL
1) Identify the number(s) of the machine(s) to be *removed* (e.g. 33)
2) Set the harvest mode to none
$ juju model-config provisioner-harvest-mode=none
3) Remove the machine previously identified
$ juju remove-machine --force MACHINE_NUMBER
4) Monitor the progress in juju status and in the MAAS web UI
- juju status should stop displaying the machine that failed to deploy in MAAS
- the MAAS web UI should *not* show any change in the system; at this level juju shouldn't be making any changes
- Wait a few minutes before proceeding to the next step, to make sure any background task has completed
5) Repeat steps 3 and 4 for each duplicated machine marked as "failed deployment"
6) Restore the default value of the harvest mode
$ juju model-config --reset provisioner-harvest-mode
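The steps above can be consolidated into a small script. This is only a sketch: SOME_MODEL and the machine list are placeholders you must replace, the 120-second pause is an arbitrary stand-in for "wait a few minutes", and DRY_RUN=echo makes it print the commands instead of running them (clear the variable to execute for real):

```shell
#!/bin/sh
# Sketch of the workaround: remove duplicated "failed deployment" machines
# with harvesting disabled, so MAAS nodes are not released/wiped.
set -e

DRY_RUN=echo             # set to "" to actually run the juju commands
MODEL=SOME_MODEL         # placeholder: model containing the duplicates
FAILED_MACHINES="33"     # placeholder: space-separated machine numbers

# 0) Switch to the affected model
$DRY_RUN juju switch "$MODEL"

# 2) Disable harvesting so removals don't touch MAAS
$DRY_RUN juju model-config provisioner-harvest-mode=none

# 3-5) Remove each failed machine, pausing between removals so any
# background task can complete before the next one
for m in $FAILED_MACHINES; do
    $DRY_RUN juju remove-machine --force "$m"
    $DRY_RUN sleep 120
done

# 6) Restore the default harvest mode
$DRY_RUN juju model-config --reset provisioner-harvest-mode
```

Between removals, still verify in the MAAS web UI that nothing changed, as described in step 4.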
Luckily, we found this during a live session and asked the operator NOT to remove the machines in the failed state; an uninformed user could naively run "juju remove-machine --force X" and end up losing data. Given that risk of data loss, I think this bug should be marked as "Critical".