OpenStack Charm Guide

Bug #2008509
Comment #1

Trent Lloyd (lathiat) wrote on 2023-05-09: Re: Juju should restrict upgrade-series prepare if a subordinate charm doesn't support the new series

I hit this issue in production during a series upgrade of a Yoga OpenStack cloud from focal to jammy. The culprit for us was the hacluster charm: the 2.0.3 charm supports only focal, while the 2.4 charm supports both focal and jammy. The issue still occurs on Juju 2.9.42.

It's not clear from the original description, but "upgrade-series prepare" does fail with the following error after you type yes to confirm the upgrade:
"ERROR charm "hacluster" does not support jammy, force not used"

However the pre-upgrade-series hooks run in the background anyway, even though the juju client exits after that error. Then the hacluster unit goes into the failed state.

In our debugging, db.machineUpgradeSeriesLocks was empty. You can re-run prepare with --force, which then creates a lock, but the units remain stuck in a failed state. It seems this still leaves the object in a state that the transaction will not allow, perhaps because the hooks already ran.

= Workaround =
If you only attempted the "prepare" on a single unit, you can force-remove that unit, scale the application back out, upgrade the hacluster charm and then proceed with the series upgrade. I was not able to find a way to get the broken unit out of the broken state. (The application name "hacluster" in the last command is an assumption; use the name of the hacluster application in your own deployment.)

juju remove-unit keystone/0 --force
juju add-unit keystone
juju upgrade-charm hacluster --channel 2.4/stable
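
Before running "prepare" on any further machines, it may be worth checking by hand that every charm on the machine lists the target series. The following is only a rough sketch of such a pre-flight check: the function names check_charms and supports_series are hypothetical, and the input is a hand-written table of "charm series..." lines rather than anything Juju outputs. The message format simply mirrors the error quoted above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Juju): given a hand-maintained
# table of charms and the series each one supports, report the charms that
# do not list the target series.

# supports_series TARGET SERIES...
# Returns 0 if TARGET appears among the remaining arguments, 1 otherwise.
supports_series() {
    target=$1
    shift
    for s in "$@"; do
        [ "$s" = "$target" ] && return 0
    done
    return 1
}

# check_charms TARGET
# Reads lines of "charm-name series series ..." on stdin, prints one line
# per charm that does not list TARGET, and returns 1 if any were found.
check_charms() {
    target=$1
    rc=0
    while read -r charm series_list; do
        # Skip blank lines.
        [ -z "$charm" ] && continue
        # $series_list is intentionally unquoted so it splits into words.
        if ! supports_series "$target" $series_list; then
            echo "charm \"$charm\" does not support $target"
            rc=1
        fi
    done
    return $rc
}
```

For example, feeding it the two lines "keystone focal jammy" and "hacluster focal" with a target of jammy flags hacluster and returns a non-zero status. The hacluster line reflects the 2.0.3 charm described above; the keystone line is an assumption made for illustration.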

= Reproducer =
You can deploy a simple bundle with keystone and hacluster to reproduce the issue. I have attached the bundle as keystone-focal-yoga.yaml

juju add-model keystone1
juju deploy ./keystone-focal-yoga.yaml
juju upgrade-series 0 prepare jammy

= Expectations =

- This is a critical issue that needs prioritising for a new 2.9.43 release.

- However, we also need to determine whether there is an easy way to repair a unit that is already in this state. People are very likely to get stuck, and removing and re-adding the broken unit is error-prone in practice and best avoided if possible.