upgrading does not update 'openstack-release-version' and installs wrong dependencies
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Gnocchi Charm | In Progress | Undecided | Unassigned |
OpenStack AODH Charm | In Progress | Undecided | Unassigned |
OpenStack Designate Charm | In Progress | Undecided | Unassigned |
OpenStack Manila Charm | In Progress | High | Alex Kavanagh |
charms.openstack | In Progress | High | Alex Kavanagh |
Bug Description
There is a property in the sqlite .unit-state.db file named "charmers.openstack-release-version".
Unfortunately, upgrading the manila openstack release through the charm (from queens to rocky, or rocky to stein, and so on) does not update this property. This results in the following reproducible issues:
1) upgrading from queens to rocky changes the dependencies from py2 to py3. Therefore, due to the above-mentioned issue, both sets of dependencies end up being installed:
$ dpkg --list | grep pymysql
ii python-pymysql 0.8.0-1 all Pure-Python MySQL driver - Python 2.x
ii python3-pymysql 0.8.0-1 all Pure-Python MySQL Driver - Python 3.x
This issue is not immediately detectable since there are no charm/hook errors produced by that anomaly.
2) upgrading from bionic-ussuri to focal-ussuri, having previously upgraded from queens, causes the following errors:
unit-manila-4: 11:00:56 WARNING unit.manila/
unit-manila-4: 11:00:57 ERROR juju.worker.
However, the above errors are not triggered immediately after the upgrade completes. They are only triggered when a config-changed hook invokes the charm's install function, which may happen for various reasons such as updating the ssl certs.
Similar errors will certainly arise in the future regardless of whether the deployment started on queens, for example when dependencies change again or are no longer available in newer ubuntu versions such as jammy.
The property should be updated every time the openstack version of manila is upgraded through the charm.
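A minimal sketch of what that update amounts to, assuming charmhelpers' unitdata API and the key name mentioned above (the function name and call site are illustrative only, not the charm's actual code):

from charmhelpers.core import unitdata

OPENSTACK_RELEASE_KEY = 'charmers.openstack-release-version'

def record_openstack_release(release):
    # Persist the release the unit has just been upgraded to, e.g. 'rocky'.
    kv = unitdata.kv()
    kv.set(OPENSTACK_RELEASE_KEY, release)
    # Outside a hook scope the value only reaches .unit-state.db when
    # flush() is called; inside a hook it is committed at the end of a
    # successful run.
    kv.flush()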
Steps to reproduce the sqlite inconsistency:
1) deploy any version of manila
2) upgrade it
3) check the charmers.openstack-release-version property in the .unit-state.db file; it still holds the originally deployed release
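For step 3, a minimal sketch of reading the stored value straight from the state database (the path assumes unit manila/0 and the kv table layout that charmhelpers' unitdata uses by default; adjust the unit name as needed):

import json
import sqlite3

# Assumed location of the unit's state database; adjust the unit name.
DB = '/var/lib/juju/agents/unit-manila-0/charm/.unit-state.db'

conn = sqlite3.connect(DB)
row = conn.execute("SELECT data FROM kv WHERE key = ?",
                   ('charmers.openstack-release-version',)).fetchone()
# Values are stored as JSON strings by charmhelpers' unitdata.
print(json.loads(row[0]) if row else 'key not found')
conn.close()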
Steps to reproduce the upgrade errors:
1) deploy manila on queens
2) upgrade to rocky
3) perform a ssl cert update
4) run dpkg --list | grep pymysql in the manila unit
5) upgrade further to ussuri, and then to focal
6) update the ssl certs again, thereby hitting the hook error
Changed in charm-manila:
assignee: nobody → Rodrigo Barbieri (rodrigo-barbieri2010)
tags: added: sts
Changed in charms.openstack:
status: New → Triaged
importance: Undecided → High
Changed in charms.openstack:
status: Fix Released → In Progress
Changed in charm-manila:
status: Incomplete → In Progress
Changed in charm-designate:
status: New → In Progress
Changed in charm-aodh:
status: New → In Progress
Changed in charm-gnocchi:
status: New → In Progress
After a significant amount of investigation I've found out several things about this bug:
1) the unitdata.set() that happens at [1] is only actually written to the file at the end of a successful hook/action execution (see the sketch after this list).
2) In my deployment, running the action openstack-upgrade with --wait shows that the action actually fails, despite having upgraded the packages successfully. It fails at [2] and does not even get to perform a db migration sync.
3) Diving deeper into [2], at [3] it gets a list of relation interfaces to process and render config files based on. When running 3 manila units, the cluster interface is present with class relations.openstack-ha.peers.OpenstackHAPeers. When running only 1 manila unit, the cluster interface is not present, and the upgrade does not fail (for me, in my deployment).
4) Diving deeper at [3], at [4] it processes the interfaces; notice an interesting "except TypeError" there, more on this later. The constructor invoked through the variable class instantiation eventually reaches [5], where it iterates through the interfaces, and at [6] it hits the problem: it tries to set the attribute on the relation adapters class, the attribute being 'cluster' and the value being the class OpenstackHAPeers.
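Regarding point 1 above: charmhelpers' unitdata wraps each hook/action in a sqlite transaction, so a set() only survives if the scope exits cleanly. A self-contained sketch of that behaviour (the key, value and simulated failure are purely illustrative):

from charmhelpers.core import unitdata

kv = unitdata.kv()
try:
    with kv.hook_scope('openstack-upgrade'):
        kv.set('charmers.openstack-release-version', 'rocky')
        # Simulate the action failing later on, as it does at [2]:
        raise AttributeError("can't set attribute")
except AttributeError:
    pass

# The scope exited with an exception, so the transaction was rolled back
# and the value set above never reached .unit-state.db.
print(kv.get('charmers.openstack-release-version'))  # None on a fresh db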
When it fails, there are absolutely no errors in the logs. The action run with --wait finishes with the following fields:
message: can't set attribute
status: failed
Those are easily overlooked due to the other stdout printed, such as packages installed during the upgrade. Fortunately this is an action that can be repeated over and over until successful.
That message "can't set attribute" is actually an AttributeError (not a TypeError). By adding an extra try/except block I can capture it and confirm it. No traceback information though (if I did things correctly). Also, adding the AttributeError to the outer try/except definition does not solve the problem because it is just invoked again and fails the same way.
So I added extra logging such as:
hookenv.log("TEST LOG: {} {} {} {} {}".format( relation, adapter_ name,adapter, self,self. __dict_ _))
just above [6] and it printed the following:
TEST DATA: <relations.openstack-ha.peers.OpenstackHAPeers object at 0x7fea21c85cf8> cluster <charms_openstack.adapters.PeerHARelationAdapter object at 0x7fea21c9c908> <charm.openstack.manila.ManilaRelationAdapters object at 0x7fea21c85780> {'_charm_instance_weakref': <weakref at 0x7fea21c7a3b8; to 'ManilaCharmRocky' at 0x7fea21c85ba8>, '_relations': {'options', 'amqp'}, 'options': <charms_openstack.adapters.DefaultConfigurationAdapter object at 0x7fea21c9c3c8>, '_adapters': {'amqp': <class 'charm.openstack.manila.TransportURLAdapter'>, 'shared_db': <class 'charms_openstack.adapters.DatabaseRelationAdapter'>, 'cluster': <class 'charms_openstack.adapters.PeerHARelationAdapter'>, 'coordinator_memcached': <class 'charms_openstack.adapters.MemcacheRelationAdapter'>}, 'amqp': <charm.openstack.manila.TransportURLAdapter object at 0x7fea21c9cb00>}
I was unsure whether this was correct or not, so I compared to the placement charm which does not fail. It prints the following:
TEST DATA: <relations.openstack-ha.peers.OpenstackHAPeers o...