Comment 4 for bug 1915045

Jason Hobbs (jason-hobbs) wrote : Re: [Bug 1915045] Re: units stuck in insufficient peers even though there are peers

In general, we're running automated tests all the time.

Why isn't this relation data being logged? Is there a way to turn on
logging for it?

We can try to reproduce manually depending on time availability.

On Tue, Feb 9, 2021 at 5:15 AM Alex Kavanagh <email address hidden>
wrote:

> This is a strange one, to be sure. I suspect we'd need to have a poke
> at a live (failed) system to see what is going on with the relation data
> to get to this state. Is it possible to give us a nudge in OpenStack
> charms team when this next occurs so we can take a look please?
>
> --
> You received this bug notification because you are a member of Canonical
> Field High, which is subscribed to the bug report.
> https://bugs.launchpad.net/bugs/1915045
>
> Title:
> units stuck in insufficient peers even though there are peers
>
> Status in OpenStack hacluster charm:
> New
>
> Bug description:
> In an HA deploy, we have 3 units of hacluster-octavia, but 2 say there
> aren't enough peers and remain blocked forever:
>
> octavia/0*                      blocked      executing  0/lxd/8  10.244.40.104  9876/tcp  'shared-db' missing, 'amqp' missing, 'identity-service' incomplete, 'sdn-subordinate' missing, Awaiting end-user execution of `configure-resources` action to create required resources
> hacluster-octavia/0*            blocked      executing           10.244.40.104            Insufficient peer units for ha cluster (require 3)
> logrotated/21                   active       executing           10.244.40.104            (config-changed) Unit is ready.
> neutron-openvswitch-octavia/0*  maintenance  executing           10.244.40.104            (config-changed) Configuring ovs
> public-policy-routing/11        active       executing           10.244.40.104            (start) Unit is ready
> octavia/1                       blocked      executing  2/lxd/8  10.244.41.66   9876/tcp  'shared-db' incomplete, 'amqp' missing, 'identity-service' missing, 'sdn-subordinate' missing, Awaiting leader to create required resources
> hacluster-octavia/2             active       executing           10.244.41.66             (leader-settings-changed) Unit is ready and clustered
> logrotated/61                   active       executing           10.244.41.66             (config-changed) Unit is ready.
> neutron-openvswitch-octavia/2   maintenance  executing           10.244.41.66             (config-changed) Configuring ovs
> public-policy-routing/46        active       executing           10.244.41.66             (config-changed) Unit is ready
> octavia/2                       blocked      executing  4/lxd/9  10.244.41.26   9876/tcp  'shared-db' missing, 'amqp' missing, 'identity-service' incomplete, 'sdn-subordinate' missing, Awaiting leader to create required resources
> hacluster-octavia/1             blocked      executing           10.244.41.26             (config-changed) Insufficient peer units for ha cluster (require 3)
> logrotated/58                   active       executing           10.244.41.26             (config-changed) Unit is ready.
> neutron-openvswitch-octavia/1   maintenance  executing           10.244.41.26             (config-changed) Configuring ovs
> public-policy-routing/44        active       executing           10.244.41.26             (start) Unit is ready
>
> Example test run:
>
> https://solutions.qa.canonical.com/testruns/testRun/3675e02f-09b9-4695-9696-ae6ae7f4921d
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/charm-hacluster/+bug/1915045/+subscriptions
>