Charm stuck in waiting after rejoining the cluster
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL InnoDB Cluster Charm | Triaged | Medium | Unassigned |
Bug Description
Running focal/ussuri.
Was testing availability zone failure and the node with one of the mysql-innodb-
After the nodes came up, all services restored fairly quickly except mysql-innodb-
It says:
Cluster is inaccessible from this instance. Please check logs for details.
Checking the logs I see:
RuntimeError: Dba.get_cluster: Group replication does not seem to be active in instance '10.103.223.3:3306'
10.103.223.3 is the IP of mysql-innodb-
I tried the juju action (from a healthy unit) to rejoin the cluster:
juju run-action mysql-innodb-
But that failed saying:
The group_replicati
The mysql logs say there may be corruption in the relay log. It also says it set the member to read-only and then left the group (well before me running the rejoin-instance action).
I suspect running the reboot-
I'll try removing and re-adding:
juju run-action mysql-innodb-
juju run-action mysql-innodb-
force=true was necessary because the node is marked ERROR.
These actions succeeded, but now the juju status for that unit says:
Instance not yet configured for clustering
After connecting to mysql in the bad unit (using the pw from leader-get mysql.passwd) I executed:
stop group_replication;
reset replica;
Afterwards, running the add-instance action worked and the cluster-status action shows all three nodes joined with the new one RO, as expected.
However, juju status still shows it's waiting with:
Instance not yet configured for clustering
I've tried manually running hooks, restarting mysql and juju agent on the suspect node, and the status still shows waiting.
Checking the logs, neither mysql nor juju shows any errors and the unit appears to be functioning normally, so this seems to be a charm bug rather than the actual state of the cluster.
Adding back an instance after it has been forcibly removed causes the charm state to go out of sync with the cluster. The charm currently does not monitor the cluster state properly, so after adding the instance back you need to run the following command to set the charm state to what it should be:
juju run -u mysql-innodb-cluster/leader -- leader-set cluster-instance-configured-192-168-0-32=True
Replace 192-168-0-32 with the IP of the instance you added back, written with hyphens instead of dots.
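To avoid typos when writing the key by hand, the leader-set command quoted above can be built from the instance IP. This is only a minimal sketch, not part of the charm: the function names `configured_key` and `leader_set_command` are hypothetical, the default application name `mysql-innodb-cluster` is assumed to match your deployment, and only IPv4 addresses are handled.

```python
import ipaddress
from typing import List

# Key prefix copied from the leader-set workaround in this report.
KEY_PREFIX = "cluster-instance-configured-"


def configured_key(ip: str) -> str:
    """Build the leader-settings key for an IPv4 address (dots become hyphens)."""
    # Validate first; a malformed address raises ValueError.
    addr = ipaddress.IPv4Address(ip)
    return KEY_PREFIX + str(addr).replace(".", "-")


def leader_set_command(ip: str, app: str = "mysql-innodb-cluster") -> List[str]:
    """Return the juju command as an argument list.

    `app` is an assumption: use the application name shown in your juju status.
    """
    return [
        "juju", "run", "-u", f"{app}/leader", "--",
        "leader-set", f"{configured_key(ip)}=True",
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(leader_set_command("192.168.0.32")))
```

The sketch prints the command rather than running it, so it can be checked before it touches the leader settings.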