Migrating Juju models with graylog results in timeout during the quiescing phase
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Graylog Charm | New | Undecided | Unassigned |
Bug Description
Migrating large Juju models with graylog VMs results in a timeout during the quiescing phase.
The workaround is to wait for the model to enter this status during migration:
migrating: quiescing, waiting for agents to report back: 1079 succeeded, 3 still to report
Then SSH into each graylog unit and restart the jujud-machine-* service. Restarting this service before the migration starts does not help.
The juju model migration proceeds after that.
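The workaround above can be sketched as a small helper that recognizes the quiescing status message and builds one restart command per graylog unit. This is only an illustration under stated assumptions: the unit names (graylog/0 and so on) are hypothetical, the units are assumed to run systemd, and it is assumed that `juju ssh` can still reach the units while the model is migrating (if it cannot, plain `ssh` to the machine address would be needed instead). The commands are printed, not executed.

```python
import re

# Matches the status text quoted in this report.
QUIESCING_RE = re.compile(
    r"migrating: quiescing, waiting for agents to report back: "
    r"(\d+) succeeded, (\d+) still to report"
)


def parse_quiescing_status(message):
    """Return (succeeded, pending) for a quiescing status message, else None."""
    match = QUIESCING_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def restart_commands(units):
    """Build one restart command per unit (assumed invocation, not executed)."""
    return [
        ["juju", "ssh", unit, "sudo systemctl restart 'jujud-machine-*'"]
        for unit in units
    ]


status = (
    "migrating: quiescing, waiting for agents to report back: "
    "1079 succeeded, 3 still to report"
)
parsed = parse_quiescing_status(status)

# Only act once the migration is in the quiescing phase with agents pending,
# since restarting the service before the migration does not help.
if parsed is not None and parsed[1] > 0:
    # Hypothetical unit names for a three-unit Graylog deployment.
    for cmd in restart_commands(["graylog/0", "graylog/1", "graylog/2"]):
        print(" ".join(cmd))
```

The pending count is checked before any command is produced because, per the report, the restart is only effective while the migration is waiting for agents to report back.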
This has already been observed on 4 production clouds (different environments and customers) while migrating models from bionic-based to focal-based Juju controllers. The Juju version seems to be irrelevant; the issue was observed when migrating models with:
juju 2.9.22 on source and target controller
juju 2.9.32 on source and target controller
juju 2.9.33 on source and target controller
juju 2.9.34 source to juju 2.9.37 target
It happens with Graylog running on 3 units and also with Graylog running on only 1 unit.
During all attempted migrations, Graylog was the only affected application.