rabbitmq-server takes a long time to process amqp-relation-changed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Autopilot Log Analyser | Fix Committed | High | Francis Ginther |
Landscape Server | New | Undecided | Francis Ginther |
rabbitmq-server (Juju Charms Collection) | Invalid | High | David Ames |
Bug Description
I'm seeing this issue with the current next charm:
cs:~openstack-
Deployed by landscape autopilot (17.01~
https:/
https:/
https:/
https:/
The attached logs are extracted from build 641.
It appears that the rabbitmq charm is taking a long time to process amqp-relation-
[landscape-
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
ubuntu 10561 0.0 0.0 36232 3372 ? R 23:56 0:00 \_ ps fauxww
...
root 9567 0.0 0.0 19700 3264 ? Ss 22:47 0:00 bash /var/lib/
0/exec-start.sh
root 9571 0.0 0.3 301872 58020 ? Sl 22:47 0:03 \_ /var/lib/
d unit --data-dir /var/lib/juju --unit-name rabbitmq-server/0 --debug
root 4421 1.7 0.3 132120 63264 ? S 23:45 0:10 \_ /usr/bin/python /var/lib/
root 10398 0.0 0.0 4508 1672 ? S 23:56 0:00 \_ /bin/sh /usr/sbin/
root 10406 0.0 0.0 49344 3160 ? S 23:56 0:00 \_ su rabbitmq -s /bin/sh -c /usr/lib/
rabbitmq 10415 0.0 0.0 4508 700 ? Ss 23:56 0:00 \_ sh -c /usr/lib/
rabbitmq 10417 0.0 0.1 377080 17612 ? Sl 23:56 0:00 \_ /usr/lib/
The juju unit log for rabbitmq also shows consistent processing of amqp relation hooks up until the dump is collected. In this deployment, nova-cloud-compute is still waiting on its messaging relation to be completed.
The autopilot deployment failed because the landscape-client subordinate running on the same unit as rabbitmq-server/0 never got a chance to complete and communicate back to landscape. It appears that juju was only ever running rabbitmq-server hooks (note: juju hooks running on a unit are serialized, so it's possible that the landscape-client hooks are queued up waiting for the rabbitmq-server hooks to finish).
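To illustrate the queuing effect described above, here is a minimal sketch of serialized hook execution on a single unit. It is not juju's implementation: the `run_serialized` helper, the hook names, and the durations are all hypothetical and chosen only to show how a short subordinate hook can be starved by a run of long hooks ahead of it.

```python
from collections import deque


def run_serialized(hooks):
    """Run (name, duration_seconds) hooks strictly one at a time.

    Returns a list of (name, start, end) tuples on a simulated clock.
    This is a toy model of per-unit hook serialization, not juju code.
    """
    queue = deque(hooks)
    clock = 0
    timeline = []
    while queue:
        name, duration = queue.popleft()
        start = clock
        clock += duration  # the next hook cannot start until this one ends
        timeline.append((name, start, clock))
    return timeline


# Hypothetical workload: five slow rabbitmq-server hooks queued ahead of
# one quick landscape-client hook. All durations are invented.
hooks = [("rabbitmq-server amqp-relation-changed", 600)] * 5
hooks.append(("landscape-client hook", 5))

timeline = run_serialized(hooks)
name, start, end = timeline[-1]
print(f"{name} waited {start}s before it could start")
```

In this toy model the subordinate's hook takes only 5 seconds, yet it cannot start until every hook queued ahead of it has finished, which is consistent with the landscape-client never reporting back before the deployment timed out.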
I've attached the logs from the rabbitmq-server and nova-cloud-compute units.
Changed in rabbitmq-server (Juju Charms Collection): | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → David Ames (thedac) |
milestone: | none → 17.01 |
Changed in landscape: | |
assignee: | nobody → Francis Ginther (fginther) |
Changed in autopilot-log-analyser: | |
assignee: | nobody → Francis Ginther (fginther) |
status: | New → In Progress |
importance: | Undecided → High |
summary: |
- rabbitmq-server takes a long time to process amqp-relation-joined
+ rabbitmq-server takes a long time to process amqp-relation-changed |
Changed in autopilot-log-analyser: | |
status: | In Progress → Fix Committed |
Changed in landscape: | |
milestone: | 17.01 → 17.02 |
I don't have the exact juju configuration used at the time of the failure, but the only options that landscape appears to be tweaking are "min-cluster-size: 3" and "nagios_context: region1". All the other options should be set to their defaults.