[18.08,Xenial-Queens] RMQ malfunctions although cluster status looks good; it was expected to self-recover from the crash
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack RabbitMQ Server Charm | In Progress | High | Liam Young |
Bug Description
Some info:
* rabbitmq-server 3.6.10-1~cloud0
* cs:rabbitmq-
* source=
Initial status:
* rmq/1, rmq/5, rmq/6
* containers on top of 3 different compute nodes
* the compute node hosting rmq/5 goes down (kernel panic) and needs to be rebooted
Next status (1):
* rmq is partitioned: https:/
* rmq/1 is stopped
* rmq/5 is stopped
* rmq/5 is started
* rmq/1 is started
Next status (2):
* rmq looks good: https:/
* However, client errors continue and remain very similar: https:/
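For reference, the cluster state at each step was inspected per unit. A minimal sketch of that check, assuming Juju unit names as reported above (the script only prints the commands rather than executing them; drop the accumulation to run the checks for real):

```shell
# Sketch: print the per-unit cluster-state checks.
# Unit numbers (/1, /5, /6) come from this report; adjust for your model.
units="1 5 6"
checks=""
for u in $units; do
  checks="${checks}juju run --unit rabbitmq-server/${u} 'sudo rabbitmqctl cluster_status'
"
done
# Print the commands that would be run; remove this indirection to execute them.
printf '%s' "$checks"
```

`rabbitmqctl cluster_status` reports running nodes and any detected partitions, which is how the partitioned state above was observed.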
In the end, we had to:
* stop all 3 units: /1, then /5, then /6
* start them again: /6, then /5, then /1
* after this, all clients were able to register with RMQ and work as expected
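The full-stop recovery above can be sketched as a small script. This is a sketch only: unit numbers are the ones from this report, and the script prints each Juju command instead of running it:

```shell
# Sketch of the ordered full-stop recovery described above:
# stop /1, then /5, then /6; start /6, then /5, then /1.
stop_order="1 5 6"
start_order="6 5 1"
cmds=""
for u in $stop_order; do
  cmds="${cmds}juju run --unit rabbitmq-server/${u} 'sudo systemctl stop rabbitmq-server'
"
done
for u in $start_order; do
  cmds="${cmds}juju run --unit rabbitmq-server/${u} 'sudo systemctl start rabbitmq-server'
"
done
# Print the commands that would be run; remove this indirection to execute them.
printf '%s' "$cmds"
```

Stopping the whole cluster and starting the crashed unit's peers first forces a clean cluster re-formation, which is why this worked where per-unit restarts did not.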
Application configuration is:
"""
rabbitmq-server:
bindings:
? ''
: oam-space
amqp: internal-space
cluster: internal-space
charm: cs:rabbitmq-
num_units: 3
options:
min-
queue_
source: cloud:xenial-queens
to:
- lxd:20
- lxd:19
- lxd:16
"""
cluster_
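Since the cluster did not heal on its own, RabbitMQ's partition-handling mode may be relevant. A hedged sketch, assuming the charm exposes a `cluster-partition-handling` option mapped to RabbitMQ's `cluster_partition_handling` setting (again printed rather than executed):

```shell
# Assumption: the rabbitmq-server charm exposes a cluster-partition-handling
# config option (RabbitMQ modes: ignore / pause_minority / autoheal).
# With the default "ignore", a partitioned cluster will not self-recover.
cfg="juju config rabbitmq-server cluster-partition-handling=pause_minority"
# Print the command that would be run; remove this indirection to execute it.
printf '%s\n' "$cfg"
```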
Changed in charm-rabbitmq-server:
status: New → Triaged

Changed in charm-rabbitmq-server:
milestone: none → 19.04
assignee: nobody → Shane Peters (shaner)

Changed in charm-rabbitmq-server:
status: Fix Committed → Fix Released
Flagging as high, pending validation against master or the 18.11 charm revisions. If the issue can still be reproduced there, keep the priority high; if it is resolved, advise a charm upgrade.