Pacemaker RA for RabbitMQ cannot detect when its app hangs and keeps it unresponsive
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Mirantis OpenStack | Fix Committed | Critical | Bogdan Dobrelya | |
5.0.x | Invalid | Critical | Fuel Library (Deprecated) | |
5.1.x | Won't Fix | Critical | Fuel Library (Deprecated) | |
6.0.x | Fix Committed | Critical | Bogdan Dobrelya | |
6.1.x | Fix Committed | Critical | Bogdan Dobrelya | |
Bug Description
[root@fuel ~]# fuel --fuel-version
api: '1.0'
astute_sha: f7cda2171b0b677
auth_required: true
build_id: 2015-02-07_20-50-01
build_number: '76'
feature_groups:
- mirantis
fuellib_sha: 64f3ebe9fcbd18b
fuelmain_sha: c799e3a6d88289e
nailgun_sha: 2ef819732a3ee7a
ostf_sha: 3b57985d4d21555
production: docker
release: 6.0.1
release_versions:
2014.2-6.0.1:
VERSION:
api: '1.0'
astute_sha: f7cda2171b0b677
build_id: 2015-02-07_20-50-01
build_number: '76'
feature_
- mirantis
fuellib_sha: 64f3ebe9fcbd18b
fuelmain_sha: c799e3a6d88289e
nailgun_sha: 2ef819732a3ee7a
ostf_sha: 3b57985d4d21555
production: docker
release: 6.0.1
Baremetal, Ubuntu, HA, Neutron-
Controllers: 3, Computes: 96
I have changed /usr/bin/
147 config.
This change is related to https:/
Deployment completed successfully, but during a full Rally test run the primary controller node was marked as offline. The node is also unreachable via SSH.
[root@fuel ~]# ssh node-19
Warning: Permanently added 'node-19' (RSA) to the list of known hosts.
Write failed: Broken pipe
However, I still have one open SSH session to the node, which lets me run some commands.
Here is the output of the top command:
http://
root@node-19:~# free -m
             total       used       free     shared    buffers     cached
Mem:         32142      31768        373          0        211      11263
-/+ buffers/cache:      20292      11849
Swap:        15624         12      15612
"rabbitmqctl cluster_status" and "rabbitmqctl list_queues" commands just hang on this node
From another controller node:
root@node-52:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-52' ...
[{nodes,
{running_
{cluster_
{partitions,[]}]
...done.
root@node-52:~# rabbitmqctl list_queues | grep -v 0$
Listing queues ...
dhcp_agent.node-19 96
notifications.error 415
reply_0c7bc35f0
...done.
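Since rabbitmqctl blocks indefinitely on node-19 while it answers normally on node-52, any health check that simply waits for rabbitmqctl to return would block as well. The following is a minimal sketch of a time-bounded probe, not the fix committed for this bug. The function name `probe_with_timeout`, the 30-second limit in the usage comment, and the reliance on the GNU coreutils `timeout` command are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the actual resource agent):
# run a command under a time limit so that a hung command becomes a
# detectable failure instead of blocking the caller forever.
# Assumes the GNU coreutils `timeout` command is installed.
probe_with_timeout() {
    limit="$1"; shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        # timeout(1) exits with 124 when the time limit was reached.
        echo "hung"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed"
    fi
    return "$rc"
}

# Example with an assumed 30-second limit:
#   probe_with_timeout 30 rabbitmqctl list_queues
```

In a monitor action, a "hung" result could be reported as a failure so that the cluster manager can react. A suitable limit depends on how long the command legitimately takes under load.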
root@node-19:~# dmesg | grep -i error
[ 9.798790] ACPI Error: [\_SB_.PRAD]
[ 10.883460] ACPI Error: Method parse/execution failed [\_GPE._L24] (Node ffff880853d9d3e8), AE_NOT_FOUND (20131115/
[ 16.284591] ioapic: probe of 0000:00:05.4 failed with error -22
[ 17.779631] ERST: Error Record Serialization Table (ERST) support is initialized.
[ 31.029678] EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro
The crm status output is here:
http://
The last line in the RabbitMQ log is:
=INFO REPORT==== 18-Feb-
accepting AMQP connection <0.9669.490> (192.168.0.54:41674 -> 192.168.0.21:5673)
A snapshot will be posted here as soon as possible.
Changed in mos:
milestone: none → 6.0.2
milestone: 6.0.2 → 6.0.1
importance: Undecided → Critical
status: New → In Progress
description: updated
I can't reach bash even via IPMI.
It looks very similar to https://bugs.launchpad.net/fuel/+bug/1422186