Sequence of events that led to the problem:
1. On DB node nodeg6, because of the trace below, the nova logs grew to 30+ GB and consumed all available disk space.
2. This left the cassandra process in a bad state: its status showed ACTIVE, but it was not listening on port 9160.
3. The setup was running with 2 DB nodes. With one DB node down, the read and write consistency quorum could not be met.
4. Because of this, the discovery and API services started seeing the errors [listed in previous post].
Actions to be taken:
a. For R1.10, DB nodes should be deployed in odd numbers. When upgrading from a previous release, this should be addressed first.
b. Discovery needs to address the issue in its error code path.
c. We should show the correct state of the DB process.
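The quorum arithmetic behind the 2-node failure can be sketched as follows. This is a minimal illustration, assuming the replication factor equals the number of DB nodes (the post does not state the replication factor) and using the standard QUORUM rule of floor(RF / 2) + 1 replicas. The function names are made up for this example.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM is still reachable."""
    return replication_factor - quorum(replication_factor)


# Assumption: replication factor == number of DB nodes in the setup.
for rf in (1, 2, 3, 4, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"tolerated failures={tolerated_failures(rf)}")
```

Under that assumption, 2 nodes need both replicas for quorum and tolerate no failures, while 3 nodes tolerate one. Going from 3 to 4 nodes adds no extra tolerance, which is why odd node counts are preferred.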
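One possible way to approach action (c) is to combine the process status with a probe of the port the process is supposed to serve. The sketch below is only an illustration, not the actual status implementation: the function names and the state strings "DOWN" and "ACTIVE_NOT_LISTENING" are hypothetical, and port 9160 is taken from the description above.

```python
import socket


def is_port_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False


def db_state(process_active: bool, host: str, port: int = 9160,
             timeout: float = 2.0) -> str:
    """Combine the supervisor's view with a port probe.

    The state names returned here are hypothetical, chosen for this example.
    """
    if not process_active:
        return "DOWN"
    if is_port_listening(host, port, timeout):
        return "ACTIVE"
    return "ACTIVE_NOT_LISTENING"
```

With a check like this, the situation in step 2 (process reported ACTIVE but nothing listening on 9160) would show up as a distinct state instead of plain ACTIVE.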
2014-08-29 06:25:02.107 30834 TRACE nova.openstack.common.rpc.common ConnectionError: 530: (NOT_ALLOWED - attempt to reuse consumer tag '1', (60, 20), None)
2014-08-29 06:25:02.107 30834 TRACE nova.openstack.common.rpc.common
2014-08-29 06:25:02.108 30834 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 10.204.217.46:5672
2014-08-29 06:25:02.117 30834 INFO nova.openstack.common.rpc.common [-] Connected to AMQP server on 10.204.217.46:5672
2014-08-29 06:25:02.118 30834 ERROR nova.openstack.common.rpc.common [-] Failed to consume message from queue: 530: (NOT_ALLOWED - attempt to reuse consumer tag '1', (60, 20), None)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common Traceback (most recent call last):
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 577, in ensure
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common return method(*args, **kwargs)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 655, in _consume
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common queues_tail.consume(nowait=False)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 191, in consume
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common self.queue.consume(*args, callback=_callback, **options)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 595, in consume
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common nowait=nowait)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 1769, in basic_consume
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common (60, 21), # Channel.basic_consume_ok
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 69, in wait
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common self.channel_id, allowed_methods)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 237, in _wait_method
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common self.wait()
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 71, in wait
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common return self.dispatch_method(method_sig, args, content)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 88, in dispatch_method
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common return amqp_method(self, args)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 491, in _close
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common raise ConnectionError(reply_code, reply_text, (class_id, method_id))
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common ConnectionError: 530: (NOT_ALLOWED - attempt to reuse consumer tag '1', (60, 20), None)
2014-08-29 06:25:02.118 30834 TRACE nova.openstack.common.rpc.common
2014-08-29 06:25:02.119 30834 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 10.204.217.46:5672