I've had deployment of an HA environment fail three attempts in a row. It appears that 'rabbitmqctl list-users' is running before the service has initialized so the command fails; the last time it happened I sshed to the node as soon as the error was reported and ran the command manually with no trouble. Before the fourth attempt, I hacked a "rescue retry" into /etc/puppet/modules/rabbitmq/lib/puppet/provider/rabbitmq_user/rabbitmqctl.rb; I neglected to add logging so I don't know if it worked because of the hack or because the timing just worked out, but the deployment was successful. I won't be able to do any more test deployments though.
def self.instances
rabbitmqctl('list_users').split(/\n/)[1..-2].collect do |line|
if line =~ /^(\S+)(\s+\S+|)$/
new(:name => $1)
else
raise Puppet::Error, "Cannot parse invalid user line: #{line}"
end
end
rescue
retry
end
VERSION:
feature_groups:
- mirantis
production: "docker"
release: "5.1"
api: "1.0"
build_number: "11"
build_id: "2014-09-17_21-40-34"
astute_sha: "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13"
fuellib_sha: "d9b16846e54f76c8ebe7764d2b5b8231d6b25079"
ostf_sha: "64cb59c681658a7a55cc2c09d079072a41beb346"
nailgun_sha: "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d"
fuelmain_sha: "8ef433e939425eabd1034c0b70e90bdf888b69fd"
Failure log:
2014-10-03T22:13:53.667080+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) LAST SUCCESSFUL OP :
2014-10-03T22:13:53.667080+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker))
2014-10-03T22:13:53.667161+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) "start"
2014-10-03T22:13:53.667161+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) LSO: {"node-1.cde.ctc.com"=>"start"}
2014-10-03T22:13:53.667161+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) STATUS IS: running
2014-10-03T22:14:00.103519+01:00 debug: Executing '/usr/sbin/crm resource manage master_p_rabbitmq-server'
2014-10-03T22:14:00.369498+01:00 notice: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]/enable) enable changed 'true' to 'true'
2014-10-03T22:14:00.378113+01:00 debug: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]) The container Class[Rabbitmq::Service] will propagate my refresh event
2014-10-03T22:14:00.379173+01:00 info: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]) Evaluated in 13.62 seconds
2014-10-03T22:14:00.380314+01:00 info: (Class[Rabbitmq::Service]) Starting to evaluate the resource
2014-10-03T22:14:00.388625+01:00 debug: (Class[Rabbitmq::Service]) The container Stage[main] will propagate my refresh event
2014-10-03T22:14:00.389772+01:00 info: (Class[Rabbitmq::Service]) Evaluated in 0.01 seconds
2014-10-03T22:14:00.390886+01:00 info: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Starting to evaluate the resource
2014-10-03T22:14:00.393611+01:00 debug: Executing '/usr/sbin/rabbitmqctl list_users'
2014-10-03T22:14:00.617753+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Could not evaluate: Execution of '/usr/sbin/rabbitmqctl list_users' returned 2: Listing users ...
2014-10-03T22:14:00.618017+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Error: unable to connect to node 'rabbit@node-1': nodedown
2014-10-03T22:14:00.618017+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest])
2014-10-03T22:14:00.618017+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) DIAGNOSTICS
2014-10-03T22:14:00.618017+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) ===========
2014-10-03T22:14:00.619011+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest])
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) attempted to contact: ['rabbit@node-1']
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest])
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) rabbit@node-1:
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) * connected to epmd (port 4369) on node-1
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) * epmd reports: node 'rabbit' not running at all
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) other nodes on node-1: [murano]
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) * suggestion: start the node
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest])
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) current node details:
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) - node name: 'rabbitmqctl11897@node-1'
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) - home dir: /var/lib/rabbitmq
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) - cookie hash: soeIWU2jk2YNseTyDSlsEA==
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest])
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest])
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util/execution.rb:188:in `execute'
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/provider/command.rb:23:in `execute'
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/provider.rb:237:in `rabbitmqctl'
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/provider.rb:444:in `send'
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/provider.rb:444:in `rabbitmqctl'
2014-10-03T22:14:00.619141+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /etc/puppet/modules/rabbitmq/lib/puppet/provider/rabbitmq_user/rabbitmqctl.rb:30:in `exists?'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/property/ensure.rb:81:in `retrieve'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/type.rb:1035:in `retrieve'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/type.rb:1063:in `retrieve_resource'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:223:in `from_resource'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:17:in `evaluate'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:174:in `apply'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:187:in `eval_resource'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:117:in `call'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:117:in `evaluate'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:327:in `thinmark'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/1.8/benchmark.rb:308:in `realtime'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:326:in `thinmark'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:117:in `evaluate'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/graph/relationship_graph.rb:118:in `traverse'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:108:in `evaluate'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/resource/catalog.rb:164:in `apply'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:149:in `with_destination'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/transaction/report.rb:108:in `as_logging_destination'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/resource/catalog.rb:163:in `apply'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:125:in `apply_catalog'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:161:in `benchmark'
2014-10-03T22:14:00.619249+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/1.8/benchmark.rb:308:in `realtime'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:160:in `benchmark'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:124:in `apply_catalog'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:192:in `run'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application/apply.rb:268:in `apply_catalog'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application/apply.rb:218:in `main'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application/apply.rb:146:in `run_command'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:364:in `run'
2014-10-03T22:14:00.619304+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:470:in `plugin_hook'
2014-10-03T22:14:00.619584+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:364:in `run'
2014-10-03T22:14:00.619584+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:478:in `exit_on_fail'
2014-10-03T22:14:00.619584+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:364:in `run'
2014-10-03T22:14:00.619584+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:137:in `run'
2014-10-03T22:14:00.619584+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:91:in `execute'
2014-10-03T22:14:00.620421+01:00 err: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) /usr/bin/puppet:4
2014-10-03T22:14:00.620569+01:00 info: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Evaluated in 0.23 seconds
2014-10-03T22:14:00.621531+01:00 info: (Class[Rabbitmq::Server]) Starting to evaluate the resource
2014-10-03T22:14:00.624394+01:00 debug: (Class[Rabbitmq::Server]) The container Stage[main] will propagate my refresh event
2014-10-03T22:14:00.625514+01:00 info: (Class[Rabbitmq::Server]) Evaluated in 0.00 seconds
2014-10-03T22:14:00.631480+01:00 info: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Starting to evaluate the resource
2014-10-03T22:14:00.632915+01:00 notice: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Dependency Rabbitmq_user[guest] has failures: true
2014-10-03T22:14:00.634836+01:00 warning: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Skipping because of failed dependencies
2014-10-03T22:14:00.635985+01:00 info: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Evaluated in 0.00 seconds
Success log:
/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:1082014-10-03T23:05:40.041782+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) LAST SUCCESSFUL OP :
2014-10-03T23:05:40.041883+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker))
2014-10-03T23:05:40.041883+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) "start"
2014-10-03T23:05:40.041883+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) LSO: {"node-1.cde.ctc.com"=>"start"}
2014-10-03T23:05:40.041883+01:00 debug: (Service[p_rabbitmq-server](provider=pacemaker)) STATUS IS: running
2014-10-03T23:05:48.965064+01:00 debug: Executing '/usr/sbin/crm resource manage master_p_rabbitmq-server'
2014-10-03T23:05:49.225535+01:00 notice: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]/ensure) ensure changed 'stopped' to 'running'
2014-10-03T23:05:49.226619+01:00 debug: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]) The container Class[Rabbitmq::Service] will propagate my refresh event
2014-10-03T23:05:49.227711+01:00 info: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]) Unscheduling refresh on Service[p_rabbitmq-server]
2014-10-03T23:05:49.228819+01:00 info: (/Stage[main]/Rabbitmq::Service/Service[p_rabbitmq-server]) Evaluated in 41.22 seconds
2014-10-03T23:05:49.229938+01:00 info: (Class[Rabbitmq::Service]) Starting to evaluate the resource
2014-10-03T23:05:49.238204+01:00 debug: (Class[Rabbitmq::Service]) The container Stage[main] will propagate my refresh event
2014-10-03T23:05:49.239321+01:00 info: (Class[Rabbitmq::Service]) Evaluated in 0.01 seconds
2014-10-03T23:05:49.240443+01:00 info: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Starting to evaluate the resource
2014-10-03T23:05:49.243512+01:00 debug: Executing '/usr/sbin/rabbitmqctl list_users'
2014-10-03T23:05:49.478386+01:00 debug: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Nothing to manage: no ensure and the resource doesn't exist
2014-10-03T23:05:49.479483+01:00 info: (/Stage[main]/Rabbitmq::Server/Rabbitmq_user[guest]) Evaluated in 0.24 seconds
2014-10-03T23:05:49.480594+01:00 info: (Class[Rabbitmq::Server]) Starting to evaluate the resource
2014-10-03T23:05:49.489413+01:00 debug: (Class[Rabbitmq::Server]) The container Stage[main] will propagate my refresh event
2014-10-03T23:05:49.490495+01:00 info: (Class[Rabbitmq::Server]) Evaluated in 0.01 seconds
2014-10-03T23:05:49.497557+01:00 info: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Starting to evaluate the resource
2014-10-03T23:05:49.509255+01:00 notice: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]/ensure) created
2014-10-03T23:05:49.510331+01:00 debug: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) The container Class[Neutron::Quota] will propagate my refresh event
2014-10-03T23:05:49.511408+01:00 info: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Scheduling refresh of Service[neutron-server]
2014-10-03T23:05:49.511655+01:00 info: (/Stage[main]/Neutron::Quota/Neutron_config[quotas/quota_router]) Evaluated in 0.01 seconds
Frank, thank you for posting the bug report. We have not yet experienced such an issue, so we need a more verbose logging output to see whether there were any errors related to pacemaker/corosync, puppet traces and rabbitmq logs. Is it possible for you to upload a diagnostic snapshot of your deployment, so that we could analyze it and try to reproduce your environment?