rmq next charm: config-changed hook fails when deployed to lxc

Bug #1475320 reported by Ryan Beisner
This bug affects 1 person
Affects: rabbitmq-server (Juju Charms Collection)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The rabbitmq-server next charm works on bare metal but not in a container; the stable charm works fine on both.

After deploy, the rabbit startup errors out with:

ERROR: node with name "rabbit" already running on "10-245-173-99"

DIAGNOSTICS
===========

nodes in question: ['rabbit@10-245-173-99']

hosts, their running nodes and ports:
- 10-245-173-99: [{rabbit,44415},
                  {rabbitmqctl11575,47036},
                  {rabbitmqprelaunch11590,41368}]

current node details:
- node name: 'rabbitmqprelaunch11590@juju-machine-0-lxc-0'
- home dir: /var/lib/rabbitmq
- cookie hash: /Pe4Mo4pg2tLXWUO0aB7hQ==

After stopping the rabbitmq-server service on the unit, a rabbitmq server process is still running.

Killing that leftover process with kill -9 and then starting the rabbitmq-server service seems to resolve the problem in place.
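
The manual workaround above could be scripted roughly as follows. This is only a sketch: the assumptions that the leftover broker runs as the `rabbitmq` user and that its command line contains `beam` are not taken from this report, and the function names are made up for illustration. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch of the stop / kill -9 / start workaround described in this report.
# ASSUMPTION: the leftover process is owned by user "rabbitmq" and has
# "beam" in its command line; adjust the pgrep pattern for your unit.

# Print PIDs of leftover Erlang VM processes owned by rabbitmq (may be none).
leftover_rabbit_pids() {
    pgrep -u rabbitmq -f beam 2>/dev/null || true
}

# Stop the service, force-kill anything that survived, then start it again.
# With DRY_RUN=1 each command is echoed rather than executed.
restart_rabbit_clean() {
    local run="" pid
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    $run sudo service rabbitmq-server stop
    for pid in $(leftover_rabbit_pids); do
        $run sudo kill -9 "$pid"
    done
    $run sudo service rabbitmq-server start
}
```

Running `DRY_RUN=1 restart_rabbit_clean` on the unit first shows which processes would be killed before anything is changed.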

## REPRODUCER:
To remove the surrounding complexity of OpenStack and bundles, here is a simplified approach to reproducing this bug.

juju bootstrap --constraints "arch=amd64 tags=uosci"
mkdir trusty

# grab next charm
bzr branch lp:~openstack-charmers/charms/trusty/rabbitmq-server/next trusty/rabbitmq-server

# grab next charm with proposed fix
bzr branch lp:~james-page/charms/trusty/rabbitmq-server/fixup-configure-nodename trusty/rabbitmq-server-fix1

# grab stable charm
bzr branch lp:charms/trusty/rabbitmq-server trusty/rabbitmq-server-stable

# Manually change "name:" to "rabbitmq-server-fix1" in trusty/rabbitmq-server-fix1/metadata.yaml.
# Manually change "name:" to "rabbitmq-server-stable" in trusty/rabbitmq-server-stable/metadata.yaml.

# Deploy next charm to a container (FAIL):
juju deploy --to lxc:0 --repository=./ local:trusty/rabbitmq-server

# Deploy next charm to bare metal (OK):
juju add-unit rabbitmq-server

# Deploy proposed charm fix to a container (FAIL):
juju deploy --to lxc:0 --repository=./ local:trusty/rabbitmq-server-fix1 # fails

# Deploy proposed charm fix to bare metal (OK):
juju add-unit rabbitmq-server-fix1 # succeeds, causes rmq to install to bare metal, no lxc.

# Deploy stable charm to a container (OK):
juju deploy --to lxc:0 --repository=./ local:trusty/rabbitmq-server-stable

# Deploy stable charm to bare metal (OK):
juju add-unit rabbitmq-server-stable

# juju stat after reproducer:
juju stat --format tabular
juju stat --format yaml

SEE --> http://paste.ubuntu.com/11888683/

## RESULTANT CONFS:
jenkins@juju-osci-machine-13:~/bzr⟫ juju run --all "cat /etc/rabbitmq/rabbitmq-env.conf"
# ignore 0
- MachineId: "0"
  Stderr: "Warning: Permanently added '10.245.168.11' (ECDSA) to the list of known
    hosts.\r\ncat: /etc/rabbitmq/rabbitmq-env.conf: No such file or directory\n"

# next charm in lxc
- MachineId: 0/lxc/0
    RABBITMQ_NODENAME=rabbit@10-245-173-99

# proposed fix charm in lxc
- MachineId: 0/lxc/1
    RABBITMQ_NODENAME=rabbit@10-245-173-100

# stable charm in lxc
- MachineId: 0/lxc/2
    RABBITMQ_NODENAME=rabbit@10-245-173-101

# next charm on bare metal
- MachineId: "1"
    RABBITMQ_NODENAME=rabbit@fat-machine

# proposed fix charm on bare metal
- MachineId: "2"
    RABBITMQ_NODENAME=rabbit@cylindrical-base

# stable charm on bare metal
- MachineId: "3"
    RABBITMQ_NODENAME=rabbit@grizzled-family
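
The confs above show the lxc units using a dashed-IP host in RABBITMQ_NODENAME, while the diagnostics in the description report the local node as `...@juju-machine-0-lxc-0`. Whether that difference is the cause is not established here, but a quick comparison may help when triaging. The sketch below uses hypothetical helper names and only does string comparison.

```shell
#!/bin/bash
# Sketch: compare the host part of RABBITMQ_NODENAME with a hostname.
# The function names are illustrative and not part of the charm.

# "rabbit@10-245-173-99" -> "10-245-173-99"
nodename_host() {
    printf '%s\n' "${1#*@}"
}

# Print "match" if the node name's host part equals the given hostname,
# otherwise print "mismatch".
check_nodename() {
    local nodename="$1" host="$2"
    if [ "$(nodename_host "$nodename")" = "$host" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}
```

On a unit this could be called as `check_nodename "$(grep '^RABBITMQ_NODENAME=' /etc/rabbitmq/rabbitmq-env.conf | cut -d= -f2)" "$(hostname -s)"`.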

## JUJU/MAAS VERSION INFO:
# juju version:
jenkins@juju-osci-machine-13:~/bzr⟫ apt-cache policy juju
juju:
  Installed: 1.24.2-0ubuntu1~14.04.1~juju1
  Candidate: 1.24.2-0ubuntu1~14.04.1~juju1
  Version table:
 *** 1.24.2-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/stable/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.20.11-0ubuntu0.14.04.1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     1.18.1-0ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

# maas version:
ubuntu@lescina:~$ apt-cache policy maas
maas:
  Installed: 1.8.0+bzr4001-0ubuntu2~trusty1
  Candidate: 1.8.0+bzr4001-0ubuntu2~trusty1
  Version table:
 *** 1.8.0+bzr4001-0ubuntu2~trusty1 0
        500 http://ppa.launchpad.net/maas-maintainers/experimental/ubuntu/ trusty/main amd64 Packages
        500 http://ppa.launchpad.net/maas-maintainers/stable/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.5.4+bzr2294-0ubuntu1.3 0
        500 http://archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
     1.5.4+bzr2294-0ubuntu1.2 0
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
     1.5+bzr2252-0ubuntu1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages

Ryan Beisner (1chb1n) wrote :

It is worth noting that reverse DNS is technically "working" in this environment and scenario, but the PTR records for units in lxc are handled slightly differently by MAAS. http://paste.ubuntu.com/11888267/

Ryan Beisner (1chb1n) wrote :

Demonstrating that A and PTR records succeed from the vantage point of the unit, yet the install fails even with the proposed fix:

ubuntu@juju-machine-0-lxc-0:/var/lib/juju/agents/unit-rabbitmq-server-0/charm$ host 10.245.173.92
92.173.245.10.in-addr.arpa domain name pointer 10-245-173-92.dellstack.

ubuntu@juju-machine-0-lxc-0:/var/lib/juju/agents/unit-rabbitmq-server-0/charm$ host 10-245-173-92
10-245-173-92.dellstack has address 10.245.173.92

unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 Error: unable to connect to node 'rabbit@10-245-173-92': nodedown
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 DIAGNOSTICS
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 ===========
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 nodes in question: ['rabbit@10-245-173-92']
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 hosts, their running nodes and ports:
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 - 10-245-173-92: [{rabbit,41684},{rabbitmqctl8680,35629}]
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 current node details:
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 - node name: 'rabbitmqctl8680@juju-machine-0-lxc-0'
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 - home dir: /var/lib/rabbitmq
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 - cookie hash: 2bb6Ir+XIpVCUFy2DXh5xw==
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 Traceback (most recent call last):
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/config-changed", line 777, in <module>
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 hooks.execute(sys.argv)
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed logger.go:40 File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/charmhelpers/core/hookenv.py", line 603, in execute
unit-rabbitmq-server-0[900]: 2015-07-16 15:27:51 INFO unit.rabbitmq-server/0.config-changed...
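
For repeating the A and PTR checks from this comment on other units, the two names involved can be derived from the unit's IPv4 address. This is a sketch with made-up helper names; the dashed host label reflects what the PTR records look like in this environment and may differ elsewhere.

```shell
#!/bin/bash
# Sketch: derive the lookup names used in the A/PTR checks from an IPv4 address.

# 10.245.173.92 -> 92.173.245.10.in-addr.arpa
ptr_name() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    printf '%s.%s.%s.%s.in-addr.arpa\n' "$d" "$c" "$b" "$a"
}

# 10.245.173.92 -> 10-245-173-92 (host label seen in this environment)
dashed_host() {
    printf '%s\n' "${1//./-}"
}
```

For example, `host "$(dashed_host 10.245.173.92)"` repeats the forward lookup shown above, and `ptr_name 10.245.173.92` gives the name that the reverse lookup queries.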


Ryan Beisner (1chb1n) wrote : Re: rmq next charm: pkg install fails when deployed to lxc

While DNS may still play a role, I've changed the focus of the bug to whether or not the charm is deployed inside a container, i.e. it works on bare metal but not inside a container.

summary:
- rmq next charm: pkg install fails when reverse dns fails - Error: unable to connect to node 'rabbit@10-245-173-55': nodedown
+ rmq next charm: pkg install fails when deployed to lxc
description: updated
Ryan Beisner (1chb1n) wrote :

Added log attachments for units 0..3.

Ryan Beisner (1chb1n)
description: updated
Ryan Beisner (1chb1n)
description: updated
summary: - rmq next charm: pkg install fails when deployed to lxc
+ rmq next charm: config-changed hook fails when deployed to lxc
Ryan Beisner (1chb1n)
description: updated
Ryan Beisner (1chb1n) wrote :

FYI - this is a blocker for all of our bare metal next-charm testing. I believe I've got a fix, will link and propose after confirming.

Ryan Beisner (1chb1n) wrote :

Added lp:~1chb1n/charms/trusty/rabbitmq-server/fixup-configure-nodename2 to the same test environment as rabbitmq-server-fix2, and the config-changed hook and the other hooks completed successfully.

I have a full openstack deploy test underway using this branch to confirm functionality and will post back with results.

Ryan Beisner (1chb1n) wrote :

FYI, juju stat with rabbitmq-server-fix2 deployed to lxc successfully: http://paste.ubuntu.com/11889788/

Ryan Beisner (1chb1n) wrote :

Woot. Next rmq should be unblocked with fix2.

Juju stat and misc api checks a-ok:
http://paste.ubuntu.com/11890002/

Bare metal next charm mojo spec deploy test:
http://10.245.162.77:8080/job/mojo_runner_baremetal/88/

Mojo output:
http://paste.ubuntu.com/11890017/

Liam Young (gnuoy)
Changed in rabbitmq-server (Juju Charms Collection):
status: New → Fix Released