juju deployed LXD inherit constraints from the host causing lxd deploys to fail

Bug #1685782 reported by james beedy
This bug affects 2 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Witold Krecicki
Milestone: 2.2-rc1

Bug Description

As of 2.1.2, my lxd instances fail to deploy when the 'spaces' constraint is supplied for the host using the AWS provider.

This is critical because it puts a dead stop to anything being deployed to lxd on a host that uses the spaces constraint after 2.1.2.

To reproduce (must be Juju 2.1.2):

$ juju deploy ubuntu --constraints "spaces=myspace" # gets id 0

$ juju deploy ubuntu ubuntu-lxd --to lxd:0

The ubuntu-lxd deploy fails: the LXD container shows a "down" status with the message "unable to find host bridge for space(s) "myspace" for container".

Juju 2.1.2 (Fail) - http://paste.ubuntu.com/24427794/

Juju 2.0.3 (Working) - http://paste.ubuntu.com/24427822/

Revision history for this message
james beedy (jamesbeedy) wrote :

This is blocking my use case 100%. I can't spin up my own pre-2.1 controller and migrate my JAAS models over to it, because migrations aren't supported for versions below 2.1.x, and I can't deploy anything more on the JAAS controller due to this. My lxd deploys are dead in the water until this gets patched.

summary: - juju deployed LXD inherit constraints from the host causing lxd to fail
+ juju deployed LXD inherit constraints from the host causing lxd deploys
+ to fail
Revision history for this message
John A Meinel (jameinel) wrote :

So I have the feeling we've always had a bug wrt containers accidentally inheriting the space from their host machine. However, in 2.0 we ignored space constraints for containers. In 2.1 we explicitly try to put containers into the spaces that you've asked for.

On AWS we are unable to get IP addresses and configure host machines to allow containers to share in the same space as the host machine, so they are actually effectively only in the "" (unnamed/unknown) space.
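As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not Juju's actual code: `Host`, `container_spaces`, `host_bridges_for` and the `honour_spaces` switch are hypothetical names invented for this example, and the error text is only modelled on the message quoted in the bug description. It assumes that a container with no spaces of its own falls back to the host's spaces, and that an AWS host has no bridge for any named space.

```python
from dataclasses import dataclass, field


class NoHostBridgeError(Exception):
    """Raised when the host has no bridge in a space the container wants."""


@dataclass
class Host:
    machine_id: str
    spaces: set  # spaces the host machine is in (from its constraints)
    bridges: dict = field(default_factory=dict)  # space name -> bridge names


def container_spaces(requested, host, honour_spaces):
    """Return the set of spaces the container will be placed in."""
    if not honour_spaces:
        # 2.0-style behaviour: space constraints are ignored for containers.
        return set()
    if requested:
        return set(requested)
    # Nothing requested: fall back to the host's spaces (the inheritance
    # this bug is about).
    return set(host.spaces)


def host_bridges_for(container_id, requested, host, honour_spaces=True):
    """Return the host bridges to attach the container to, or raise."""
    wanted = container_spaces(requested, host, honour_spaces)
    missing = sorted(s for s in wanted if not host.bridges.get(s))
    if missing:
        quoted = ", ".join(f'"{s}"' for s in missing)
        raise NoHostBridgeError(
            f"unable to find host bridge for space(s) {quoted} "
            f"for container {container_id!r}"
        )
    return sorted(b for s in wanted for b in host.bridges[s])


# Assumed AWS-like host: it is in "myspace" but has no bridge for it.
aws_host = Host(machine_id="0", spaces={"myspace"})
print(host_bridges_for("0/lxd/0", set(), aws_host, honour_spaces=False))
try:
    host_bridges_for("0/lxd/0", set(), aws_host, honour_spaces=True)
except NoHostBridgeError as err:
    print("failed:", err)
```

With `honour_spaces=False` the sketch mimics the 2.0 behaviour, where the container never asks for a space; with `honour_spaces=True` it mimics 2.1, where the inherited space cannot be satisfied on the host.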

It should be possible to do:
  juju deploy host-app --bind space
  juju deploy container-app --to lxd:MACHINE --bind ""

We may have to play around with syntax, as empty strings need to be interpreted at the right time.

In a bundle it would look like

  app:
    bindings:
      "": ""

I believe.

Or as a machine constraint:

  juju deploy foo --constraints 'spaces=""'

james beedy (jamesbeedy)
description: updated
Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel I've just spun something up now with the added '--bind ""' you suggested, and it didn't seem to help, see http://paste.ubuntu.com/24447874/

Here is a direct comparison of the same commands being run on a 2.0.2 vs a 2.1.2 (I already had this in the works, so I may as well post it).

2.0.2 - http://paste.ubuntu.com/24447738/

2.1.2 - http://paste.ubuntu.com/24447858/

Revision history for this message
james beedy (jamesbeedy) wrote :

`juju show-machine 0` for 2.1.2^ http://paste.ubuntu.com/24447884/

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel, your next suggestion gives

$ juju deploy ubuntu ubuntu-lxd-test --to lxd:34 --constraints 'spaces=""'
ERROR bad "spaces" constraint: "\"\"" is not a valid space name

and

$ juju deploy ubuntu ubuntu-lxd-test --to lxd:34 --constraints "spaces=''"
ERROR bad "spaces" constraint: "''" is not a valid space name

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 1685782] Re: juju deployed LXD inherit constraints from the host causing lxd deploys to fail

Can you try just setting the constraints rather than the binding?
"juju deploy --to lxd:X --constraints spaces=" ?

John
=:->

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel, juju took that command, but I have the same broken result of the lxd state "down"

Revision history for this message
james beedy (jamesbeedy) wrote :

The latest command, `juju deploy --to lxd:X --constraints spaces=`, does give a new message though: http://paste.ubuntu.com/24448266/

Revision history for this message
james beedy (jamesbeedy) wrote :

Ahh, disregard #8. It seems it only has that "failed to start instance" bit while it is retrying; following the timeout the message reads the same as the rest. The Juju-deployed container still seems to inherit the host's space even with spaces= ... strange

Revision history for this message
John A Meinel (jameinel) wrote :

Do you have a constraint or binding on the application you are trying to
deploy as well? Or is it only provisioning the machine that has a space
constraint?

I wonder if you're actually running into this one:
  logger.Debugf("container %q not qualified to a space, host machine %q is using spaces %s",
      containerId, m.Id(), network.QuoteSpaceSet(hostSpaces))

Can you enable DEBUG level logging and include the log output from the
controller when trying to deploy the container?

  juju model-config -m controller logging-config="<root>=INFO;juju.network=DEBUG"

is probably sufficient. (the actual module where that would be logged looks
to be juju.network.containerizer)
I often just do:
    juju model-config -m controller logging-config="<root>=DEBUG"
when I'm not sure.

It *might* also be useful to do:
  juju model-config -m controller logging-config="<root>=DEBUG;juju.state=TRACE"
but that is likely to be a *lot* of logging.
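
The logging-config values above all follow the same "module=LEVEL" pattern joined by semicolons. A small Python sketch of building and splitting such a string follows. The helper names are hypothetical, and the list of level names is an assumption about Juju's logging library rather than something stated in this thread.

```python
# Level names are an assumption about Juju's logging library.
LEVELS = ("TRACE", "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def build_logging_config(levels):
    """Join {module: level} pairs into a "module=LEVEL;module=LEVEL" string."""
    parts = []
    for module, level in levels.items():
        level = level.upper()
        if level not in LEVELS:
            raise ValueError(f"unknown level {level!r} for {module!r}")
        parts.append(f"{module}={level}")
    return ";".join(parts)


def parse_logging_config(value):
    """Split a logging-config string back into a {module: level} dict."""
    result = {}
    for part in value.split(";"):
        if not part:
            continue
        module, _, level = part.partition("=")
        result[module.strip()] = level.strip()
    return result


config = build_logging_config({"<root>": "INFO", "juju.network": "DEBUG"})
# Print the command line rather than running it.
print(f'juju model-config -m controller logging-config="{config}"')
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without a controller.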

I'm a bit curious what this would be reporting:
  logger.Tracef("machine %q found constraints %s and bindings %s",
      m.Id(), network.QuoteSpaceSet(spaces), network.QuoteSpaceSet(bindings))

Which would indicate what spaces we think the container wants to be in,
before we do a final pass.

John
=:->

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel These are the logs from the controller immediately after bootstrap for the commands:

juju add-machine --constraints "spaces=test-space"

juju deploy ubuntu --to lxd:0

Revision history for this message
John A Meinel (jameinel) wrote :

This seems to be the logs from the target machine, but not the logs of
the controller. Is it possible to have both?

2017-04-25 12:49:36 WARNING juju.network.containerizer bridgepolicy.go:341 container "0/lxd/0" wants spaces "test-space" could not find host "0" bridges for "test-space", found bridges []

This indicates that we're asking to get a network address in 'test-space',
which is presumably the host's space. That is something we are unable
to do for containers on AWS, as we have not implemented support for getting
an extra IP from AWS and configuring the instance to actually allow traffic
for another IP and MAC address for that machine.
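
For anyone searching their own logs for this warning, the following Python sketch pulls the container, host and wanted spaces out of a line shaped like the one quoted above. The regular expressions are inferred from that single sample line, so the exact format (especially how several spaces are quoted) is an assumption, and `find_missing_bridge` is a hypothetical helper name.

```python
import re

# Generic log line layout, inferred from the sample line in this thread.
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<module>\S+) (?P<location>\S+:\d+) (?P<message>.*)$"
)
# The specific "could not find host bridges" warning text.
MISSING_BRIDGE = re.compile(
    r'container "(?P<container>[^"]+)" wants spaces (?P<spaces>.+?) '
    r'could not find host "(?P<host>[^"]+)" bridges'
)


def find_missing_bridge(line):
    """Return container, host and wanted spaces for the warning, else None."""
    parsed = LOG_LINE.match(line.strip())
    if parsed is None or parsed["module"] != "juju.network.containerizer":
        return None
    hit = MISSING_BRIDGE.search(parsed["message"])
    if hit is None:
        return None
    return {
        "container": hit["container"],
        "host": hit["host"],
        # Assumes each space name is wrapped in double quotes.
        "spaces": re.findall(r'"([^"]+)"', hit["spaces"]),
    }


sample = (
    '2017-04-25 12:49:36 WARNING juju.network.containerizer bridgepolicy.go:341 '
    'container "0/lxd/0" wants spaces "test-space" could not find host "0" '
    'bridges for "test-space", found bridges []'
)
print(find_missing_bridge(sample))
```

Lines from other modules, or lines that do not match the layout, simply return `None`, so the function can be mapped over a whole log file.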

On Tue, Apr 25, 2017 at 4:56 PM, james beedy <email address hidden> wrote:

> ** Attachment added: "controller.logs.tar.gz"
> https://bugs.launchpad.net/juju/+bug/1685782/+attachment/4867757/+files/controller.logs.tar.gz

Revision history for this message
John A Meinel (jameinel) wrote :

Sorry, I was wrong. This does appear to be the correct log file, but at the
wrong logging level:

2017-04-25 12:36:03 DEBUG juju.worker.logger logger.go:50 reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG"

If this is a new bootstrap, you can use "juju bootstrap --debug". If you
want to use an existing system, you can use:
 juju model-config -m controller "logging-config=<root>=DEBUG"

John
=:->

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel, I gave you superuser on this controller and admin on the model. Ask rick_h for your register command ... I sent it to him so you guys could access it and get the info you need.

Changed in juju:
status: New → Incomplete
status: Incomplete → New
james beedy (jamesbeedy)
description: updated
Tim Penhey (thumper)
tags: added: constraints network spaces
Changed in juju:
status: New → Triaged
importance: Undecided → Medium
Witold Krecicki (wpk)
Changed in juju:
assignee: nobody → Witold Krecicki (wpk)
importance: Medium → High
John A Meinel (jameinel)
Changed in juju:
status: Triaged → In Progress
milestone: none → 2.2-rc1
Revision history for this message
Witold Krecicki (wpk) wrote :
Revision history for this message
John A Meinel (jameinel) wrote :

The associated patch has landed in time for 2.2-rc1. It is still possible that the revision that CI is able to greenlight won't include it, but likely it should handle this problem.

@jamesbeedy would you be able to test an --edge snap from tomorrow to see if it fixes this problem for you?

Changed in juju:
status: In Progress → Fix Committed
Revision history for this message
james beedy (jamesbeedy) wrote :
Revision history for this message
james beedy (jamesbeedy) wrote :

looks like I missed the `juju subnets` output in the paste above.

`juju subnets` <- http://paste.ubuntu.com/24742865/

Revision history for this message
John A Meinel (jameinel) wrote :

in 2.2 we started auto detecting subnets in preparation for future work
(which is why add-subnet is failing). you can still create spaces by
supplying the subnets to the add-space command.

so instead of

juju add-space foo
juju add-subnet subnet-abcdef foo

you just do

juju add-space foo subnet-1 subnet-2

note that subnets can be CIDR, provider id or the name (tag) of a subnet.

We are having a sprint at the end of June to specifically focus on cleaning
up space and subnet management (remove-space, updating subnets, etc).

It is clumsy right now, but you should be able to get configured.

John
=:->

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel that did it. many thanks

Revision history for this message
james beedy (jamesbeedy) wrote :

geh ... I may have been slightly premature with my reply .... it's been sitting there for a while now, with no apparent error, but also no IP: http://paste.ubuntu.com/24745029/

Revision history for this message
james beedy (jamesbeedy) wrote :

I then deployed a machine without any spaces constraint, and then deployed the lxd u2 to it, and it got an ip http://paste.ubuntu.com/24745117/

Revision history for this message
james beedy (jamesbeedy) wrote :

I still see the spaces constraint on 1/lxd/0 too http://paste.ubuntu.com/24745125/

Revision history for this message
Witold Krecicki (wpk) wrote :

If the paste is correctly indented, the space constraint is on "1", not "1/lxd/0" (although it is a UX error that it's printed after the containers).

I've just tried to reproduce it by doing:
juju add-machine --constraints "spaces=foobar" -> results in "1"
juju add-machine lxd:1

The result is:
1        started  54.229.137.144  i-0e1dae1b16cf14da2  xenial  eu-west-1a  running
1/lxd/0  started  10.0.215.241    juju-703836-1-lxd-0  xenial              Container started

Before the fix from PR7433 the container would never go past 'pending'.

Could you make sure that all controllers are upgraded to the newest version and that this is a version with this PR included? Could you also provide the exact steps? Maybe there's some small detail that I'm overlooking.

Revision history for this message
John A Meinel (jameinel) wrote :

Unfortunately, 'snap' doesn't seem to record the exact Git revision that it
was built with, so it is possible that the current '--edge' didn't have
Witold's fix.
It's also possible that you were running from tip and we really didn't fix
the problem.

Can you list the specific steps that you used? You showed the bootstrap and
'spaces' and 'subnet' commands that you ran, but not the

juju deploy X --constraints
juju deploy Y --to lxd:

etc.

I *think* we understand what you've been doing, but having the explicit
step-by-step can help make sure we're not doing something that
accidentally/magically makes it work.

Revision history for this message
John A Meinel (jameinel) wrote :

    containers:
      1/lxd/0:
        juju-status:
          current: pending
          since: 02 Jun 2017 05:12:29Z
        instance-id: juju-4ebaba-1-lxd-0
        machine-status:
          current: running
          message: Container started
          since: 02 Jun 2017 05:17:05Z
        series: xenial

That container doesn't have an IP address or device, so there is nowhere to
record which space we think it is in.
It is true that:

  constraints: spaces=testspace

Refers to the host machine 1, not to the container. (it just happens that
'constraints' sorts after 'containers' and thus gets ordered in an
unhelpful way.)

Other things that might be useful would be to 'juju ssh 1' and then
'lxc exec juju-4ebaba-1-lxd-0 bash' and then poke around at things like
/var/log/cloud-init-output.log
and
/var/log/juju/machine-1-lxd-0.log
and
ip a s
and
/etc/network/interfaces

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel can I just add you as a user and send you over the register command so you can get what you need?

Revision history for this message
John A Meinel (jameinel) wrote :

we can try that. we probably also need "juju ssh-import-id jameinel" though
that doesn't tell me the commands you ran.

you can send me the register string via email
John
=:->

Revision history for this message
Anastasia (anastasia-macmood) wrote :

I verified this fix against 2.2 tip:

1. bootstrapped on AWS with vpc config argument to ensure I can add spaces and deploy to them:

juju bootstrap aws mycontroller --config vpc-id=vpc-XXXX

2. added space:

juju add-space two 172.30.2.0/24

3. deploy ubuntu in "two" space:

juju deploy ubuntu --constraints "spaces=two"

4. deploy another ubuntu into the container on the machine created in 3, machine 0 in my case:

juju deploy ubuntu ubuntu-lxd --to lxd:0

Resulting status: http://pastebin.ubuntu.com/24863906/
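
The four verification steps above can be collected into a small Python script for repeat runs. This is only a sketch: the function names are hypothetical, `vpc-XXXX` is the placeholder from the steps and must be replaced with a real VPC id, and the target `lxd:0` assumes the first deployed machine gets id 0, as it did in this case. By default it only prints the commands.

```python
import shlex
import subprocess


def verification_commands(vpc_id, space, cidr, controller="mycontroller"):
    """Return the juju commands from the verification steps as argv lists."""
    return [
        ["juju", "bootstrap", "aws", controller, "--config", f"vpc-id={vpc_id}"],
        ["juju", "add-space", space, cidr],
        ["juju", "deploy", "ubuntu", "--constraints", f"spaces={space}"],
        # Assumes the machine created by the previous step is machine 0.
        ["juju", "deploy", "ubuntu", "ubuntu-lxd", "--to", "lxd:0"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$", shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(verification_commands("vpc-XXXX", "two", "172.30.2.0/24"))
```

Passing `dry_run=False` would run the commands against a real cloud, so that path needs valid AWS credentials and a real VPC id.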

Revision history for this message
Anastasia (anastasia-macmood) wrote :

And other interesting information about machine 0 and a container... http://pastebin.ubuntu.com/24863951/

Revision history for this message
John A Meinel (jameinel) wrote :

Since 2.2.0 final is now out, @jamesbeedy do you have a chance to try one more time? Then we can be sure you are using the exact version that we are describing.
Our testing has shown that it has been fixed, but it's possible we missed a step that you were doing differently than we expected.

Revision history for this message
james beedy (jamesbeedy) wrote :

@jameinel I can verify it's fixed! Thank you!

John A Meinel (jameinel)
Changed in juju:
status: Fix Committed → Fix Released