juju deploy --to lxd does not create base machine

Bug #1590960 reported by Nate Finch
Affects: Canonical Juju
Status: Fix Released
Importance: Critical
Assigned to: Christian Muirhead
Milestone: 2.0-beta9

Bug Description

how to repro:

juju deploy ubuntu --to lxd

This will create the service, but no new machine for the container will be created.

I see this error in the logs (not sure if it's related):

machine-0: 2016-06-09 20:14:30 ERROR juju.apiserver.imagemetadata metadata.go:244 encountered cannot read index data, cannot read data for source "default cloud images" at URL https://streams.canonical.com/juju/images/releases/streams/v1/index.sjson: openpgp: signature made by unknown entity while getting published images metadata from default cloud images

and then I get this (which obviously is related):

machine-0: 2016-06-09 20:19:18 ERROR juju.state.unit unit.go:719 unit ubuntu/0 cannot get assigned machine: unit "ubuntu/0" is not assigned to a machine

Tags: deploy lxd
Revision history for this message
Nate Finch (natefinch) wrote :

machine-0.log attached

Revision history for this message
Nate Finch (natefinch) wrote :

example juju status --format=yaml output:

model: default
machines: {}
applications:
  mysql:
    charm: cs:trusty/mysql-38
    exposed: false
    application-status:
      current: unknown
      message: Waiting for agent initialization to finish
      since: 09 Jun 2016 16:18:34-04:00
    relations:
      cluster:
      - mysql
    units:
      mysql/0:
        workload-status:
          current: unknown
          message: Waiting for agent initialization to finish
          since: 09 Jun 2016 16:18:34-04:00
        juju-status:
          current: allocating
          since: 09 Jun 2016 16:18:34-04:00
  ubuntu:
    charm: cs:ubuntu-0
    exposed: false
    application-status:
      current: unknown
      message: Waiting for agent initialization to finish
      since: 09 Jun 2016 16:18:24-04:00
    units:
      ubuntu/0:
        workload-status:
          current: unknown
          message: Waiting for agent initialization to finish
          since: 09 Jun 2016 16:18:24-04:00
        juju-status:
          current: allocating
          since: 09 Jun 2016 16:18:24-04:00

Revision history for this message
Nate Finch (natefinch) wrote :

Curtis said that deploying to a container via a bundle still works, and juju add-machine lxd works (creates a machine with an lxd container on it). And I can then deploy --to 0/lxd/0 ... so it's only the straight deploy --to lxd that seems to be broken, as far as I can tell.

Changed in juju-core:
status: New → Triaged
importance: Undecided → Critical
milestone: none → 2.0-beta9
Changed in juju-core:
assignee: nobody → Christian Muirhead (2-xtian)
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

I don't recall deploy --to lxd ever having worked; it's not documented anywhere and there are no examples. I think the real issue here is that we should report an error for "--to lxd" in deploy, the same way we do for 'deploy --to lxd:'.

The only reason I can think of for using --to lxd with deploy is if I had a machine called 'lxd' and wanted to deploy there.

Also, it's not LXD-specific; --to lxc and --to kvm behave the same way.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Interesting: 'juju deploy ubuntu --to bogus' adds the unit and a pending machine, which quickly transitions to an error state (e.g. on MAAS: cannot run instances: cannot run instances: ServerError: 409 CONFLICT (No available node matches constraints: name=bogus)).

It looks like placement parsing treats 'lxd', 'lxc', and 'kvm' on their own specially, rather than as hostnames.

Revision history for this message
Cheryl Jennings (cherylj) wrote :

Looks like this problem exists on 1.25 as well, but there it at least gives an error during deploy:

$ juju deploy ubuntu --to lxc
Added charm "cs:xenial/ubuntu-0" to the environment.
ERROR adding new machine to host unit "ubuntu/0": cannot add a new machine: machine not found

$ juju status
environment: amazon
machines:
  "0":
    agent-state: started
    agent-version: 1.25.5
    dns-name: 52.38.197.82
    instance-id: i-7e04f2d1
    instance-state: running
    series: xenial
    hardware: arch=amd64 cpu-cores=1 cpu-power=300 mem=3840M root-disk=8192M availability-zone=us-west-2a
    state-server-member-status: has-vote
services:
  ubuntu:
    charm: cs:xenial/ubuntu-0
    exposed: false
    service-status:
      current: unknown
      message: Waiting for agent initialization to finish
      since: 10 Jun 2016 11:16:50-05:00
    units:
      ubuntu/0:
        workload-status:
          current: unknown
          message: Waiting for agent initialization to finish
          since: 10 Jun 2016 11:16:50-05:00
        agent-status:
          current: allocating
          since: 10 Jun 2016 11:16:50-05:00
        agent-state: pending

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

FWIW, that's a terrible error; it doesn't give you any useful information.

I've verified against different providers, and most fail with placement strings that don't contain "=" (all except MAAS, CloudSigma, and Joyent). Joyent fails with any non-empty placement, while CloudSigma implements PrecheckInstance (used to validate such things) but simply returns nil. MAAS assumes a placement string without "=" is a node hostname.

Revision history for this message
Nate Finch (natefinch) wrote :

FYI, placement can include things like 0 for a machine ID, or 0/lxc/1 for a container ID, so it definitely should not be restricted to strings containing an = sign. I think the other forms are provider-specific hacks layered on top of normal placement.

Revision history for this message
Christian Muirhead (2-xtian) wrote :

After talking about this, we decided to make it consistent with the behaviour of add-machine <container-type>: deploy --to lxd will create a new machine with a new lxd container, and deploy to that container. This doesn't add any more provider error-checking than the add-machine case has, so you can still end up in a failed state if you deploy --to kvm on an AWS controller, for example.

PR: https://github.com/juju/juju/pull/5598

Changed in juju-core:
status: Triaged → In Progress
tags: added: blocker
Changed in juju-core:
status: In Progress → Fix Committed
tags: removed: blocker
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
affects: juju-core → juju
Changed in juju:
milestone: 2.0-beta9 → none
milestone: none → 2.0-beta9