after upgrading model from 1.25 to v2.x, juju migrate fails on lxd containers
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Triaged | Low | Unassigned |
Bug Description
Juju client 2.6.4, current model (and controller) 2.6.4 all using Trusty.
The original install was 1.2x, maas 1.9, with multiple units using LXC containers.
This particular model was upgraded to 2.x, and the containers migrated to lxd as part of that process.
When I try to migrate the model to a new controller (Xenial), I get the following:
:~/bootstack-
00:17:21 INFO juju.cmd supercommand.go:57 running juju [2.6.4 gc go1.10.4]
00:17:21 DEBUG juju.cmd supercommand.go:58 args: []string{
00:17:21 INFO juju.juju api.go:67 connecting to API addresses: [10.101.
00:17:21 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://
00:17:21 INFO juju.api apiclient.go:624 connection established to "wss://
00:17:21 DEBUG juju.api monitor.go:35 RPC connection died
00:17:21 INFO juju.juju api.go:67 connecting to API addresses: [10.101.
00:17:21 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://
00:17:21 INFO juju.api apiclient.go:624 connection established to "wss://
00:17:21 DEBUG juju.api monitor.go:35 RPC connection died
00:17:21 INFO juju.juju api.go:67 connecting to API addresses: [10.101.
00:17:21 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://
00:17:21 INFO juju.api apiclient.go:624 connection established to "wss://
00:17:21 DEBUG juju.api monitor.go:35 RPC connection died
ERROR source prechecks failed: machine 0/lxd/0 not running
00:17:21 DEBUG cmd supercommand.go:496 error stack:
source prechecks failed: machine 0/lxd/0 not running
/build/
The container mentioned is one of the containers that was migrated from lxc.
Digging into the db with db.statuses.find({_id: {$regex: /0\/lxd\//}}).pretty(), I see that the 'status' and 'statusinfo' fields aren't populated on the containers that were migrated. Restarting the machine agent doesn't seem to affect this.
Good status:
{
"_id" : "43759692-
"neverset" : false,
"status" : "running",
},
"txn-queue" : [ ],
"txn-revno" : NumberLong(52),
"updated" : NumberLong(
}
Not so good:
{
"_id" : "43759692-
"neverset" : false,
"status" : "",
},
"txn-queue" : [ ],
"txn-revno" : NumberLong(2),
"updated" : NumberLong(
}
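The precheck failure follows directly from those empty fields. A minimal Python sketch of the effective check (the document shapes and ids here are illustrative assumptions trimmed from the db output above, not real Juju internals):

```python
import re

# Simplified status documents mirroring the good/bad shapes seen in
# db.statuses above (fields trimmed; ids are illustrative assumptions).
statuses = [
    {"_id": "uuid:m#0/lxd/17", "status": "running"},  # populated after 2.x
    {"_id": "uuid:m#0/lxd/0", "status": ""},          # migrated from lxc, empty
]

# Equivalent of: db.statuses.find({_id: {$regex: /0\/lxd\//}})
lxd_docs = [d for d in statuses if re.search(r"0/lxd/", d["_id"])]

# The source precheck effectively requires every machine's status to be
# "running"; an empty status trips
# "source prechecks failed: machine 0/lxd/0 not running".
not_running = [d["_id"] for d in lxd_docs if d["status"] != "running"]
print(not_running)
```

Any container whose status was never repopulated after the lxc-to-lxd migration will fail this check, regardless of whether the container is actually up.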
juju show-machine shows something very similar:
Good:
0/lxd/17:
current: started
since: 26 Jun 2019 04:30:57Z
version: 2.6.4
dns-name: 10.101.52.166
- 10.101.52.166
- 10.101.60.1
current: running
message: Container started
since: 02 May 2019 21:22:54Z
current: applied
since: 26 Jun 2019 05:32:03Z
series: trusty
eth0:
- 10.101.52.166
- 10.101.52.6
space: space-0
is-up: true
eth1:
- 10.101.60.1
space: space-0
is-up: true
Not so good:
0/lxd/0:
current: started
since: 26 Jun 2019 04:30:05Z
version: 2.6.4
dns-name: 10.101.52.90
- 10.101.52.90
since: 01 Oct 2018 05:51:41Z
current: applied
since: 26 Jun 2019 05:32:03Z
series: trusty
hardware: arch=amd64
0/lxd/14:
current: started
since: 26 Jun 2019 04:30:07Z
version: 2.6.4
dns-name: 10.101.52.96
- 10.101.52.241
- 10.101.52.96
since: 01 Oct 2018 05:51:41Z
current: applied
since: 26 Jun 2019 05:32:03Z
series: trusty
hardware: arch=amd64
(note also the lack of network-interfaces)
We need to find out why the machine agents on the lxd machines aren't updating the link-layer devices or the provider status.