Local-provider lxc Failed to create lxc_container

Bug #1414016 reported by Curtis Hovey
Affects: juju-core
Status: Fix Released
Importance: Critical
Assigned to: Dimiter Naydenov

Bug Description

gitbranch:master:github.com/juju/juju f6088670

All local-provider lxc jobs for trusty, utopic, and vivid have failed across all archs. Only precise amd64 passed.
http://juju-ci.vapour.ws:8080/job/local-deploy-trusty-amd64/752/console
http://juju-ci.vapour.ws:8080/job/local-deploy-utopic-amd64/895/console
http://juju-ci.vapour.ws:8080/job/local-deploy-vivid-amd64/17/console
http://juju-ci.vapour.ws:8080/job/local-deploy-trusty-ppc64/890/console
http://juju-ci.vapour.ws:8080/job/local-deploy-trusty-i386/706/console

ERROR 1 is in state container failed to start and failed to destroy: manual cleanup of containers needed: error executing "lxc-start": network is not created for 'lxc.network.hwaddr' = '00:16:3e:cd:f5:42' option; Failed to parse config: lxc.network.hwaddr = 00:16:3e:cd:f5:42; Failed to create lxc_container

The only suspect is
Commit f608867 Merge pull request #1468 from dimitern/lxc-network-config-propagation

Curtis Hovey (sinzui)
description: updated
Changed in juju-core:
milestone: none → 1.23
description: updated
description: updated
Changed in juju-core:
assignee: nobody → Dimiter Naydenov (dimitern)
Dimiter Naydenov (dimitern) wrote :

Sorry about this! I'm investigating now.

However, the logs are not sufficient to understand what the problem is. I need to see the generated config and logs from /var/lib/juju/containers/* (cloud-init, container.log, console.log), /var/lib/juju/removed-containers/* (if any; same files), /var/lib/lxc/juju-*template/config, and /var/lib/lxc/jenkins-*/config.

It will also be much easier to see what's going on if all these jobs were configured with "logging-config: <root>=INFO, juju.container=TRACE", as most of the operations around containers are heavily logged on TRACE and DEBUG levels. This will be much more useful for analyzing such failures in the future.

Changed in juju-core:
status: Triaged → In Progress
Dimiter Naydenov (dimitern) wrote :

I found the issue: in the generated lxc config for the affected containers, "hwaddr" appears before any other lxc.network.* setting, which causes lxc to refuse to parse the config.

After consulting both hallyn and the lxc project source, the simplest solution is to ensure in all generated lxc config files any lxc.network.* lines appear *after* the (first or only) lxc.network.type = ... line. I'm working on a fix now.
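
To illustrate the ordering constraint described above, here is a minimal Go sketch. It is not the code from the actual fix; reorderNetworkLines is a hypothetical helper written only for this example. It moves the first lxc.network.type line ahead of any other lxc.network.* line and leaves everything else in place.

```go
package main

import "strings"

// reorderNetworkLines is a hypothetical helper (not taken from juju).
// It moves the first "lxc.network.type" line so that it precedes every
// other "lxc.network." line, keeping all remaining lines in their
// original relative order. If there is no type line, or it already
// comes first, the config is returned unchanged.
func reorderNetworkLines(config string) string {
	lines := strings.Split(config, "\n")
	typeIdx, firstNetIdx := -1, -1
	for i, line := range lines {
		key := strings.TrimSpace(line)
		if !strings.HasPrefix(key, "lxc.network.") {
			continue
		}
		if firstNetIdx == -1 {
			firstNetIdx = i
		}
		if strings.HasPrefix(key, "lxc.network.type") {
			typeIdx = i
			break
		}
	}
	if typeIdx == -1 || typeIdx == firstNetIdx {
		return config
	}
	typeLine := lines[typeIdx]
	// Shift the network lines that preceded the type line down by one,
	// then place the type line at the front of the network block.
	copy(lines[firstNetIdx+1:typeIdx+1], lines[firstNetIdx:typeIdx])
	lines[firstNetIdx] = typeLine
	return strings.Join(lines, "\n")
}
```

For example, given a config whose network section starts with "lxc.network.hwaddr = 00:16:3e:cd:f5:42" followed by "lxc.network.type = veth" (the veth value is assumed here for illustration), the helper returns the same lines with the type line first.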

Dimiter Naydenov (dimitern) wrote :

After a lot of testing and a few approaches attempted, I'm proposing the following fix for this:
https://github.com/juju/juju/pull/1480

Changed in juju-core:
status: In Progress → Fix Committed
Dimiter Naydenov (dimitern) wrote :

@curtis, @mgz:
It seems all of the jobs above have passed successfully after my fix was committed. Should we unblock CI now?

Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.23 → 1.23-beta1