Single lxd machine stuck on pending, Container started

Bug #1956981 reported by Bas de Bruijne
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
Canonical Juju  Triaged  High        Unassigned

Bug Description

Single lxd machine stuck on pending:

------------------------------------------
Machine  State    DNS           Inst id              Series  AZ     Message
0        started  10.244.8.128  azurill              focal   zone1  Deployed
0/lxd/0  started  10.244.8.176  juju-7c9753-0-lxd-0  focal   zone1  Container started
0/lxd/1  started  10.246.65.92  juju-7c9753-0-lxd-1  focal   zone1  Container started
0/lxd/2  started  10.244.8.178  juju-7c9753-0-lxd-2  focal   zone1  Container started
0/lxd/3  pending                pending              focal          Creating container
0/lxd/4  started  10.246.65.79  juju-7c9753-0-lxd-4  focal   zone1  Container started
0/lxd/5  started  10.244.8.177  juju-7c9753-0-lxd-5  focal   zone1  Container started
0/lxd/6  started  10.246.65.89  juju-7c9753-0-lxd-6  focal   zone1  Container started
0/lxd/7  started  10.244.8.179  juju-7c9753-0-lxd-7  focal   zone1  Container started
0/lxd/8  started  10.244.8.167  juju-7c9753-0-lxd-8  focal   zone1  Container started
0/lxd/9  started  10.244.8.175  juju-7c9753-0-lxd-9  focal   zone1  Container started
------------------------------------------
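To spot machines in this state without reading the table by eye, the same information can be pulled from the machine-readable status. The following is a minimal sketch, assuming the output of `juju status --format=json` has already been loaded into a dict, and assuming the key names `machines`, `containers`, `juju-status`, `machine-status` and `instance-id` match what the Juju version in use emits. The helper names `iter_machines` and `find_pending` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple


def iter_machines(machines: Dict[str, dict]) -> Iterator[Tuple[str, dict]]:
    """Yield (machine_id, info) for every machine and every nested container."""
    for machine_id, info in machines.items():
        yield machine_id, info
        # Assumption: containers are nested under a "containers" key.
        yield from iter_machines(info.get("containers", {}))


def find_pending(status: dict) -> List[Tuple[str, str, str]]:
    """Return (machine_id, instance_id, message) for machines whose agent state is pending."""
    stuck = []
    for machine_id, info in iter_machines(status.get("machines", {})):
        # Assumption: the agent state lives under "juju-status" -> "current".
        agent_state = info.get("juju-status", {}).get("current", "")
        if agent_state == "pending":
            message = info.get("machine-status", {}).get("message", "")
            stuck.append((machine_id, info.get("instance-id", ""), message))
    return stuck
```

A possible use is `find_pending(json.loads(output))`, where `output` is the captured text of `juju status --format=json`. For the table above this would be expected to report only `0/lxd/3`.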

In the logs:
------------------------------------------
var/log/kern.log:Jan 8 01:10:42 azurill kernel: [ 716.131356] audit: type=1400 audit(1641604242.710:332): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-juju-7c9753-0-lxd-3_<var-snap-lxd-common-lxd>" profile="/snap/snapd/14295/usr/lib/snapd/snap-confine" pid=32816 comm="snap-confine" family="netlink" sock_type="raw" protocol=15 requested_mask="send receive" denied_mask="send receive"
var/log/kern.log:Jan 8 01:10:42 azurill kernel: [ 716.193780] audit: type=1400 audit(1641604242.774:333): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-juju-7c9753-0-lxd-3_<var-snap-lxd-common-lxd>" profile="snap-update-ns.lxd" name="/apparmor/.null" pid=32874 comm="6" requested_mask="wr" denied_mask="wr" fsuid=1000000 ouid=0
var/log/kern.log-Jan 8 01:10:46 azurill kernel: [ 720.362843] kauditd_printk_skb: 13 callbacks suppressed
------------------------------------------
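Because similar denials appear for several containers, it can help to group them per container before comparing them. The sketch below is one way to do that, assuming each audit record is a single line of `key=value` pairs as in the excerpt above, and assuming the container name is embedded in the `namespace` field as `lxd-<name>_`. The function names `parse_denial`, `container_of` and `count_by_container` are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Matches key=value where the value is either "quoted" or a bare token.
_FIELD = re.compile(r'(\w+)=("([^"]*)"|\S+)')
# Assumption: the AppArmor namespace looks like root//lxd-<container>_<...>.
_CONTAINER = re.compile(r"lxd-(juju-\w+-\d+-lxd-\d+)_")


def parse_denial(line: str) -> Optional[Dict[str, str]]:
    """Return the key/value fields of an AppArmor DENIED line, or None for other lines."""
    if 'apparmor="DENIED"' not in line:
        return None
    fields = {}
    for key, raw, quoted in _FIELD.findall(line):
        # Keep the inner text of quoted values, the bare token otherwise.
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def container_of(fields: Dict[str, str]) -> Optional[str]:
    """Extract the container name from the namespace field, if present."""
    match = _CONTAINER.search(fields.get("namespace", ""))
    return match.group(1) if match else None


def count_by_container(lines: Iterable[str]) -> Counter:
    """Count DENIED records per container name."""
    counts: Counter = Counter()
    for line in lines:
        fields = parse_denial(line)
        if fields is not None:
            counts[container_of(fields) or "unknown"] += 1
    return counts
```

Feeding the lines of `kern.log` to `count_by_container` gives a quick view of which containers have denials at all; the parsed `operation`, `profile` and `denied_mask` fields can then be compared between a container that started and one that did not.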

Similar messages show up for the different containers on machine 0, but they are not quite the same.

Testrun:
https://solutions.qa.canonical.com/testruns/testRun/ab94b749-b99f-473b-8997-afa48c6815dd

Links to crashdumps:
https://oil-jenkins.canonical.com/artifacts/ab94b749-b99f-473b-8997-afa48c6815dd/index.html

Future occurrences of this bug can be found here:
https://solutions.qa.canonical.com/bugs/bugs/bug/1956981

Tags: cdo-qa
description: updated
Bas de Bruijne (basdbruijne) wrote :

This issue seems to be very active on jammy as well

summary: - Single lxd machine stuck on pending, creating container
+ Single lxd machine stuck on pending, Container started
Bas de Bruijne (basdbruijne) wrote :
Heather Lanigan (hmlanigan) wrote :

@basdbruijne, the solqa results in #2 do not show the same bug as this.

In the case of #2, the container starts, as does the juju agent on the container; however, it has errors.

$ grep pending juju-crashdump-openstack-2022-10-20-07.32.21/juju_status.txt
5/lxd/5 pending 10.246.167.51 juju-dd3b16-5-lxd-5 ubuntu:20.04 zone3 Container started

The machine agent shut down, starting with:
2022-10-20 03:46:29 DEBUG juju.worker.dependency engine.go:616 "unconverted-api-workers" manifold worker stopped: agent should be terminated
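A plain `grep pending` also matches rows where only the Inst id column reads `pending`, which is the "Creating container" case from the original description. To separate the two cases in a saved `juju_status.txt`, a column-aware check can be used. This is a rough sketch, assuming the tabular status layout shown in this report, with the State column second and the Message column last. The function name `pending_but_started` is hypothetical.

```python
from typing import List


def pending_but_started(status_text: str) -> List[str]:
    """Return machine ids whose State is pending while the Message says "Container started"."""
    hits = []
    for line in status_text.splitlines():
        tokens = line.split()
        # Assumption: the first token is the machine id, the second is the State column.
        if len(tokens) < 2 or tokens[1] != "pending":
            continue
        # Assumption: the Message column is the trailing text of the row.
        if line.rstrip().endswith("Container started"):
            hits.append(tokens[0])
    return hits
```

Applied to the crashdump row above, this would be expected to return `5/lxd/5`, while a row such as `0/lxd/3 pending pending focal Creating container` is skipped.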

John A Meinel (jameinel) wrote :

Solutions QA reports that they are also seeing failures to start containers on Jammy, and that this is blocking their OpenStack Jammy testing.

Changed in juju:
importance: Undecided → High
milestone: none → 2.9.38
status: New → Triaged
Bas de Bruijne (basdbruijne) wrote :

In testrun https://solutions.qa.canonical.com/testruns/testRun/fd79805c-8f0c-4965-af14-e01017439fe9 I looked around on the live env. Here, 2 machines are in this state:

```
Machine   State    Address         Inst id               Series  AZ     Message
0         started  10.246.167.190  solqa-lab1-server-07  jammy   zone1  Deployed
0/lxd/0   started  10.246.167.149  juju-6ba804-0-lxd-0   jammy   zone1  Container started
0/lxd/1   pending                  juju-6ba804-0-lxd-1   jammy   zone1  Container started
0/lxd/2   started  10.246.164.253  juju-6ba804-0-lxd-2   jammy   zone1  Container started
0/lxd/3   pending                  juju-6ba804-0-lxd-3   jammy   zone1  Container started
0/lxd/4   started  10.246.166.148  juju-6ba804-0-lxd-4   jammy   zone1  Container started
0/lxd/5   started  10.246.165.82   juju-6ba804-0-lxd-5   jammy   zone1  Container started
0/lxd/6   started  10.246.167.96   juju-6ba804-0-lxd-6   jammy   zone1  Container started
0/lxd/7   started  10.246.167.159  juju-6ba804-0-lxd-7   jammy   zone1  Container started
0/lxd/8   started  10.246.166.215  juju-6ba804-0-lxd-8   jammy   zone1  Container started
0/lxd/9   started  10.246.165.72   juju-6ba804-0-lxd-9   jammy   zone1  Container started
0/lxd/10  started  10.246.164.203  juju-6ba804-0-lxd-10  jammy   zone1  Container started
```

But logging on to the machines themselves shows no problem:
```
ubuntu@solqa-lab1-server-07:~$ sudo lxc list
To start your first container, try: lxc launch ubuntu:22.04
Or for a virtual machine: lxc launch ubuntu:22.04 --vm

+----------------------+---------+-----------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+----------------------+---------+-----------------------+------+-----------+-----------+
| juju-6ba804-0-lxd-0 | RUNNING | 10.246.173.8 (eth1) | | CONTAINER | 0 |
| | | 10.246.172.111 (eth1) | | | |
| | | 10.246.169.47 (eth0) | | | |
| | | 10.246.168.111 (eth0) | | | |
| | | 10.246.167.149 (eth2) | | | |
+----------------------+---------+-----------------------+------+-----------+-----------+
| juju-6ba804-0-lxd-1 | RUNNING | 10.246.176.28 (eth1) | | CONTAINER | 0 |
| | | 10.246.172.62 (eth0) | | | |
+----------------------+---------+-----------------------+------+-----------+-----------+
| juju-6ba804-0-lxd-2 | RUNNING | 10.246.172.251 (eth1) | | CONTAINER | 0 |
| | | 10.246.168.255 (eth0) | | | |
| | | 10.246.164.253 (eth2) | | | |
+----------------------+---------+-----------------------+------+-----------+-----------+
| juju-6ba804-0-lxd-3 | RUNNING | 10.246.169.48 (eth0) | | CONTAINER | 0 |
+----------------------+---------+-----------------------+------+-----------+-----------+
| juju-6ba804-0-lxd-4 | ...
```

tags: added: cdo-qa
Changed in juju:
milestone: 2.9.38 → 2.9.39
Moises Emilio Benzan Mora (moisesbenzan) wrote :
John A Meinel (jameinel)
description: updated
Changed in juju:
milestone: 2.9.39 → 2.9.40
Changed in juju:
milestone: 2.9.40 → 2.9.41
Cristovao Cordeiro (cjdc) wrote :

I confirm this happens too, consistently, when running the tutorial from https://juju.is/docs/sdk/build-and-deploy-minimal-machine-charm

$ juju version
2.9.38-ubuntu-amd64

Changed in juju:
milestone: 2.9.41 → 2.9.42
Changed in juju:
milestone: 2.9.42 → 2.9.43
Changed in juju:
milestone: 2.9.43 → 2.9.44
Changed in juju:
milestone: 2.9.44 → 2.9.45
Changed in juju:
milestone: 2.9.45 → 2.9.46