[2.8/stable] caas-unit-init hangs forever while initializing the workload pod
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Triaged | Low | Unassigned |
Bug Description
tl;dr Trying to deploy an Operator framework charm on top of microk8s with Juju 2.8, but the init container never reaches the "Ready" state, leaving the workload pod stuck in the "Init:0/1" status forever. The same steps with Juju 2.7 work fine, so I think it's a Juju issue.
Steps to reproduce:
1. Create an AWS VM, t2.xlarge, AMI ID ubuntu/
ubuntu@
microk8s v1.18.4 from Canonical✓ installed
ubuntu@
juju 2.8.0 from Canonical✓ installed
ubuntu@
Name Version Rev Tracking Publisher Notes
amazon-ssm-agent 2.3.714.0 1566 latest/stable/… aws✓ classic
core 16-2.45 9289 latest/stable canonical✓ core
core18 20200427 1754 latest/stable canonical✓ base
juju 2.8.0 12370 latest/stable canonical✓ classic
microk8s v1.18.4 1503 latest/stable canonical✓ classic
ubuntu@
ubuntu@
Cloning into 'charm-
remote: Enumerating objects: 60, done.
remote: Counting objects: 100% (60/60), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 795 (delta 25), reused 36 (delta 14), pack-reused 735
Receiving objects: 100% (795/795), 181.29 KiB | 914.00 KiB/s, done.
Resolving deltas: 100% (470/470), done.
ubuntu@
ubuntu@
Submodule 'mod/jinja' (https:/
Submodule 'mod/operator' (https:/
Cloning into '/home/
Cloning into '/home/
Submodule path 'mod/jinja': checked out 'b5f454559fd714
Submodule path 'mod/operator': checked out 'beca3da58af148
ubuntu@
Enabling DNS
Applying manifest
serviceaccount/
configmap/coredns created
deployment.
service/kube-dns created
clusterrole.
clusterrolebind
Restarting kubelet
DNS is enabled
Enabling Kubernetes Dashboard
Enabling Metrics-Server
clusterrole.
clusterrolebind
rolebinding.
apiservice.
serviceaccount/
deployment.
service/
clusterrole.
clusterrolebind
clusterrolebind
Metrics-Server is enabled
Applying manifest
serviceaccount/
service/
secret/
secret/
secret/
configmap/
role.rbac.
clusterrole.
rolebinding.
clusterrolebind
deployment.
service/
deployment.
If RBAC is not enabled access the dashboard using the default token retrieved with:
token=$(microk8s kubectl -n kube-system get secret | grep default-token | cut -d " " -f1)
microk8s kubectl -n kube-system describe secret $token
In an RBAC enabled setup (microk8s enable RBAC) you need to create a user with restricted
permissions as shown in:
https:/
The registry will be created with the default size of 20Gi.
You can use the "size" argument while enabling the registry, eg microk8s.enable registry:size=30Gi
Enabling default storage class
deployment.
storageclass.
serviceaccount/
clusterrole.
clusterrolebind
Storage will be available soon
Applying registry manifest
namespace/
persistentvolum
deployment.
service/registry created
The registry is enabled
Enabling default storage class
deployment.
storageclass.
serviceaccount/
clusterrole.
clusterrolebind
Storage will be available soon
Addon metrics-server is already enabled.
Enabling Ingress
namespace/ingress created
serviceaccount/
clusterrole.
role.rbac.
clusterrolebind
rolebinding.
configmap/
configmap/
configmap/
daemonset.
Ingress is enabled
ubuntu@
ubuntu@
ubuntu@
Connection to 18.184.45.5 closed.
<logged back>
ubuntu@
Since Juju 2 is being run for the first time, downloaded the latest public cloud information.
Creating Juju controller "mk8s" on microk8s/localhost
Creating k8s resources for controller "controller-mk8s"
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.152.183.61 to verify accessibility...
Bootstrap complete, controller "mk8s" is now available in namespace "controller-mk8s"
Now you can run
juju add-model <model-name>
to create a new model to deploy k8s workloads.
ubuntu@
ubuntu@
ubuntu@
Added 'lma' model on microk8s/localhost with credential 'microk8s' for user 'admin'
ubuntu@
ubuntu@
Deploying charm "local:
ubuntu@
############# Wait for some time to let it settle #########
ubuntu@
NAME READY STATUS RESTARTS AGE
pod/modeloperat
pod/prometheus-0 0/2 Init:0/1 0 110s
pod/prometheus-
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/
service/prometheus ClusterIP 10.152.183.105 <none> 80/TCP,443/TCP 111s
service/
service/
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.
NAME DESIRED CURRENT READY AGE
replicaset.
NAME READY AGE
statefulset.
statefulset.
ubuntu@
Name: prometheus-0
Namespace: lma
Priority: 0
Node: ip-172-
Start Time: Sat, 11 Jul 2020 12:50:15 +0000
Labels: controller-
Annotations: apparmor.
Status: Pending
IP: 10.1.11.12
IPs:
IP: 10.1.11.12
Controlled By: StatefulSet/
Init Containers:
juju-pod-init:
Container ID: containerd:
Image: jujusolutions/
Image ID: sha256:
Port: <none>
Host Port: <none>
Command:
/bin/sh
Args:
-c
export JUJU_DATA_
export JUJU_TOOLS_
mkdir -p $JUJU_TOOLS_DIR
cp /opt/jujud $JUJU_TOOLS_
initCmd=
if test -n "$initCmd"; then
$
else
exit 0
fi
State: Running
Started: Sat, 11 Jul 2020 12:50:15 +0000
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/lib/juju from juju-data-dir (rw)
/
Containers:
prometheus:
Container ID:
Image: prom/prometheus
Image ID:
Port: <none>
Host Port: <none>
Args:
-
-
-
-
-
-
-
-
-
-
-
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Liveness: http-get http://
Readiness: http-get http://
Environment: <none>
Mounts:
/
/prometheus from database-d3acf868 (rw)
/
/var/lib/juju from juju-data-dir (rw)
/
prometheus-nginx:
Container ID:
Image: nginx:1.19.0
Image ID:
Ports: 80/TCP, 443/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/
/
/var/lib/juju from juju-data-dir (rw)
/
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
database-
Type: PersistentVolumeClaim
ClaimName: database-
ReadOnly: false
juju-data-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
prometheus-
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-
Optional: false
prometheus-
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-
Optional: false
default-
Type: Secret (a volume populated by a Secret)
SecretName: default-token-lddqz
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m34s (x2 over 2m34s) default-scheduler running "VolumeBinding" filter plugin for pod "prometheus-0": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled 2m33s default-scheduler Successfully assigned lma/prometheus-0 to ip-172-31-40-244
Normal Pulled 2m32s kubelet, ip-172-31-40-244 Container image "jujusolutions/
Normal Created 2m32s kubelet, ip-172-31-40-244 Created container juju-pod-init
Normal Started 2m32s kubelet, ip-172-31-40-244 Started container juju-pod-init
ubuntu@
ubuntu@
2020-07-11 12:50:16 INFO juju.cmd supercommand.go:91 running jujud [2.8.0 0 d816abe62fbf678
2020-07-11 12:50:16 DEBUG juju.cmd supercommand.go:92 args: []string{
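The hang appears to be inside the juju-pod-init shell snippet shown in the pod spec above. A minimal, runnable sketch of its control flow follows; the real `initCmd` value is truncated in the log, so the stand-in here is hypothetical (in 2.8 it invokes a jujud caas-unit-init subcommand):

```shell
#!/bin/sh
# Sketch of the juju-pod-init init-container control flow taken from the
# pod spec above. The real initCmd assignment is truncated in the log;
# this echo is a hypothetical stand-in for the jujud caas-unit-init call.
initCmd="echo jujud caas-unit-init"

if test -n "$initCmd"; then
    # If this command never returns, the init container stays Running
    # but never Ready, and the workload pod is stuck at Init:0/1 --
    # exactly the symptom reported here.
    $initCmd
else
    # An empty initCmd exits immediately and the pod initializes.
    exit 0
fi
```

Comparing the two branches makes the failure mode clear: the pod only progresses past Init:0/1 once `$initCmd` exits, so a blocked caas-unit-init keeps the pod pending forever.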
Changed in juju: | |
assignee: | nobody → Yang Kelvin Liu (kelvin.liu) |
status: | New → Triaged |
status: | Triaged → In Progress |
Changed in juju: | |
importance: | Undecided → Critical |
milestone: | none → 2.8.2 |
Changed in juju: | |
assignee: | Yang Kelvin Liu (kelvin.liu) → nobody |
importance: | Critical → High |
status: | In Progress → Triaged |
Changed in juju: | |
milestone: | 2.8.2 → 2.8.3 |
Changed in juju: | |
milestone: | 2.8.4 → 2.9-beta1 |
Changed in juju: | |
milestone: | 2.9-beta1 → 2.9-rc1 |
Changed in juju: | |
milestone: | 2.9-rc1 → none |
importance: | High → Medium |
The prometheus charm fails for me also, on a microk8s without RBAC.
Deploying other k8s charms works, so there's some interaction there that needs debugging.
Also, since Juju 2.6 you don't need to set operator-storage manually - Juju detects the type of k8s cluster and uses the appropriate storage.
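To illustrate that last point, a minimal microk8s bootstrap needs no storage model-config at all; a sketch of the commands (to be run against a live microk8s, with a hypothetical local charm path):

```shell
# Since Juju 2.6 the k8s cluster type is detected automatically, so no
# operator-storage / workload-storage settings are needed on microk8s.
microk8s enable dns storage   # the storage add-on provides the default StorageClass
juju bootstrap microk8s mk8s  # no --config operator-storage=... required
juju add-model lma
juju deploy ./prometheus      # hypothetical path to the local charm
```

The storage add-on matters: without a default StorageClass the workload PVC stays unbound, which is the "unbound immediate PersistentVolumeClaims" scheduling warning visible in the events above.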