[1.22] Charmed k8s fails to attach PV from csi-cinder - attachdetach-controller: Attach timeout for volume

Bug #1945259 reported by Nobuto Murata
Affects      Status        Importance  Assigned to  Milestone
CDK Addons   Fix Released  High        Cory Johns

Bug Description

The underlying Cinder volume gets stuck at "available" and never moves to "attaching" or "in-use", so the PV never becomes available to the pod. For the record, the following steps work with v1.21.5, so this is a regression in a broader context.

$ kubectl version --short
...
Server Version: v1.22.2

How to reproduce:

1. Deploy Charmed k8s with openstack-integrator on top of OpenStack

juju add-model k8s-on-openstack "openstack/${OS_REGION_NAME}"

wget -O ~ubuntu/k8s_bundle.yaml https://api.jujucharms.com/charmstore/v5/bundle/kubernetes-core/archive/bundle.yaml

# LP: #1936842
sed -i.bak -e 's/lxd:0/0/' ~ubuntu/k8s_bundle.yaml

# https://github.com/charmed-kubernetes/bundle/blob/master/overlays/openstack-lb-overlay.yaml
cat > ~ubuntu/openstack-lb-overlay.yaml <<EOF
applications:
  openstack-integrator:
    annotations:
      gui-x: "600"
      gui-y: "300"
    charm: cs:~containers/openstack-integrator
    num_units: 1
    trust: true
    to:
    - '0'
    options:
      lb-floating-network: ext_net
relations:
  - ['openstack-integrator:loadbalancer', 'kubernetes-master:loadbalancer']
  - ['openstack-integrator:clients', 'kubernetes-master:openstack']
  - ['openstack-integrator:clients', 'kubernetes-worker:openstack']
EOF

juju deploy --trust ~ubuntu/k8s_bundle.yaml \
    --overlay ~ubuntu/openstack-lb-overlay.yaml

2. Confirm that the "cdk-cinder" storage class has been set up by the charm

$ kubectl describe sc
Name: cdk-cinder
IsDefaultClass: No
Annotations: juju.io/workload-storage=true,kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"juju.io/workload-storage":"true"},"labels":{"cdk-addons":"true"},"name":"cdk-cinder"},"provisioner":"cinder.csi.openstack.org"}

Provisioner: cinder.csi.openstack.org
Parameters: <none>
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>

3. Create a PVC and a pod that mounts it

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc-1
spec:
  storageClassName: cdk-cinder
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: my-pod-1
spec:
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc-1
  containers:
  - image: nginx
    name: nginx
    volumeMounts:
    - name: my-volume
      mountPath: /opt/volumes/my-volume
EOF

[Actual result]

attachdetach-controller AttachVolume.Attach failed for volume "pvc-e77ec6f7-75b3-4d4e-b9c8-7d48e7d8ca46" : Attach timeout for volume bfcab68e-ea62-4328-8f65-368175d668a2

$ kubectl describe pod

...
Events:
  Type     Reason              Age    From                     Message
  ----     ------              ----   ----                     -------
  Normal   Scheduled           2m18s  default-scheduler        Successfully assigned default/my-pod-1 to juju-0998b2-k8s-on-openstack-1
  Warning  FailedAttachVolume  18s    attachdetach-controller  AttachVolume.Attach failed for volume "pvc-e77ec6f7-75b3-4d4e-b9c8-7d48e7d8ca46" : Attach timeout for volume bfcab68e-ea62-4328-8f65-368175d668a2
  Warning  FailedMount         15s    kubelet                  Unable to attach or mount volumes: unmounted volumes=[my-volume], unattached volumes=[kube-api-access-zxvhf my-volume]: timed out waiting for the condition

$ openstack volume list --format yaml
- Attached to: []
  ID: bfcab68e-ea62-4328-8f65-368175d668a2
  Name: pvc-e77ec6f7-75b3-4d4e-b9c8-7d48e7d8ca46
  Size: 1
  Status: available

[Expected result - v1.21.5]

AttachVolume.Attach succeeded for volume

$ kubectl describe pod

...
Events:
  Type    Reason                  Age  From                     Message
  ----    ------                  ---- ----                     -------
  Normal  Scheduled               55s  default-scheduler        Successfully assigned default/my-pod-1 to juju-d1d7f0-k8s-on-openstack-1
  Normal  SuccessfulAttachVolume  47s  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-ead5e7e8-11ac-490a-b126-423960457cac"
  Normal  Pulling                 38s  kubelet                  Pulling image "nginx"
  Normal  Pulled                  30s  kubelet                  Successfully pulled image "nginx" in 7.910156447s
  Normal  Created                 29s  kubelet                  Created container nginx
  Normal  Started                 29s  kubelet                  Started container nginx

$ openstack volume list --format yaml
- Attached to:
  - attached_at: '2021-09-28T02:02:45.000000'
    attachment_id: 9f44a7ea-bfaf-433a-b927-4ad176f9e108
    device: /dev/vdb
    host_name: vast-goose.maas
    id: cba11417-b18a-4d01-9f02-6f13004724c6
    server_id: f7c17617-6340-41ab-8c80-a508a5f3bfab
    volume_id: cba11417-b18a-4d01-9f02-6f13004724c6
  ID: cba11417-b18a-4d01-9f02-6f13004724c6
  Name: pvc-ead5e7e8-11ac-490a-b126-423960457cac
  Size: 1
  Status: in-use
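
To compare the failing and working cases at a glance, the `openstack volume list --format yaml` output can be reduced to name/status pairs. This is only a sketch: `volume_statuses` is a hypothetical helper name, and it assumes the YAML layout shown above, where the top-level keys are capitalised (`Name:`, `Status:`) and the nested attachment keys are lowercase.

```shell
# Hypothetical helper: read `openstack volume list --format yaml` on stdin
# and print "<volume name> <status>" for each volume.
# Assumes Name: appears before Status: within each entry, as in the output above.
volume_statuses() {
    awk '
        $1 == "Name:"   { name = $2 }          # remember the current volume name
        $1 == "Status:" { print name, $2 }     # emit the pair when the status line arrives
    '
}
```

Piping the listing through it (`openstack volume list --format yaml | volume_statuses`) should show "available" for the PV on v1.22.2 and "in-use" on v1.21.5.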

Nobuto Murata (nobuto) wrote :

Attaching ~field-high. The immediate workaround for customer deployments is to downgrade to 1.21, but that is not sustainable.

Nobuto Murata (nobuto) wrote :

$ kubectl logs -n kube-system csi-cinder-controllerplugin-0 csi-attacher | head -n 20
I0928 03:09:05.787036 1 main.go:91] Version: v2.2.1-0-gac961e8c
I0928 03:09:05.788599 1 connection.go:153] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:09:15.788836 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:09:25.788725 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:09:35.788815 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:09:45.789166 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:09:55.788879 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:10:05.788820 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0928 03:10:15.788984 1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
I0928 03:10:16.821695 1 common.go:111] Probing CSI driver for readiness
W0928 03:10:16.831477 1 metrics.go:146] metrics endpoint will not be started because `metrics-address` was not specified.
I0928 03:10:16.835145 1 controller.go:121] Starting CSI attacher
E0928 03:10:16.867895 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.VolumeAttachment: the server could not find the requested resource
E0928 03:10:16.868403 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.CSINode: the server could not find the requested resource
E0928 03:10:17.871894 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.VolumeAttachment: the server could not find the requested resource
E0928 03:10:17.872426 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.CSINode: the server could not find the requested resource
E0928 03:10:18.880122 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.VolumeAttachment: the server could not find the requested resource
E0928 03:10:18.880340 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.CSINode: the server could not find the requested resource
E0928 03:10:19.886758 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.VolumeAttachment: the server could not find the requested resource
E0928 03:10:19.887513 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.CSINode: the server could not find the requested resource
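
The repeated reflector errors can be condensed into a count per API kind, which makes it easier to see exactly which resources the attacher cannot list. A minimal sketch follows; `missing_kinds` is a hypothetical helper name and it assumes the "Failed to list *<version>.<Kind>" wording seen in the log above.

```shell
# Hypothetical helper: read csi-attacher log text on stdin and print
# "<count> <version>.<Kind>" for every kind the informers failed to list.
missing_kinds() {
    grep -o 'Failed to list \*[A-Za-z0-9]*\.[A-Za-z]*' \
        | sed 's/^Failed to list \*//' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }'    # normalise the padding added by uniq -c
}
```

It is meant to be fed from `kubectl logs -n kube-system csi-cinder-controllerplugin-0 csi-attacher`. For the excerpt above it should report only v1beta1.CSINode and v1beta1.VolumeAttachment.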

Nobuto Murata (nobuto) wrote :

$ kubectl api-resources | grep storage
csidrivers                  storage.k8s.io/v1       false  CSIDriver
csinodes                    storage.k8s.io/v1       false  CSINode
csistoragecapacities        storage.k8s.io/v1beta1  true   CSIStorageCapacity
storageclasses        sc    storage.k8s.io/v1       false  StorageClass
volumeattachments           storage.k8s.io/v1       false  VolumeAttachment
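
Read together with the csi-attacher log in the previous comment: the attacher asks for v1beta1 VolumeAttachment and CSINode, while this API server lists both kinds only under storage.k8s.io/v1, which would explain why the attach request is never processed. The lookup can be scripted roughly as below; `served_version` is a hypothetical helper name, and it relies on the kind being the last column and the API version the third from last in `kubectl api-resources` output.

```shell
# Hypothetical helper: read `kubectl api-resources` output on stdin and print
# the group/version under which the given kind is served.
# Counting columns from the right tolerates the optional SHORTNAMES column.
served_version() {
    awk -v kind="$1" '$NF == kind { print $(NF - 2) }'
}
```

For example, `kubectl api-resources | served_version VolumeAttachment` prints storage.k8s.io/v1 for the listing above.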

Nobuto Murata (nobuto) wrote :

$ kubectl describe -n kube-system pod csi-cinder-controllerplugin | grep Image:
    Image: rocks.canonical.com:443/cdk/sig-storage/csi-attacher:v2.2.1
    Image: rocks.canonical.com:443/cdk/sig-storage/csi-provisioner:v1.6.1
    Image: rocks.canonical.com:443/cdk/sig-storage/csi-snapshotter:v2.1.3
    Image: rocks.canonical.com:443/cdk/sig-storage/csi-resizer:v0.5.1
    Image: rocks.canonical.com:443/cdk/sig-storage/livenessprobe:v2.1.0
    Image: rocks.canonical.com:443/cdk/k8scloudprovider/cinder-csi-plugin:v1.20.0

George Kraft (cynerva)
Changed in cdk-addons:
importance: Undecided → High
no longer affects: charmed-kubernetes-bundles
Changed in cdk-addons:
milestone: none → 1.22+ck1
status: New → Triaged
Cory Johns (johnsca) wrote :
Changed in cdk-addons:
assignee: nobody → Cory Johns (johnsca)
status: Triaged → In Progress
tags: added: backport-needed review-needed
George Kraft (cynerva)
Changed in cdk-addons:
status: In Progress → Fix Committed
tags: removed: review-needed
Cory Johns (johnsca) wrote :
tags: removed: backport-needed
George Kraft (cynerva)
Changed in cdk-addons:
status: Fix Committed → Fix Released