Comment 0 for bug 1928018

Angie Wang (angiewang) wrote :

Brief Description
-----------------
After a reboot or a lock/unlock of an AIO-SX system, the Armada pod gets stuck in the Unknown state and does not recover.

This is the same issue as the following bugs, but here it impacts the Armada pod:
https://bugs.launchpad.net/starlingx/+bug/1874858
https://bugs.launchpad.net/starlingx/+bug/1893977

Severity
--------
Medium

Steps to Reproduce
------------------
Apply the stx-openstack application to an AIO-SX system
system host-lock controller-0
system host-unlock controller-0

Expected Behavior
------------------
All pods should recover and be in a ready/running state shortly after the controller recovers.

Actual Behavior
----------------
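The Armada pod (armada-api) stays in the Unknown state with 0/2 containers ready. The kubelet repeatedly fails to mount the "armada-etc" configmap volume (see the FailedMount events below).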

Reproducibility
---------------
Intermittent - seen rarely

System Configuration
--------------------
AIO-SX

Branch/Pull Time/Commit
-----------------------
stx master

Timestamp/Logs
--------------
[2021-04-21 19:50:21,796] 314 DEBUG MainThread ssh.send :: Send 'kubectl get pod --all-namespaces --field-selector=status.phase=Running -o=wide | grep --color=never -v -E '([0-9])+/\1''
[2021-04-21 19:50:22,133] 436 DEBUG MainThread ssh.expect :: Output:
NAMESPACE   NAME                          READY   STATUS    RESTARTS   AGE   IP       NODE           NOMINATED NODE   READINESS GATES
armada      armada-api-84f66996f6-ztjmv   0/2     Unknown   0          8h    <none>   controller-0   <none>           <none>

  Warning FailedMount 105m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-etc armada-api-token-g846b pod-tmp pod-etc-armada]: timed out waiting for the condition
  Warning FailedMount 103m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-etc-armada armada-etc armada-api-token-g846b pod-tmp]: timed out waiting for the condition
  Warning FailedMount 97m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-tmp pod-etc-armada armada-etc armada-api-token-g846b]: timed out waiting for the condition
  Warning FailedMount 37m (x22 over 101m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-api-token-g846b pod-tmp pod-etc-armada armada-etc]: timed out waiting for the condition
  Warning FailedMount 32m (x43 over 108m) kubelet, controller-0 MountVolume.SetUp failed for volume "armada-etc" : stat /var/lib/kubelet/pods/10faba32-eea1-4af5-91fa-7ce8072f7114/volumes/kubernetes.io~configmap/armada-etc: no such file or directory
  Warning FailedMount 18m kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-tmp pod-etc-armada armada-etc armada-api-token-g846b]: timed out waiting for the condition
  Warning FailedMount 8m11s (x3 over 14m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[pod-etc-armada armada-etc armada-api-token-g846b pod-tmp]: timed out waiting for the condition
  Warning FailedMount 4m4s (x3 over 16m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-etc armada-api-token-g846b pod-tmp pod-etc-armada]: timed out waiting for the condition
  Warning FailedMount 2m (x3 over 20m) kubelet, controller-0 Unable to attach or mount volumes: unmounted volumes=[armada-etc], unattached volumes=[armada-api-token-g846b pod-tmp pod-etc-armada armada-etc]: timed out waiting for the condition
  Warning FailedMount 103s (x18 over 22m) kubelet, controller-0 MountVolume.SetUp failed for volume "armada-etc" : stat /var/lib/kubelet/pods/10faba32-eea1-4af5-91fa-7ce8072f7114/volumes/kubernetes.io~configmap/armada-etc: no such file or directory
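
The check the test framework runs above (list pods, then drop every row whose READY column shows matching counts) can be sketched as a small shell helper. This is only an illustration: the function name not_ready_pods is hypothetical, and it assumes the default column layout of `kubectl get pod --all-namespaces` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper (not part of StarlingX or the test framework).
# Reads `kubectl get pod --all-namespaces` output on stdin and prints
# "namespace name ready status" for every pod whose READY column
# (for example 0/2) shows fewer ready containers than expected.
not_ready_pods() {
  awk 'NR > 1 {                       # skip the header row
         split($3, ready, "/")        # READY column: ready/total
         if (ready[1] != ready[2])    # counts differ: pod is not fully ready
           print $1, $2, $3, $4
       }'
}
```

Possible usage, with the same selector as in the log above: `kubectl get pod --all-namespaces --field-selector=status.phase=Running -o wide | not_ready_pods`. Comparing the two counts directly avoids relying on a regex backreference.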

Test Activity
-------------
Sanity

Workaround
----------
Delete the pod that is stuck in the Unknown state
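
A minimal sketch of the workaround as a shell helper. The function name delete_unknown_pods and the KUBECTL override variable are hypothetical, introduced here only for illustration. It assumes the default column layout of `kubectl get pod -n <namespace> --no-headers` (NAME, READY, STATUS, ...), and that the deleted pod is recreated by its controller, which the armada-api pod name suggests but this report does not confirm.

```shell
# KUBECTL can be overridden (hypothetical convenience, e.g. for a dry run).
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: delete every pod in the given namespace whose
# STATUS column reads "Unknown".
delete_unknown_pods() {
  target_ns="$1"
  "$KUBECTL" get pod -n "$target_ns" --no-headers |
    awk '$3 == "Unknown" { print $1 }' |      # keep only pods in Unknown status
    while read -r pod; do
      "$KUBECTL" delete pod -n "$target_ns" "$pod"
    done
}
```

For the pod in this report, `delete_unknown_pods armada` would be equivalent to running `kubectl delete pod -n armada armada-api-84f66996f6-ztjmv` by hand.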