pods do not get restarted in an AIO-DX system
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | Medium | Douglas Henrique Koerich | |
Bug Description
Brief Description
-----------------
Pods that are in a k8s deployment, daemonset, etc can be labeled as restart-
In an AIO-DX system, however, the restart can fail to occur if no node selector has been set, because the query for labeled pods depends on a field selector specifying the host the recovery script is running on.
If the script looks for labeled pods before the pod has been scheduled on the node the script is running on, the query does not return the pod and the pod is not restarted.
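The timing problem described above can be illustrated with a small in-memory model. This is only a sketch and not the recovery script itself: the `Pod` class, the `labeled_pods_on_host` helper, the host name `controller-0`, and the label key `example-restart-label` are all hypothetical (the report truncates the real label name), and the field selector is assumed to match on the pod's node name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pod:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    node_name: Optional[str] = None  # stays None until the scheduler binds the pod


def labeled_pods_on_host(pods: List[Pod], label_key: str, host: str) -> List[Pod]:
    """Mimic a label selector combined with a node-name field selector."""
    return [p for p in pods if label_key in p.labels and p.node_name == host]


# Hypothetical placeholder: the report truncates the real label name.
LABEL = "example-restart-label"
HOST = "controller-0"  # assumed host name, for illustration only

pod = Pod(name="my-daemonset-pod", labels={LABEL: "true"})

# The query runs before scheduling: the pod has no node yet, so it is missed.
before = labeled_pods_on_host([pod], LABEL, HOST)

# Once the pod is bound to the node, the same query finds it.
pod.node_name = HOST
after = labeled_pods_on_host([pod], LABEL, HOST)

print([p.name for p in before])
print([p.name for p in after])
```

A query that runs only once, at the wrong moment, never sees the pod, which is consistent with the intermittent reproducibility reported below.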
Severity
--------
Minor: System/Feature is usable with minor issue
Steps to Reproduce
------------------
- As part of a daemonset, label a pod with restart-
- Ensure the pod cannot be scheduled on the other AIO-DX node (label, taint, etc)
- Reboot the node the pod is scheduled on and observe the k8s-pod-recovery logs in /var/log/daemon.log
- Observe no log specifying the pod has been recovered
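The log check in the last two steps could be scripted roughly as follows. This is a sketch under assumptions: it supposes that the recovery script's log lines contain the string `k8s-pod-recovery` and the pod name, which the report does not confirm, and the helper names `recovery_lines` and `check_log` are hypothetical.

```python
from typing import Iterable, List


def recovery_lines(log_lines: Iterable[str], pod_name: str,
                   tag: str = "k8s-pod-recovery") -> List[str]:
    """Return the log lines that mention both the script tag and the pod name."""
    return [ln.rstrip("\n") for ln in log_lines if tag in ln and pod_name in ln]


def check_log(path: str, pod_name: str) -> List[str]:
    """Scan a log file; an empty result suggests the pod was never handled."""
    with open(path, errors="replace") as fh:
        return recovery_lines(fh, pod_name)
```

For example, `check_log("/var/log/daemon.log", "my-daemonset-pod")` returning an empty list after the reboot would match the behavior described in this report.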
Expected Behavior
------------------
The pod should be recovered by the script
Actual Behavior
----------------
The pod may not be recovered by the script
Reproducibility
---------------
50/50
System Configuration
-------
AIO-DX
Branch/Pull Time/Commit
-------
master 2020-10-20
Test Activity
-------------
Developer Testing
Workaround
----------
Use an init container for the pod in question with a few second delay
or
restart the pod manually
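The first workaround can be sketched as a pod spec fragment built in Python and printed as JSON. This is an illustration only: the helper `with_delay_init_container`, the init container name `startup-delay`, the `busybox` image, the example application image, and the five second delay are assumptions, not values taken from the report.

```python
import json


def with_delay_init_container(pod_spec: dict, seconds: int = 5) -> dict:
    """Return a copy of pod_spec with an extra init container that sleeps."""
    delayed = dict(pod_spec)
    init = {
        "name": "startup-delay",             # hypothetical name
        "image": "busybox",                  # assumed image; any image providing `sleep` would do
        "command": ["sleep", str(seconds)],  # delay before the main containers start
    }
    # Build a new list so the caller's spec is left unmodified.
    delayed["initContainers"] = list(pod_spec.get("initContainers", [])) + [init]
    return delayed


# Hypothetical pod spec used only to show the resulting structure.
spec = {"containers": [{"name": "app", "image": "example/app:latest"}]}
print(json.dumps(with_delay_init_container(spec, 5), indent=2))
```

The resulting `initContainers` entry would go under the pod template `spec` of the daemonset or deployment in question.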
tags: | added: stx.networking |
Changed in starlingx: | |
importance: | Undecided → Medium |
status: | New → Triaged |
tags: | added: stx.5.0 |
Changed in starlingx: | |
assignee: | nobody → Cole Walker (cwalops) |
Changed in starlingx: | |
assignee: | Cole Walker (cwalops) → Douglas Henrique Koerich (dkoerich-wr) |
Changed in starlingx: | |
status: | Triaged → In Progress |
Changed in starlingx: | |
status: | In Progress → Fix Released |
Ref: https://bugs.launchpad.net/starlingx/+bug/1896631