cinder scheduler/backup using 100% CPU
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Cinder | Incomplete | Undecided | Unassigned |
openstack-ansible | Fix Released | Undecided | Kevin Carter |
Bug Description
Within master we're seeing cinder-scheduler and cinder-backup processes consuming 100% CPU.
To fix the issue the only thing we've been able to do is restart the processes, though this has to be done manually. Stracing the scheduler process seems to indicate that it's cycling on epoll_ctl(6), but I've not been able to pinpoint why.
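For anyone trying to confirm this locally, a rough diagnostic sketch is below; the pgrep pattern and sampling intervals are my own assumptions rather than anything taken from the gate:

PID="$(pgrep -of cinder-scheduler)"       # assumes the scheduler is findable by its command line
pidstat -u -p "$PID" 5 3                  # confirm the process is pinned at ~100% CPU
strace -f -p "$PID" -e trace=epoll_ctl    # watch the epoll_ctl churn; Ctrl-C to detach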
We are able to recreate the conditions seen in the gate by performing the following actions (a fully spelled-out sketch of these steps follows below):
systemctl stop cinder-
lxc-destroy -fn $CINDER_
openstack-ansible lxc-container-
openstack-ansible os-cinder-
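For reference, a fully spelled-out version of those steps might look like the following; the unit names, container variable and --limit group are assumptions based on a standard openstack-ansible layout, not the exact commands from the gate:

systemctl stop cinder-scheduler cinder-backup                     # assumed unit names
lxc-destroy -fn "$CINDER_CONTAINER_NAME"                          # assumed variable for the cinder container
openstack-ansible lxc-containers-create.yml --limit cinder_all    # assumed playbook name and limit group
openstack-ansible os-cinder-install.yml                           # assumed playbook name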
In monitoring the deployment it looks like the HUP (ansible service reload)[https:/
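One way to confirm that the ansible service reload boils down to a plain SIGHUP is to look at the unit's reload action from inside the cinder container; the unit name here is an assumption:

systemctl show cinder-scheduler -p ExecReload
# An ExecReload along the lines of "/bin/kill -HUP $MAINPID" means the reload
# handler is effectively just HUP-ing the running service.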
description: updated (three separate updates)
Changed in openstack-ansible:
assignee: nobody → Kevin Carter (kevin-carter)
Changed in openstack-ansible:
assignee: Kevin Carter (kevin-carter) → Andy McCrae (andrew-mccrae)
status: New → In Progress
Changed in openstack-ansible:
assignee: Andy McCrae (andrew-mccrae) → Kevin Carter (kevin-carter)
no longer affects: cinder
Changed in cinder:
status: New → Incomplete
Triggered at the point where SIGHUP is called.
https://docs.openstack.org/cinder/latest/upgrade.html#rolling-upgrade-process
My hunch is that there is some loop that is not exiting correctly on the SIGHUP, causing the process to spin hard and consume CPU. Hopefully this should be easy to reproduce.
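If that's the case, a minimal manual trigger along these lines should be enough to reproduce it (the pgrep pattern is an assumption about how to find the scheduler process):

PID="$(pgrep -of cinder-scheduler)"
kill -HUP "$PID"
top -b -n 3 -d 5 -p "$PID"    # the process should climb to ~100% CPU and stay there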