RPC receivers may starve periodic tasks

Bug #1308680 reported by Jim Rollenhagen
Affects: Ironic
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

When issuing many RPC tasks to the conductor, the heartbeat task (and probably other periodic tasks) runs much slower than intended.

In my testing, I am powering off many nodes in a loop:
    for node in nodes:
        power_off_node_via_api(node)

My conductor is set to heartbeat every 10 seconds, and does so until I run this loop. While running this loop, the heartbeat slows down to 3-5 *minutes*.

Theory is that the RPC receiver takes priority and does not yield to the periodic tasks.
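
The starvation theory can be illustrated with a toy sketch, using plain OS threads via `concurrent.futures` as a stand-in for the eventlet greenthreads and the oslo RPC worker pool (names here are illustrative, not Ironic code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2  # stands in for a small rpc_thread_pool_size

def blocking_rpc_task():
    # Stands in for a power-off handler that blocks on the DB or the BMC.
    time.sleep(0.5)

pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
start = time.monotonic()

# Fill every worker in the pool with blocking RPC work...
for _ in range(POOL_SIZE):
    pool.submit(blocking_rpc_task)

# ...then queue the "heartbeat": it cannot start until a worker frees up.
delay = pool.submit(time.monotonic).result() - start
print(f"heartbeat delayed by {delay:.2f}s")  # roughly 0.5s, not ~0s
pool.shutdown()
```

With enough queued RPC work, the heartbeat's delay grows from one task duration to the whole backlog, which matches the observed 3-5 minute heartbeat.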

aeva black (tenbrae)
Changed in ironic:
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
aeva black (tenbrae) wrote :

Reproduced locally with the "fake" driver. See
  https://review.openstack.org/#/c/88076/

--------------------

This leverages the blocking nature of the mysql connection
to starve the RPC worker pool.

Steps to reproduce in a short amount of time:
- set the following in your config file:
  heartbeat_interval = 5
  heartbeat_timeout = 10
  rpc_thread_pool_size = 5
  rpc_conn_pool_size = 5
- register 6 nodes with the fake driver
- request 5 nodes to be power cycled
- watch the output of 'ironic driver-list'; after about 6 seconds, it will
  be empty
- request the 6th node to be power cycled
- observe error

Revision history for this message
aeva black (tenbrae) wrote :

On further thought, my demonstration is overly complex. All that is needed is for a single DB query to block longer than heartbeat_timeout seconds, and the problem (conductor appears to be offline) manifests.

Revision history for this message
aeva black (tenbrae) wrote :

further discussion and testing:
- use utils.execute(*['sleep', '10']) for testing, instead of DB query, to block a greenthread but not the whole process
- able to duplicate with settings otherwise the same

- problem is solved with rpc_thread_pool_size small (e.g. 4) and conductor.worker_pool large (e.g. 64).

Suggested fix:
- add new CONF option for conductor.worker_pool_size
- change default rpc_thread_pool_size
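
A rough sketch of that suggested split, with two `concurrent.futures` pools standing in for the RPC pool and the proposed conductor worker pool (all names hypothetical):

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Small RPC pool: handlers must return fast.
rpc_pool = ThreadPoolExecutor(max_workers=4)
# Large conductor worker pool: absorbs the slow, blocking work.
worker_pool = ThreadPoolExecutor(max_workers=64)

def slow_power_action():
    time.sleep(0.5)  # stands in for a blocking power-state change

def rpc_handler():
    # The RPC thread only dispatches; it frees up almost immediately.
    worker_pool.submit(slow_power_action)

start = time.monotonic()
for _ in range(10):
    rpc_pool.submit(rpc_handler)

# The "heartbeat" still gets an RPC worker promptly.
delay = rpc_pool.submit(time.monotonic).result() - start
print(f"heartbeat delayed by {delay:.3f}s")
worker_pool.shutdown(wait=False)
rpc_pool.shutdown(wait=False)
```

Because the RPC handlers no longer hold their workers for the duration of the slow action, the heartbeat is dispatched within milliseconds even while ten power actions are in flight.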

Changed in ironic:
assignee: nobody → Jim Rollenhagen (jim-rollenhagen)
Revision history for this message
aeva black (tenbrae) wrote :

partial fix implemented (apparently unintentionally) by
  https://review.openstack.org/#/c/88307/

Revision history for this message
Openstack Gerrit (openstack-gerrit) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/89704

Changed in ironic:
status: Triaged → In Progress
Revision history for this message
aeva black (tenbrae) wrote :

Changing from Critical -> High because there is a workaround: just lower the rpc_thread_pool_size config option from its default of 64 to e.g. 4.

Changed in ironic:
importance: Critical → High
Revision history for this message
Matt Wagner (matt-wagner) wrote :

I just ran into this. It took me a bit to figure out what was happening.

I did a basic devstack+Ironic install and registered 80 VMs (pxe+ssh) with Ironic. I hit problems immediately, even before I tried to deploy. It looks like polling the status of 80 VMs takes long enough that the heartbeat gets missed and is updated only every ~6 minutes.

It looks like the updates happen serially, not in parallel. This is checking 80 VMs; I can only imagine what would happen if you tried managing hundreds of servers and mixed in some IPMI timeouts or the like.
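
The serial-vs-parallel difference described above can be sketched like this (`poll_power_state` is a made-up stand-in for the pxe+ssh status check):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def poll_power_state(vm):
    time.sleep(0.02)  # stands in for one ssh/IPMI round trip
    return (vm, "power off")

vms = range(20)

t0 = time.monotonic()
serial = [poll_power_state(v) for v in vms]  # 20 round trips, back to back
serial_time = time.monotonic() - t0

t0 = time.monotonic()
with ThreadPoolExecutor(max_workers=10) as pool:
    parallel = list(pool.map(poll_power_state, vms))  # 10 at a time
parallel_time = time.monotonic() - t0

print(f"serial {serial_time:.2f}s, parallel {parallel_time:.2f}s")
```

At 80 nodes with real IPMI/ssh latencies, the serial version's total easily exceeds the heartbeat interval, which is exactly the missed-heartbeat symptom reported here.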

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/93083

Changed in ironic:
assignee: Jim Rollenhagen (jim-rollenhagen) → Lucas Alvares Gomes (lucasagomes)
Revision history for this message
Robert Collins (lifeless) wrote :

@Matt - if the polling is serial, how is the heartbeat being blocked? I found that setting a low rpc pool size made real hardware super unreliable. See bug 1311401

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/93083
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=805ee6bd1621c32325f66317cd01733eaf68d9d6
Submitter: Jenkins
Branch: master

commit 805ee6bd1621c32325f66317cd01733eaf68d9d6
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri May 9 17:05:08 2014 +0100

    Run keepalive in a dedicated thread

    Periodic tasks are not concurrent; they run one after the other in
    a single greenthread, so if a periodic task takes a long time to run
    (e.g. syncing the power state of the devices) the conductor will miss the
    keepalive and will be considered dead even though it's not. This commit
    moves the keepalive into a dedicated greenthread to avoid this problem.

    Partial-Bug: #1308680
    Change-Id: Ib4159e55a42f268c75f362180e145e637735c16d
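
The idea behind this fix can be sketched with a plain OS thread standing in for the dedicated greenthread (illustrative only, not the actual Ironic code):

```python
import threading
import time

def keepalive_loop(interval, beats, stop):
    # Dedicated thread: beats keep coming no matter what the main loop does.
    while not stop.is_set():
        beats.append(time.monotonic())
        stop.wait(interval)

beats = []
stop = threading.Event()
threading.Thread(target=keepalive_loop,
                 args=(0.1, beats, stop), daemon=True).start()

# Meanwhile a slow "periodic task" blocks the main thread...
time.sleep(0.35)
stop.set()
print(f"{len(beats)} heartbeats recorded during the slow task")
```

Because the keepalive no longer shares a thread with the other periodic tasks, a slow power-state sync can no longer delay it past heartbeat_timeout.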

Revision history for this message
Lucas Alvares Gomes (lucasagomes) wrote :

We have a partial fix in Ironic, but the real fix needs to go into oslo. Please read the comments on https://review.openstack.org/93083

Changed in ironic:
assignee: Lucas Alvares Gomes (lucasagomes) → nobody
Revision history for this message
Dmitry Tantsur (divius) wrote :

Hi Lucas! As this bug is not closed and you're assigned, could you give some update on it?

Revision history for this message
Lucas Alvares Gomes (lucasagomes) wrote :

Yes, the problem is that all the periodic tasks run in a single greenthread, so if one periodic task takes a long time to run (e.g. syncing power states) the periodic task responsible for the heartbeat would be delayed and the conductor would be considered dead.

We landed a partial fix for it [1] which makes the heartbeat run in a dedicated greenthread instead of being a periodic task, but the real fix for this problem should come from oslo, making periodic tasks run in parallel.

[1] https://review.openstack.org/93083

Dmitry Tantsur (divius)
Changed in ironic:
status: In Progress → Triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ironic (master)

Change abandoned by Jim Rollenhagen (<email address hidden>) on branch: master
Review: https://review.openstack.org/89704

Revision history for this message
Robert Collins (lifeless) wrote :

FWIW I encountered something looking just like this today, in VMs

aeva black (tenbrae)
Changed in ironic:
importance: High → Low
milestone: none → kilo-rc1
importance: Low → Medium
Revision history for this message
Jim Rollenhagen (jim-rollenhagen) wrote :

"but the real fix for this problem should come from oslo, making periodic tasks to run in parallel."

This clearly won't happen during RC1 -- we should decide if we want to bump this, or re-use Dmitry's hack that is currently in use for driver-level periodic tasks.

Devananda, thoughts?

Revision history for this message
John Stafford (john-stafford) wrote :

Per Deva: we can do a documented workaround for the K cycle and move the bug to L.

Changed in ironic:
milestone: kilo-rc1 → liberty-1
Revision history for this message
Sam Betts (sambetts) wrote :

Oslo BP for parallel periodic tasks: https://review.openstack.org/#/c/134303/

Revision history for this message
Yuriy Zveryanskyy (yzveryanskyy) wrote :

This looks invalid now that periodic tasks run via futurist, which executes them in parallel.
