'hw:cpu_thread_policy=prefer' misbehaviour

Bug #1578155 reported by Ricardo Noriega
This bug affects 4 people
Affects                    Status         Importance  Assigned to        Milestone
OpenStack Compute (nova)   Fix Released   Medium      Stephen Finucane
  Newton                   Fix Committed  Medium      Stephen Finucane

Bug Description

Description
===========

'hw:cpu_thread_policy=prefer' correctly allocates vCPUs in pairs of sibling threads. For an odd number of vCPUs it allocates pairs plus one single thread, and that single thread should not be isolated. With 20 available threads it should therefore be possible to allocate 4 VMs of 5 vCPUs each, but booting the third VM fails with an error.
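
As a rough sanity check, the arithmetic behind this claim (illustration only, not Nova code) is:

    # With 'prefer', an odd vCPU count should land on full sibling pairs plus
    # one leftover thread, and that leftover must not be isolated.
    threads_available = 20       # usable host threads in this setup
    vcpus_per_vm = 5             # the 'pinning' flavor below
    full_pairs, leftover = divmod(vcpus_per_vm, 2)   # -> 2 pairs, 1 single
    max_vms = threads_available // vcpus_per_vm      # -> 4 VMs should fit
    print(full_pairs, leftover, max_vms)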

Steps to reproduce
==================

1.- Creating a flavor:

nova flavor-create pinning auto 1024 10 5
nova flavor-key pinning set hw:cpu_policy=dedicated
nova flavor-key pinning set hw:cpu_thread_policy=prefer
nova flavor-key pinning set hw:numa_nodes=1

2.- Booting up simple VMs:

nova boot testPin1 --flavor pinning --image cirros --nic net-id=$NET_ID

In my setup, I have 20 available threads:

  NUMANode L#0 (P#0 32GB)
    Socket L#0 + L3 L#0 (15MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#12)
      L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#2)
        PU L#3 (P#14)
      L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#4)
        PU L#5 (P#16)
      L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#6)
        PU L#7 (P#18)
      L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#8)
        PU L#9 (P#20)
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#10)
        PU L#11 (P#22)

  NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (15MB)
    L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
      PU L#12 (P#1)
      PU L#13 (P#13)
    L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
      PU L#14 (P#3)
      PU L#15 (P#15)
    L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
      PU L#16 (P#5)
      PU L#17 (P#17)
    L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
      PU L#18 (P#7)
      PU L#19 (P#19)
    L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
      PU L#20 (P#9)
      PU L#21 (P#21)
    L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
      PU L#22 (P#11)
      PU L#23 (P#23)
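
For readability, the listing above can be restated as hyperthread sibling pairs per NUMA node (a hand-written helper for illustration, not the output of any Nova tool):

    # Physical CPU IDs (P#) copied from the lstopo listing above; each tuple
    # is a pair of hyperthread siblings sharing one core.
    SIBLINGS = {
        0: [(0, 12), (2, 14), (4, 16), (6, 18), (8, 20), (10, 22)],  # node 0
        1: [(1, 13), (3, 15), (5, 17), (7, 19), (9, 21), (11, 23)],  # node 1
    }

    def sibling_of(cpu):
        """Return the hyperthread sibling of a physical CPU ID."""
        for pairs in SIBLINGS.values():
            for a, b in pairs:
                if cpu in (a, b):
                    return b if cpu == a else a
        raise ValueError("unknown CPU %d" % cpu)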

Using cpu_thread_policy=prefer, the behaviour is OK for the first two VMs: each VM's 5 vCPUs are allocated as two sibling pairs plus a single thread.

[root@nfvsdn-04 ~(keystone_admin)]# virsh vcpupin 2
VCPU: CPU Affinity
----------------------------------
   0: 10
   1: 22
   2: 16
   3: 4
   4: 8

[root@nfvsdn-04 ~(keystone_admin)]# virsh vcpupin 3
VCPU: CPU Affinity
----------------------------------
   0: 17
   1: 5
   2: 3
   3: 15
   4: 11
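
Grouping those affinities with the sibling map sketched after the lstopo listing confirms the pairing (again, illustration only):

    # Group each VM's pinned CPUs from the virsh output above into sibling
    # pairs, reusing SIBLINGS/sibling_of from the earlier sketch.
    def pairs_and_leftovers(cpus):
        cpus, pairs, single = set(cpus), [], []
        while cpus:
            cpu = cpus.pop()
            sib = sibling_of(cpu)
            if sib in cpus:
                cpus.remove(sib)
                pairs.append((cpu, sib))
            else:
                single.append(cpu)
        return pairs, single

    print(pairs_and_leftovers([10, 22, 16, 4, 8]))   # VM 2: two pairs + CPU 8
    print(pairs_and_leftovers([17, 5, 3, 15, 11]))   # VM 3: two pairs + CPU 11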

However, even though there are enough threads to allocate another two VMs with the same flavor, I get the following error when booting the third VM:

INFO nova.filters Filtering removed all hosts for the request with instance ID 'cbb53e29-a7da-4c14-a3ad-4fb3aa04f101'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'RamFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 1)', 'ComputeCapabilitiesFilter: (start: 1, end: 1)', 'ImagePropertiesFilter: (start: 1, end: 1)', 'CoreFilter: (start: 1, end: 1)', 'NUMATopologyFilter: (start: 1, end: 0)']

There should be enough space for four VMs with the cpu_thread_policy=prefer flavor.

Expected result
===============

To have 4 VMs up and running with the 'pinning' flavor.

Actual result
=============

The third VM fails at the scheduling stage.

Environment
===========

All-in-one environment.

Tags: numa
tags: added: numa
removed: cpu prefer thread
Changed in nova:
assignee: nobody → Vladik Romanovsky (vladik-romanovsky)
Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Artom Lifshitz (<email address hidden>) on branch: master
Review: https://review.openstack.org/344992
Reason: You got there first and have more reviews :)

tags: added: newton-rc-potential
melanie witt (melwitt)
Changed in nova:
importance: Undecided → Medium
Revision history for this message
Matt Riedemann (mriedem) wrote :

Was this a regression in Newton, or does it prevent upgrades to Newton? Otherwise it shouldn't be tagged with newton-rc-potential.

tags: removed: newton-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/342709
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8361d8d6c315bb3ae71c3ff0147f7d5156bc46f3
Submitter: Jenkins
Branch: master

commit 8361d8d6c315bb3ae71c3ff0147f7d5156bc46f3
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 19 14:01:53 2016 -0700

    Allow linear packing of cores

    Given the following single-socket, four-core, HT-enabled CPU topology:

        +---+---+ +---+---+ +---+---+ +---+---+
        | x | x | | x |   | | x |   | |   |   |
        +---+---+ +---+---+ +---+---+ +---+---+
          1   4     2   5     3   6     4   7

    Attempting to boot an instance with four cores and no explicit
    'cpu_thread_policy' should be successful, with cores 5,6,4,7 used.
    However, the current implementation of this implicit policy attempts to
    fit the same number of instance cores onto each host CPU. For example,
    a four core instance would result in either a 2*2 layout (two instance
    cores on each of two host CPUs), or a 1*4 layout (one instance core on
    each of four host CPUs). This may be correct behavior *where possible*,
    but if this is not possible then any and all cores should be used.

    Resolve this issue by adding a fallthrough case, whereby if the
    standard fitting policy fails, a linear assignment is used to properly
    fit the instance cores.

    Change-Id: I73f7f771b7514060f1f74066e3dea1da8fe74c21
    Closes-Bug: #1578155
    mitaka-backport-potential

Changed in nova:
status: In Progress → Fix Released
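
The fallthrough described in the commit message can be paraphrased as a toy sketch (illustration only, not the actual Nova implementation):

    # First try to place the same number of instance threads on every host
    # core (2 per core, then 1 per core); if no even layout fits, fall back
    # to a linear assignment over whatever free threads remain.
    def pack(free_per_core, want):
        for per_core in (2, 1):
            cores = [c for c, free in free_per_core.items()
                     if len(free) >= per_core]
            if len(cores) * per_core >= want:
                picked = []
                for c in cores:
                    picked.extend(free_per_core[c][:per_core])
                return picked[:want]
        # Fallthrough added by the fix: use any and all free threads.
        flat = [t for free in free_per_core.values() for t in free]
        return flat[:want] if len(flat) >= want else None

    # Commit-message example: two cores have one free thread each (5 and 6),
    # the last core has both threads (4 and 7) free. Neither 2-per-core nor
    # 1-per-core fits four threads, so the fallthrough returns [5, 6, 4, 7].
    print(pack({2: [5], 3: [6], 4: [4, 7]}, 4))
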
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/373889

Matt Riedemann (mriedem)
Changed in nova:
assignee: Vladik Romanovsky (vladik-romanovsky) → Stephen Finucane (stephenfinucane)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/373889
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2dcf8c22f845de6b4ae12ae3d75f89041b839e57
Submitter: Jenkins
Branch: stable/newton

commit 2dcf8c22f845de6b4ae12ae3d75f89041b839e57
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 19 14:01:53 2016 -0700

    Allow linear packing of cores

    Given the following single-socket, four-core, HT-enabled CPU topology:

        +---+---+ +---+---+ +---+---+ +---+---+
        | x | x | | x |   | | x |   | |   |   |
        +---+---+ +---+---+ +---+---+ +---+---+
          1   4     2   5     3   6     4   7

    Attempting to boot an instance with four cores and no explicit
    'cpu_thread_policy' should be successful, with cores 5,6,4,7 used.
    However, the current implementation of this implicit policy attempts to
    fit the same number of instance cores onto each host CPU. For example,
    a four core instance would result in either a 2*2 layout (two instance
    cores on each of two host CPUs), or a 1*4 layout (one instance core on
    each of four host CPUs). This may be correct behavior *where possible*,
    but if this is not possible then any and all cores should be used.

    Resolve this issue by adding a fallthrough case, whereby if the
    standard fitting policy fails, a linear assignment is used to properly
    fit the instance cores.

    Change-Id: I73f7f771b7514060f1f74066e3dea1da8fe74c21
    Closes-Bug: #1578155
    (cherry picked from commit 8361d8d6c315bb3ae71c3ff0147f7d5156bc46f3)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.1

This issue was fixed in the openstack/nova 14.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b1

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/427119

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/mitaka)

Change abandoned by Stephen Finucane (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/427119
