Compute service fails to restart if the vnic_type of a bound port changed from direct to macvtap (CVE-2022-37394)

Bug #1981813 reported by Balazs Gibizer
This bug affects 1 person
Affects                      Status        Importance  Assigned to     Milestone
OpenStack Compute (nova)     Fix Released  Undecided   Balazs Gibizer
OpenStack Security Advisory  In Progress   Undecided   David Wilde

Bug Description

We have a downstream bug report with the following reproduction steps:

1) create a neutron port with vnic_type "direct"
2) create an instance with that port
3) after the instance is created successfully change the vnic_type of the bound port from "direct" to "macvtap". This is accepted by Neutron
4) wait until the nova instance info cache is healed by the periodic task in nova-compute
5) restart the nova-compute service.
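
Steps 1 to 3 above could be scripted roughly as follows. This is a minimal sketch, not part of the original report: `conn` is assumed to be an openstacksdk connection (for example from `openstack.connect()`), the function name `reproduce`, the server name, and the network, image and flavor IDs are hypothetical, and the proxy calls and the `binding_vnic_type` attribute are assumed to match the openstacksdk version in use. Steps 4 and 5 happen on the compute host and are not covered.

```python
def reproduce(conn, network_id, image_id, flavor_id):
    """Run reproduction steps 1-3 against a cloud.

    `conn` is assumed to be an openstacksdk Connection; all IDs are
    placeholders supplied by the caller.
    """
    # 1) create a neutron port with vnic_type "direct"
    port = conn.network.create_port(
        network_id=network_id, binding_vnic_type="direct")

    # 2) create an instance with that port and wait until it is ACTIVE
    server = conn.compute.create_server(
        name="vnic-type-repro",  # hypothetical name
        image_id=image_id,
        flavor_id=flavor_id,
        networks=[{"port": port.id}],
    )
    conn.compute.wait_for_server(server)

    # 3) change the vnic_type of the now-bound port; Neutron accepts this
    conn.network.update_port(port, binding_vnic_type="macvtap")
    return port, server
```

The failure only shows up later, once the instance info cache has been refreshed and nova-compute is restarted on the host.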

Actual behavior
---------------
The nova-compute service fails to start with PciDeviceNotFoundById exception pointing to the PCI address of the VF the port is bound to on the host.

Expected behavior
-----------------
The nova-compute service should start up successfully.

Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service Traceback (most recent call last):
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/pci/utils.py", line 167, in get_ifname_by_pci_address
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service dev_info = os.listdir(dev_path)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service FileNotFoundError: [Errno 2] No such file or directory: '/sys/bus/pci/devices/0000:19:0a.7/net'
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service During handling of the above exception, another exception occurred:
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service Traceback (most recent call last):
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/usr/local/lib/python3.10/site-packages/oslo_service/service.py", line 806, in run_service
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service service.start()
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/service.py", line 159, in start
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service self.manager.init_host()
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 1536, in init_host
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service self._init_instance(context, instance)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 1230, in _init_instance
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service self.driver.plug_vifs(instance, net_info)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1386, in plug_vifs
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service self.vif_driver.plug(instance, vif)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 730, in plug
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service self.plug_hw_veb(instance, vif)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 628, in plug_hw_veb
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service set_vf_interface_vlan(
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/vif.py", line 99, in set_vf_interface_vlan
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service vf_ifname = pci_utils.get_ifname_by_pci_address(pci_addr)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service File "/opt/stack/nova/nova/pci/utils.py", line 170, in get_ifname_by_pci_address
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service raise exception.PciDeviceNotFoundById(id=pci_addr)
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service nova.exception.PciDeviceNotFoundById: PCI device 0000:19:0a.7 not found
Jul 15 06:39:14 dell-r640-020 nova-compute[278453]: ERROR oslo_service.service
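
The two chained exceptions in the traceback can be illustrated with a simplified stand-in for the lookup. This sketch is not nova's actual code: the exception class is a local substitute, and the `sysfs_root` parameter is a hypothetical addition so the lookup can be pointed at a fake directory tree. It assumes, as the traceback suggests, that the interface name is read from the `net` directory under the device's sysfs path.

```python
import os


class PciDeviceNotFoundById(Exception):
    """Local stand-in for nova.exception.PciDeviceNotFoundById."""

    def __init__(self, id):
        super().__init__(f"PCI device {id} not found")


def get_ifname_by_pci_address(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the netdev name the host exposes for a PCI device.

    `sysfs_root` is a hypothetical parameter used only for illustration.
    """
    dev_path = os.path.join(sysfs_root, pci_addr, "net")
    try:
        dev_info = os.listdir(dev_path)
    except FileNotFoundError:
        # No "net" directory: the host has no netdev for this device,
        # which is the case reported for the VF in the traceback.
        raise PciDeviceNotFoundById(id=pci_addr)
    if not dev_info:
        raise PciDeviceNotFoundById(id=pci_addr)
    return dev_info[0]
```

Raising inside the `except` block is what produces the "During handling of the above exception, another exception occurred" line in the log.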

Changed in nova:
assignee: nobody → Balazs Gibizer (balazs-gibizer)
tags: added: neutron pci
tags: added: compute
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/849985

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/850003

Changed in nova:
status: New → In Progress
information type: Public → Public Security
Revision history for this message
Jeremy Stanley (fungi) wrote : Re: Compute service fails to restart if the vnic_type of a bound port changed from direct to macvtap

Sean: Can you elaborate on why you believe this report represents an exploitable security vulnerability? Is it that a malicious user can change the vnic_type of a port under their control and leave a time-bomb for the next time the administrator restarts the compute service, resulting in that compute host being out of service (unable to stop/start running virtual machines) until the problem can be manually rectified?

Revision history for this message
Jeremy Stanley (fungi) wrote :

I caught up with Sean in IRC and he confirmed the situation is basically as I inferred above (exploitable by any normal authenticated user, not just limited to operator level accounts).

Changed in ossa:
status: New → Incomplete
Revision history for this message
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security
reviewers for the affected project or projects confirm the bug and
discuss the scope of any vulnerability along with potential
solutions.

Revision history for this message
David Wilde (dave-wilde) wrote :

Title: Compute service fails to restart if the vnic_type of a bound port changed from direct to macvtap
Reporter: Balazs Gibizer (Red Hat)
Products: Nova
Affects: >=23.0.0

Description:
Balazs Gibizer with Red Hat reported a vulnerability in Nova's restart behavior when a Neutron port's vnic_type is changed from "direct" to "macvtap". By creating a Neutron port with vnic_type "direct", creating an instance bound to that port, and then changing the vnic_type of the bound port to "macvtap", an authenticated user may cause the compute service to fail to restart, resulting in a possible denial of service.
Only Nova deployments configured with SR-IOV are affected.

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

@David: Your summary looks good to me.
@fungi: https://review.opendev.org/c/openstack/nova/+/850003 is proposed as a mitigation of the denial of service. With that patch nova will no longer fail to restart in the reported situation.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks David!

Keep in mind that the title is what will appear in the list of advisories at https://security.openstack.org/ossalist.html and will be combined with the project name, OSSA number and CVE identifier in the subject line of advisories posted to widely-read public mailing lists, so shorter is better as long as it still uniquely captures the situation, sort of like a commit message title. Maybe something along the lines of "Changing vnic_type breaks compute service restart" instead (50 characters).

For the affected versions, we assume that a fix will be backported to all stable series currently in a "managed" state (so Wallaby, Xena and Yoga in this case) and that stable point releases will be tagged to include those. Since the most recent point releases are 23.2.1, 24.1.1 and 25.0.1 we indicate that the next possible release number for each of these is not affected like so: <23.2.2, >=24.0.0 <24.1.2, >=25.0.0 <25.0.2 (if the next point release on stable/yoga ends up being 25.1.0 instead and there is never a 25.0.2 that's fine, since 25.1.0 still falls into an unaffected range strictly speaking).
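
The range notation above can be checked mechanically. The following is a small sketch, assuming plain dotted numeric versions only (it does not handle pre-release strings such as release candidates); the helper names are hypothetical.

```python
def parse_version(text):
    """Turn '24.1.1' into (24, 1, 1) so tuples compare numerically."""
    return tuple(int(part) for part in text.split("."))


# (inclusive lower bound, exclusive upper bound); None means no lower bound.
# Encodes: <23.2.2, >=24.0.0 <24.1.2, >=25.0.0 <25.0.2
AFFECTED_RANGES = [
    (None, "23.2.2"),
    ("24.0.0", "24.1.2"),
    ("25.0.0", "25.0.2"),
]


def is_affected(version):
    """Return True if the version falls inside any affected range."""
    v = parse_version(version)
    for low, high in AFFECTED_RANGES:
        above_low = low is None or v >= parse_version(low)
        if above_low and v < parse_version(high):
            return True
    return False


print(is_affected("25.0.1"))  # latest yoga point release at the time
print(is_affected("25.1.0"))  # a later minor release on stable/yoga
```

With these ranges, 25.0.1 is reported as affected and 25.1.0 is not, which matches the point that a 25.1.0 release would still fall outside the affected range even if 25.0.2 is never tagged.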

Revision history for this message
David Wilde (dave-wilde) wrote :

Thanks for the feedback Jeremy, especially the calculation for the point releases. That was confusing me but your explanation makes perfect sense. Here’s my updated description:

Title: Changing vnic_type breaks compute service restart
Reporter: Balazs Gibizer (Red Hat)
Products: Nova
Affects: <23.2.2, >=24.0.0 <24.1.2, >=25.0.0 <25.0.2

Description:
Balazs Gibizer with Red Hat reported a vulnerability in Nova's restart behavior when a Neutron port's vnic_type is changed from "direct" to "macvtap". By creating a Neutron port with vnic_type "direct", creating an instance bound to that port, and then changing the vnic_type of the bound port to "macvtap", an authenticated user may cause the compute service to fail to restart, resulting in a possible denial of service.
Only Nova deployments configured with SR-IOV are affected.

Revision history for this message
Jeremy Stanley (fungi) wrote :

David's proposed impact description from comment #9 looks perfect to me. If there are no immediate objections, VMT members can proceed in requesting a CVE assignment based on that description while the master branch fix is under review, and then work on assembling an advisory once backports have been pushed.

David Wilde (dave-wilde)
Changed in ossa:
status: Incomplete → In Progress
assignee: nobody → David Wilde (dave-wilde)
David Wilde (dave-wilde)
summary: Compute service fails to restart if the vnic_type of a bound port
- changed from direct to macvtap
+ changed from direct to macvtap (CVE-2022-37394)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/849985
Committed: https://opendev.org/openstack/nova/commit/f8c91eb75fc5504a37fc3b4be1d65d33dbc9b511
Submitter: "Zuul (22348)"
Branch: master

commit f8c91eb75fc5504a37fc3b4be1d65d33dbc9b511
Author: Balazs Gibizer <email address hidden>
Date: Fri Jul 15 12:43:58 2022 +0200

    Reproduce bug 1981813 in func env

    Related-Bug: #1981813
    Change-Id: I9367b7ed475917bdb05eb3f209ce1a4e646534e2

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/850003
Committed: https://opendev.org/openstack/nova/commit/e43bf900dc8ca66578603bed333c56b215b1876e
Submitter: "Zuul (22348)"
Branch: master

commit e43bf900dc8ca66578603bed333c56b215b1876e
Author: Balazs Gibizer <email address hidden>
Date: Fri Jul 15 13:48:46 2022 +0200

    Gracefully ERROR in _init_instance if vnic_type changed

    If the vnic_type of a bound port changes from "direct" to "macvtap" and
    then the compute service is restarted then during _init_instance nova
    tries to plug the vif of the changed port. However as it now has macvtap
    vnic_type nova tries to look up the netdev of the parent VF. Still that
    VF is consumed by the instance so there is no such netdev on the host
    OS. This error killed the compute service at startup due to unhandled
    exception. This patch adds the exception handler, logs an ERROR and
    continue initializing other instances on the host.

    Also this patch adds a detailed ERROR log when nova detects that the
    vnic_type changed during _heal_instance_info_cache periodic.

    Closes-Bug: #1981813
    Change-Id: I1719f8eda04e8d15a3b01f0612977164c4e55e85
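
The behavior described in the commit message can be sketched as below. This is an illustration only, not the merged patch: `init_instances`, `vnic_type_changes`, the `plug_vifs` callable and the dict shape mapping port IDs to vnic_type are all hypothetical, and the exception class is a local stand-in for nova's.

```python
import logging

LOG = logging.getLogger(__name__)


class PciDeviceNotFoundById(Exception):
    """Local stand-in for nova.exception.PciDeviceNotFoundById."""


def init_instances(instances, plug_vifs):
    """Initialize each instance and return the ones whose VIF plug failed.

    `plug_vifs` is a hypothetical callable standing in for the driver call.
    """
    failed = []
    for instance in instances:
        try:
            plug_vifs(instance)
        except PciDeviceNotFoundById as exc:
            # Log and continue instead of letting one instance abort startup.
            LOG.error("Failed to plug VIFs for instance %s: %s", instance, exc)
            failed.append(instance)
    return failed


def vnic_type_changes(cached, current):
    """Return (port_id, old, new) for ports whose vnic_type changed.

    Both arguments map port ID to vnic_type (an assumed shape).
    """
    changes = []
    for port_id, old in cached.items():
        new = current.get(port_id)
        if new is not None and new != old:
            LOG.error("vnic_type of port %s changed from %s to %s",
                      port_id, old, new)
            changes.append((port_id, old, new))
    return changes
```

The key design point is that the exception is handled per instance, so a single port in an unsupported state no longer prevents the remaining instances on the host from being initialized.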

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 26.0.0.0rc1

This issue was fixed in the openstack/nova 26.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/nova/+/859312

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/nova/+/859313

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/859314

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/859315

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/859320

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/859321

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/869583

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/869584

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/ussuri)

Related fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/869585

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/869586

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/869673

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/869674

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/nova/+/859312
Committed: https://opendev.org/openstack/nova/commit/4954f993680c75fd9d3d507f2dcd00300c9b3d44
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 4954f993680c75fd9d3d507f2dcd00300c9b3d44
Author: Balazs Gibizer <email address hidden>
Date: Fri Jul 15 12:43:58 2022 +0200

    Reproduce bug 1981813 in func env

    There stable/yoga only change in test_pci_sriov_servers.py due to
    unittest.mock switch[1] only happened in zed.

    [1] https://review.opendev.org/q/topic:unittest.mock+status:merged+project:openstack/nova

    Related-Bug: #1981813
    Change-Id: I9367b7ed475917bdb05eb3f209ce1a4e646534e2
    (cherry picked from commit f8c91eb75fc5504a37fc3b4be1d65d33dbc9b511)

tags: added: in-stable-yoga
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/nova/+/859313
Committed: https://opendev.org/openstack/nova/commit/a28c82719545d5c8ee7f3ff1361b3a796e05095a
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit a28c82719545d5c8ee7f3ff1361b3a796e05095a
Author: Balazs Gibizer <email address hidden>
Date: Fri Jul 15 13:48:46 2022 +0200

    Gracefully ERROR in _init_instance if vnic_type changed

    If the vnic_type of a bound port changes from "direct" to "macvtap" and
    then the compute service is restarted then during _init_instance nova
    tries to plug the vif of the changed port. However as it now has macvtap
    vnic_type nova tries to look up the netdev of the parent VF. Still that
    VF is consumed by the instance so there is no such netdev on the host
    OS. This error killed the compute service at startup due to unhandled
    exception. This patch adds the exception handler, logs an ERROR and
    continue initializing other instances on the host.

    Also this patch adds a detailed ERROR log when nova detects that the
    vnic_type changed during _heal_instance_info_cache periodic.

    Closes-Bug: #1981813
    Change-Id: I1719f8eda04e8d15a3b01f0612977164c4e55e85
    (cherry picked from commit e43bf900dc8ca66578603bed333c56b215b1876e)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/859314
Committed: https://opendev.org/openstack/nova/commit/0c87681135cfb3ce61d2a0392928c1dbc1fe5fde
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 0c87681135cfb3ce61d2a0392928c1dbc1fe5fde
Author: Balazs Gibizer <email address hidden>
Date: Fri Jul 15 12:43:58 2022 +0200

    Reproduce bug 1981813 in func env

    There stable/yoga only change in test_pci_sriov_servers.py due to
    unittest.mock switch[1] only happened in zed.

    [1] https://review.opendev.org/q/topic:unittest.mock+status:merged+project:openstack/nova

    Related-Bug: #1981813
    Change-Id: I9367b7ed475917bdb05eb3f209ce1a4e646534e2
    (cherry picked from commit f8c91eb75fc5504a37fc3b4be1d65d33dbc9b511)
    (cherry picked from commit 4954f993680c75fd9d3d507f2dcd00300c9b3d44)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/859315
Committed: https://opendev.org/openstack/nova/commit/1a98a1a650d065a8ab3e1c474f3b9fd537dc2206
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 1a98a1a650d065a8ab3e1c474f3b9fd537dc2206
Author: Balazs Gibizer <email address hidden>
Date: Fri Jul 15 13:48:46 2022 +0200

    Gracefully ERROR in _init_instance if vnic_type changed

    If the vnic_type of a bound port changes from "direct" to "macvtap" and
    then the compute service is restarted then during _init_instance nova
    tries to plug the vif of the changed port. However as it now has macvtap
    vnic_type nova tries to look up the netdev of the parent VF. Still that
    VF is consumed by the instance so there is no such netdev on the host
    OS. This error killed the compute service at startup due to unhandled
    exception. This patch adds the exception handler, logs an ERROR and
    continue initializing other instances on the host.

    Also this patch adds a detailed ERROR log when nova detects that the
    vnic_type changed during _heal_instance_info_cache periodic.

    Closes-Bug: #1981813
    Change-Id: I1719f8eda04e8d15a3b01f0612977164c4e55e85
    (cherry picked from commit e43bf900dc8ca66578603bed333c56b215b1876e)
    (cherry picked from commit a28c82719545d5c8ee7f3ff1361b3a796e05095a)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 24.2.0

This issue was fixed in the openstack/nova 24.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 25.1.0

This issue was fixed in the openstack/nova 25.1.0 release.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Even though the patch for stable/wallaby has not merged, we could go ahead and issue an advisory for this now that the branch has transitioned to extended maintenance. What does everyone think?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/train)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/869673
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/869674
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.

This report contains Public Security information  
Everyone can see this security related information.
