[OSSA-2023-003] Unauthorized volume access through deleted volume attachments (CVE-2023-2088)

Bug #2004555 reported by Jan Wasilewski

Affects                        Status         Importance  Assigned to
Cinder                         Fix Released   Undecided   Unassigned
OpenStack Compute (nova)       Fix Released   Undecided   Unassigned
  Antelope                     Fix Released   Undecided   Unassigned
  Wallaby                      Fix Committed  Undecided   Unassigned
  Xena                         Fix Committed  Undecided   Unassigned
  Yoga                         Fix Released   Undecided   Unassigned
  Zed                          Fix Released   Undecided   Unassigned
OpenStack Security Advisory    Fix Released   High        Jeremy Stanley
OpenStack Security Notes       Fix Released   High        Jeremy Stanley
glance_store                   Fix Released   Undecided   Unassigned
kolla-ansible                  In Progress    Undecided   Unassigned
  Zed                          Fix Released   Undecided   Unassigned
os-brick                       In Progress    Undecided   Unassigned

Bug Description

Hello OpenStack Security Team,

I'm writing to you because we faced a serious security breach in OpenStack functionality (correlated somewhat with libvirt, iSCSI, and the Huawei driver). I went through the OSSA documents and related libvirt notes, but I couldn't find anything similar. It is not related to https://security.openstack.org/ossa/OSSA-2020-006.html

In short: we observed that a newly created cinder volume (1GB in size) was attached to an instance on a compute node, but the instance recognized it as a 115GB volume, which in fact was connected to another instance on the same compute node.

[1. Test environment]
Compute node: OpenStack Ussuri configured with Huawei Dorado as a storage backend (the driver configuration is described here: https://docs.openstack.org/cinder/rocky/configuration/block-storage/drivers/huawei-storage-driver.html)
Packages:
# dpkg -l | grep libvirt
ii libvirt-clients 6.0.0-0ubuntu8.16 amd64 Programs for the libvirt library
ii libvirt-daemon 6.0.0-0ubuntu8.16 amd64 Virtualization daemon
ii libvirt-daemon-driver-qemu 6.0.0-0ubuntu8.16 amd64 Virtualization daemon QEMU connection driver
ii libvirt-daemon-driver-storage-rbd 6.0.0-0ubuntu8.16 amd64 Virtualization daemon RBD storage driver
ii libvirt-daemon-system 6.0.0-0ubuntu8.16 amd64 Libvirt daemon configuration files
ii libvirt-daemon-system-systemd 6.0.0-0ubuntu8.16 amd64 Libvirt daemon configuration files (systemd)
ii libvirt0:amd64 6.0.0-0ubuntu8.16 amd64 library for interfacing with different virtualization systems
ii nova-compute-libvirt 2:21.2.4-0ubuntu1 all OpenStack Compute - compute node libvirt support
ii python3-libvirt 6.1.0-1 amd64 libvirt Python 3 bindings

# dpkg -l | grep qemu
ii ipxe-qemu 1.0.0+git-20190109.133f4c4-0ubuntu3.2 all PXE boot firmware - ROM images for qemu
ii ipxe-qemu-256k-compat-efi-roms 1.0.0+git-20150424.a25a16d-0ubuntu4 all PXE boot firmware - Compat EFI ROM images for qemu
ii libvirt-daemon-driver-qemu 6.0.0-0ubuntu8.16 amd64 Virtualization daemon QEMU connection driver
ii qemu 1:4.2-3ubuntu6.23 amd64 fast processor emulator, dummy package
ii qemu-block-extra:amd64 1:4.2-3ubuntu6.23 amd64 extra block backend modules for qemu-system and qemu-utils
ii qemu-kvm 1:4.2-3ubuntu6.23 amd64 QEMU Full virtualization on x86 hardware
ii qemu-system-common 1:4.2-3ubuntu6.23 amd64 QEMU full system emulation binaries (common files)
ii qemu-system-data 1:4.2-3ubuntu6.23 all QEMU full system emulation (data files)
ii qemu-system-gui:amd64 1:4.2-3ubuntu6.23 amd64 QEMU full system emulation binaries (user interface and audio support)
ii qemu-system-x86 1:4.2-3ubuntu6.23 amd64 QEMU full system emulation binaries (x86)
ii qemu-utils 1:4.2-3ubuntu6.23 amd64 QEMU utilities

# dpkg -l | grep nova
ii nova-common 2:21.2.4-0ubuntu1 all OpenStack Compute - common files
ii nova-compute 2:21.2.4-0ubuntu1 all OpenStack Compute - compute node base
ii nova-compute-kvm 2:21.2.4-0ubuntu1 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 2:21.2.4-0ubuntu1 all OpenStack Compute - compute node libvirt support
ii python3-nova 2:21.2.4-0ubuntu1 all OpenStack Compute Python 3 libraries
ii python3-novaclient 2:17.0.0-0ubuntu1 all client library for OpenStack Compute API - 3.x

# dpkg -l | grep multipath
ii multipath-tools 0.8.3-1ubuntu2 amd64 maintain multipath block device access

# dpkg -l | grep iscsi
ii libiscsi7:amd64 1.18.0-2 amd64 iSCSI client shared library
ii open-iscsi 2.0.874-7.1ubuntu6.2 amd64 iSCSI initiator tools

# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.4 LTS"

Instance OS: Debian-11-amd64

[2. Test scenario]
An instance was already created with two volumes attached: the first, 10GB, for the root filesystem; the second, 115GB, used as vdb. They are recognized by the compute node as vda (dm-11) and vdb (dm-9):

# virsh domblklist 90fas439-fc0e-4e22-8d0b-6f2a18eee5c1
 Target Source
----------------------
 vda /dev/dm-11
 vdb /dev/dm-9

# multipath -ll
(...)
36e00084100ee7e7ed6ad25d900002f6b dm-9 HUAWEI,XSG1
size=115G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 14:0:0:4 sdm 8:192 active ready running
  |- 15:0:0:4 sdo 8:224 active ready running
  |- 16:0:0:4 sdl 8:176 active ready running
  `- 17:0:0:4 sdn 8:208 active ready running
(...)
36e00084100ee7e7ed6acaa2900002f6a dm-11 HUAWEI,XSG1
size=10G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 14:0:0:3 sdq 65:0 active ready running
  |- 15:0:0:3 sdr 65:16 active ready running
  |- 16:0:0:3 sdp 8:240 active ready running
  `- 17:0:0:3 sds 65:32 active ready running

We then create a new instance with the same guest OS and a 10GB root volume. After successful deployment, we create a new 1GB volume and attach it to the newly created instance. After that we can see:

# multipath -ll
(...)
36e00084100ee7e7ed6ad25d900002f6b dm-9 HUAWEI,XSG1
size=115G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 14:0:0:10 sdao 66:128 failed faulty running
  |- 14:0:0:4 sdm 8:192 active ready running
  |- 15:0:0:10 sdap 66:144 failed faulty running
  |- 15:0:0:4 sdo 8:224 active ready running
  |- 16:0:0:10 sdan 66:112 failed faulty running
  |- 16:0:0:4 sdl 8:176 active ready running
  |- 17:0:0:10 sdaq 66:160 failed faulty running
  `- 17:0:0:4 sdn 8:208 active ready running

This way, the instance saw the new drive not as 1GB but as 115GB, so it seems it was incorrectly attached, and through it we were able to destroy some data on that volume.

Additionally, we saw many errors like the following in the compute node logs:

# dmesg -T | grep dm-9
[Fri Jan 27 13:37:42 2023] blk_update_request: critical target error, dev dm-9, sector 62918760 op 0x1:(WRITE) flags 0x8800 phys_seg 2 prio class 0
[Fri Jan 27 13:37:42 2023] blk_update_request: critical target error, dev dm-9, sector 33625152 op 0x1:(WRITE) flags 0x8800 phys_seg 6 prio class 0
[Fri Jan 27 13:37:46 2023] blk_update_request: critical target error, dev dm-9, sector 66663000 op 0x1:(WRITE) flags 0x8800 phys_seg 5 prio class 0
[Fri Jan 27 13:37:46 2023] blk_update_request: critical target error, dev dm-9, sector 66598120 op 0x1:(WRITE) flags 0x8800 phys_seg 5 prio class 0
[Fri Jan 27 13:37:51 2023] blk_update_request: critical target error, dev dm-9, sector 66638680 op 0x1:(WRITE) flags 0x8800 phys_seg 12 prio class 0
[Fri Jan 27 13:37:56 2023] blk_update_request: critical target error, dev dm-9, sector 66614344 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
[Fri Jan 27 13:37:56 2023] blk_update_request: critical target error, dev dm-9, sector 66469296 op 0x1:(WRITE) flags 0x8800 phys_seg 24 prio class 0
[Fri Jan 27 13:37:56 2023] blk_update_request: critical target error, dev dm-9, sector 66586472 op 0x1:(WRITE) flags 0x8800 phys_seg 3 prio class 0
(...)

Unfortunately we do not know the exact test scenario to reproduce it, as we hit this issue in fewer than 2% of our tries, but it looks like a serious security breach.

Additionally, we observed that the Linux kernel does not fully clear device allocations after a volume detach, so some drive names remain visible in output such as that of the lsblk command. Then, after a new volume attachment, we can see those names (e.g. sdao, sdap, sdan and so on) being reused by the new drive and wrongly mapped by multipath/iSCSI to another drive, and this is how we hit the issue.
Our question is: why doesn't the compute node's Linux kernel remove these device allocations, leading to a scenario like this? Maybe a solution lies there.

Thanks in advance for your help and understanding. If more details are needed, do not hesitate to contact me.

CVE References

CVE-2023-2088

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security
reviewers for the affected project or projects confirm the bug and
discuss the scope of any vulnerability along with potential
solutions.

description: updated
Changed in ossa:
status: New → Incomplete
Revision history for this message
Dan Smith (danms) wrote :

I feel like this is almost certainly something that will require involvement from the cinder people. Nova's part in the volume attachment is pretty minimal, in that we get stuff from cinder, pass it to brick, and then configure the guest with the block device we're told (AFAIK). Unless we're messing up the last step, I think it's likely this is not just a Nova thing. Should we add cinder or brick as an affected project or just add some cinder people to the bug here?

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

> Should we add cinder or brick as an affected project or just add some cinder people to the bug here?

I'd be in favor of adding the cinder project, which would pull in the cinder coresec team, right?

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

In the meantime, could you please provide the block device mapping information that's stored in the DB and, ideally, the cinder-side attachment information?

Setting the bug report to Incomplete; please set its status back to New when you reply.

Changed in nova:
status: New → Incomplete
Revision history for this message
Jan Wasilewski (janwasilewski) wrote :

Hi,

below you can find the requested information from the OpenStack DB. There is no issue right now, but maybe historical tracking could lead to some hint? Anyway, the issue was related to the /dev/vdb drive of instance 128f1398-a7c5-48f8-8bbc-a132e3e2d556 -> in the DB output you can observe that the volume size is 15GB, whereas from inside the instance it was reported as 115GB (so the vdb of the second instance is presented in this output).

mysql> select * from block_device_mapping where instance_uuid = '90fda439-fc0e-4e22-8d0b-6f2a18eeb9c1';
+---------------------+---------------------+------------+--------+-------------+-----------------------+--------------------------------------+--------------------------------------+-------------+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+---------+-------------+------------------+--------------+-------------+----------+------------+----------+------+--------------------------------------+--------------------------------------+-------------+
| created_at | updated_at | deleted_at | id | device_name | delete_on_termination | snapshot_id | volume_id | volume_size | no_device | connection_info | instance_uuid | deleted | source_type | destination_type | guest_format | device_type | disk_bus | boot_index | image_id | ta...
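
For readability, a narrowed version of the same query pulling out just the columns relevant here (hypothetical; column names taken from the output header above):

    SELECT device_name, volume_id, volume_size,
           source_type, destination_type, device_type, boot_index
      FROM block_device_mapping
     WHERE instance_uuid = '90fda439-fc0e-4e22-8d0b-6f2a18eeb9c1';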

Changed in nova:
status: Incomplete → New
Revision history for this message
Jeremy Stanley (fungi) wrote :

I've added Cinder as an affected project (though maybe it should be os-brick?) and subscribed the Cinder security reviewers for additional input.

Revision history for this message
Rajat Dhasmana (whoami-rajat) wrote :

Hi,

Based on the given information, the strange part is that the same multipath device, 36e00084100ee7e7ed6ad25d900002f6b, is used for both the old and the new volume:

36e00084100ee7e7ed6ad25d900002f6b dm-9 HUAWEI,XSG1
size=115G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 14:0:0:4 sdm 8:192 active ready running
  |- 15:0:0:4 sdo 8:224 active ready running
  |- 16:0:0:4 sdl 8:176 active ready running
  `- 17:0:0:4 sdn 8:208 active ready running

36e00084100ee7e7ed6ad25d900002f6b dm-9 HUAWEI,XSG1
size=115G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 14:0:0:10 sdao 66:128 failed faulty running
  |- 14:0:0:4 sdm 8:192 active ready running
  |- 15:0:0:10 sdap 66:144 failed faulty running
  |- 15:0:0:4 sdo 8:224 active ready running
  |- 16:0:0:10 sdan 66:112 failed faulty running
  |- 16:0:0:4 sdl 8:176 active ready running
  |- 17:0:0:10 sdaq 66:160 failed faulty running
  `- 17:0:0:4 sdn 8:208 active ready running

Also, it's interesting to note that the paths under the first multipath device (sdm, sdo, sdl, sdn), with LUN ID 4, are also used by the second multipath device, whereas it should be using the LUN 10 paths (which are currently in failed faulty status).

This looks multipath related, but it would be helpful to get the os-brick logs for this 1GB volume attachment to understand whether os-brick is doing something that results in this.

I would also recommend cleaning up any leftover devices from past failed detachments (i.e. flushing and removing mpath devices not belonging to any instance) that might be interfering here. Although I'm not certain that's the case, it's still good to clean up those devices.
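
For reference, a minimal sketch of that kind of manual cleanup on the compute node (placeholder names; only run after confirming the map and its paths truly belong to no instance):

    # Flush the stale multipath map (replace <leftover_wwid> with the real WWID)
    multipath -f <leftover_wwid>
    # Remove each leftover SCSI path device from the kernel
    echo 1 > /sys/block/<leftover_sd_device>/device/delete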

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi,

I think I know what happened, but there are some things that don't match unless
somebody has manually changed some things in the host (like cleaning up
multipaths).

Bit of context:

- SCSI volumes (iSCSI and FC) on Linux are NEVER removed automatically by the
  kernel and must always be removed explicitly. This means that they will
  remain in the system even if the remote connection is severed, unless
  something in OpenStack removes it.

- The os-brick library has a strong policy of not removing devices from the
  system if flushing fails during detach, to prevent data loss.

  The `disconnect_volume` method in the os-brick library has an additional
  parameter called `force` to allow callers to ignore flushing errors and
  ensure that the devices are removed. This is useful when, after a failed
  detach, the volume is either going to be deleted or put into error status.
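
A minimal sketch of what that forced disconnect looks like from the caller's side (assuming os-brick's standard connector factory; connection_properties and device_info are placeholders for the values exchanged during the earlier connect_volume call):

    from os_brick.initiator import connector

    conn = connector.InitiatorConnector.factory(
        'iscsi', 'sudo', use_multipath=True)
    # force=True keeps removing the devices even if flushing fails;
    # ignore_errors=True additionally swallows the resulting exception.
    conn.disconnect_volume(connection_properties, device_info,
                           force=True, ignore_errors=True)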

I don't have the logs, but from what you said my guess is that this is what has
happened:

- Volume with SCSI ID 36e00084100ee7e7ed6ad25d900002f6b was attached to that
  host on LUN 10 at some point since the last reboot (sdao, sdap, sdan, sdaq).

- When detaching the volume from the host using os-brick, the operation failed
  and the volume wasn't removed, yet Nova still called Cinder to unexport and
  unmap the volume. At this point LUN 10 is free on the Huawei array and the
  volume is no longer attachable, but /dev/sda[n-q] are still present, and
  their SCSI IDs are still known to multipathd.

- Nova asked Cinder to attach the volume again, and the volume is mapped to LUN
  4 (which must have been available as well) and it successfully attaches (sdm,
  sdo, sdl, sdn), appears as a multipath, and is used by the VM.

- Nova asks Cinder to export and map the new 1GB volume, and Huawei maps it to
  LUN 10. At this point iSCSI detects that the remote LUNs are back and
  reconnects to them, which makes the multipathd path checker detect that sdao,
  sdap, sdan, and sdaq are alive on the compute host, and they are added to the
  existing multipath device mapper using their known SCSI ID.

You should find out why the detach actually failed, but I think I see multiple
issues:

- Nova:

  - Should not call Cinder to unmap a volume if the os-brick call to
    disconnect the volume has failed, as we know this will leave leftover
    devices that can cause issues like this.

  - If it's not already doing it, Nova should call disconnect_volume method
    from os-brick passing force=True when the volume is going to be deleted.

- os-brick:

  - Should try to detect when the newly added devices are being added to a
    multipath device mapper that has live paths to other LUNs and fail if that
    is the case.

  - As an improvement over the previous check, os-brick could forcefully
    remove those devices that are in the wrong device mapper, force a refresh
    of their SCSI IDs, and add them back to multipathd to form a new device
    mapper. Though personally I think this is a non-trivial and potentially
    problematic feature.

In other words, the source of the problem is probably Nova, but os-brick should
try to prevent these possible data leaks.

Cheers,
Gorka.

[1]: https://github.com/opens...


Revision history for this message
Dan Smith (danms) wrote :

I don't see in the test scenario description that any instances had to be deleted or volumes disconnected for this to happen. Maybe the reporter can confirm with logs if this is the case?

I'm still chasing down the nova calls, but we don't ignore anything in the actual disconnect other than "volume not found". I need to follow that up to where we call cinder to see if we're ignoring a failure.

When you say "nova should call disconnect_volume with force=true if the volume is going to be deleted... I'm not sure what you mean by this. Do you mean if we're disconnecting because of *instance* delete and are sure that we don't want to let a failure hold us up? I would think this would be dangerous because just deleting an instance doesn't mean you don't care about the data in the volume.

It seems to me that if brick *has* the information available to it to avoid connecting a volume to the wrong location, that it's the thing that needs to guard against this. Nova has no knowledge of the things underneath brick, so we don't know that wires are going to get crossed. Obviously if we can do stuff to avoid even getting there, then we should.

Revision history for this message
Jan Wasilewski (janwasilewski) wrote :

Hi,

I'm just wondering whether I should try to reproduce the issue again with all debug flags turned on. Should I enable debug on the controllers (cinder, nova), or are compute node logs (with debug enabled) enough to further troubleshoot this issue? If so, please let me know which flags are needed, just to speed up further troubleshooting. As I said, this case is not easy to reproduce; I can't even say what the trigger is, but we have faced it 3 or 4 times already.

Thanks in advance for your reply and your help so far.

Best regards,
Jan

Revision history for this message
Gorka Eguileor (gorka) wrote :

Apologies if I wasn't clear enough.

The disconnect call I suspect is being ignored/swallowed is the one to os-brick, not Cinder. In other words, Nova first calls os-brick to disconnect the volume from the compute host and then always considers this successful (at least in some scenarios, probably instance destruction). Since in those scenarios it always considers the local disconnect successful, it calls Cinder to unmap/unexport the volume.

The force=True parameter to os-brick's disconnect_volume should only be added when the BDM for the volume has the delete-on-termination flag set.

os-brick has the information; the problem is that multipathd is the one adding the reused leftover devices back to the multipath device mapper.

Revision history for this message
Gorka Eguileor (gorka) wrote :

A solution/workaround would be to change /etc/multipath.conf and set "recheck_wwid" to yes.

I haven't actually tested it myself, but the documentation explicitly calls out that it's used to solve this specific issue: "If set to yes, when a failed path is restored, the multipathd daemon rechecks the path WWID. If there is a change in the WWID, the path is removed from the current multipath device, and added again as a new path. The multipathd daemon also checks the path WWID again if it is manually re-added."

I believe this is probably something that is best fixed at the deployment-tool level, for example by extending the multipathing THT template code [1] to support "recheck_wwid" and defaulting it to yes, instead of no as multipath.conf does.

[1]: https://opendev.org/openstack/tripleo-heat-templates/commit/906d03ea19a4446ed198c321f68791b7fa6e0c47
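
For illustration, the corresponding multipath.conf change would look like this (a sketch; per the documentation quoted above, recheck_wwid is a defaults-section option):

    defaults {
        recheck_wwid yes
    }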

Revision history for this message
Dan Smith (danms) wrote :

Okay, thanks for the clarification.

Yeah, recheck_wwid seems like it should *always* be on to prevent potentially reconnecting to the wrong thing!

Revision history for this message
Jeremy Stanley (fungi) wrote :

If that configuration ends up being the recommended solution, we might want to consider drafting a brief security note with guidance for deployers and maintainers of deployment tooling.

Unless I misunderstand the conditions necessary, it sounds like it would be challenging for a malicious user to force this problem to occur. Is that the current thinking? If so, we could probably safely work on the actual text of the note in public.

Revision history for this message
melanie witt (melwitt) wrote :

> The disconnect call I suspect is being ignored/swallowed is the one to os-brick, not Cinder. In other words, Nova first calls os-brick to disconnect the volume from the compute host and then always considers this successful (at least in some scenarios, probably instance destruction). Since in those scenarios it always considers the local disconnect successful, it calls Cinder to unmap/unexport the volume.

I just checked and indeed Nova will ignore a volume disconnect error in the case of an instance being deleted [1]:

    try:
        self._disconnect_volume(context, connection_info, instance)
    except Exception as exc:
        with excutils.save_and_reraise_exception() as ctxt:
            if cleanup_instance_disks:
                # Don't block on Volume errors if we're trying to
                # delete the instance as we may be partially created
                # or deleted
                ctxt.reraise = False
                LOG.warning(
                    "Ignoring Volume Error on vol %(vol_id)s "
                    "during delete %(exc)s",
                    {'vol_id': vol.get('volume_id'),
                     'exc': encodeutils.exception_to_unicode(exc)},
                    instance=instance)

In all other scenarios, Nova will not proceed further if the disconnect was not successful.

If Nova does proceed past _disconnect_volume(), it will later call Cinder API to delete the attachment [2]. I assume that is what does the unmap/unexport.

[1] https://github.com/openstack/nova/blob/1bf98f128710c374a0141720a7ccc21f5d1afae0/nova/virt/libvirt/driver.py#L1445-L1459 (ussuri)
[2] https://github.com/openstack/nova/blob/1bf98f128710c374a0141720a7ccc21f5d1afae0/nova/compute/manager.py#L2922 (ussuri)

Revision history for this message
Jan Wasilewski (janwasilewski) wrote :

I believe it can be a bit challenging for Ubuntu users to introduce the recheck_wwid parameter. From what I've checked, the parameter is provided by multipath-tools, but the package version that includes it ships with Ubuntu 22.04 LTS. Older Ubuntu releases do not have this option and give an error:
/etc/multipath.conf line XX, invalid keyword: recheck_wwid

I made this assumption based on the release documentation:
- for ubuntu 20.04: https://manpages.ubuntu.com/manpages/focal/en/man5/multipath.conf.5.html
- for ubuntu 22.04: https://manpages.ubuntu.com/manpages/jammy/en/man5/multipath.conf.5.html

So it seems that Zed (and partially Yoga) OpenStack releases can take this parameter directly, but older releases have to manage such a change differently.

I know that OpenStack code is independent of Linux distros, but I wanted to add this info here as worth considering.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I don't know whether my assumption is correct, because I can't reproduce the multipath device mapper situation from the report (some paths failed, some active) no matter how much I force things to break in different ways.

Since each iSCSI storage backend behaves differently, I don't know whether I can't reproduce it because of a difference in behavior or because the way I'm trying to reproduce it is different. It may even be that multipathd is different on my system.

Unfortunately I don't know if the host where that happened had leftover devices before the leak happened, or what the SCSI IDs of the 2 volumes involved really are.

From os-brick's connect_volume perspective, what it did was the right thing: when it looked for the multipath device containing the newly connected devices it found dm-9, so that's the one it had to return.

How multipath ended up with 2 different volumes in the same device mapper, I don't know.

I don't think "recheck_wwid" would solve the issue because os-brick would be too fast in finding the multipath and it wouldn't give enough time for multipathd to activate the paths and form a new device mapper.

In any case, I strongly believe that nova should never proceed to delete the cinder attachment if detaching with os-brick fails, because that usually implies data loss.

The exception would be when the cinder volume is going to be deleted after disconnecting it; in that case the disconnect call to os-brick should always be forced, since data loss is irrelevant.

That would ensure that compute nodes are not left with leftover devices that could cause problems.

I'll see if I can find a reasonable improvement in os-brick that would detect these issues and fail the connection, although it's probably going to be a bit of a mess.

Revision history for this message
Jan Wasilewski (janwasilewski) wrote :

@Gorka Eguileor: I can try to reproduce this case with the recheck_wwid option enabled once a suitable multipath-tools package is available for Ubuntu 20.04.

What I can add is that it happened only on one compute node; I've seen similar warnings in the dmesg -T output of other compute nodes, which looks dangerous, but so far I haven't faced a similar issue there:

[Thu Feb 9 14:28:16 2023] scsi_io_completion: 42 callbacks suppressed
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 Sense Key : Illegal Request [current]
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 Add. Sense: Logical unit not supported
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 CDB: Read(10) 28 00 03 bf ff 00 00 00 08 00
[Thu Feb 9 14:28:16 2023] print_req_error: 42 callbacks suppressed
[Thu Feb 9 14:28:16 2023] print_req_error: I/O error, dev sdgr, sector 62914304
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 Sense Key : Illegal Request [current]
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 Add. Sense: Logical unit not supported
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#2 CDB: Read(10) 28 00 03 bf ff 00 00 00 01 00
[Thu Feb 9 14:28:16 2023] print_req_error: I/O error, dev sdgr, sector 62914304
[Thu Feb 9 14:28:16 2023] buffer_io_error: 30 callbacks suppressed
[Thu Feb 9 14:28:16 2023] Buffer I/O error on dev sdgr1, logical block 62686976, async page read
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#3 Sense Key : Illegal Request [current]
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#3 Add. Sense: Logical unit not supported
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#3 CDB: Read(10) 28 00 03 bf ff 01 00 00 01 00
[Thu Feb 9 14:28:16 2023] print_req_error: I/O error, dev sdgr, sector 62914305
[Thu Feb 9 14:28:16 2023] Buffer I/O error on dev sdgr1, logical block 62686977, async page read
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#4 Sense Key : Illegal Request [current]
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#4 Add. Sense: Logical unit not supported
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#4 CDB: Read(10) 28 00 03 bf ff 02 00 00 01 00
[Thu Feb 9 14:28:16 2023] print_req_error: I/O error, dev sdgr, sector 62914306
[Thu Feb 9 14:28:16 2023] Buffer I/O error on dev sdgr1, logical block 62686978, async page read
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#5 Sense Key : Illegal Request [current]
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#5 Add. Sense: Logical unit not supported
[Thu Feb 9 14:28:16 2023] sd 15:0:0:98: [sdgr] tag#5 CDB: Read(10) 28 00 03 bf ff 03 00 00 01 00
[Thu Feb 9 14:28:16 2023] print_req...


Revision history for this message
Gorka Eguileor (gorka) wrote :

Don't bother trying with recheck_wwid, as it won't work due to the speed of os-brick.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I have finally been able to reproduce the issue.

So far I have been able to identify 3 different ways to create situations similar to the reported one, and it was what I thought: leftover devices from a 'nova delete' call.

It took me longer to figure out because it requires an iSCSI Cinder driver that uses shared targets, and the one I use doesn't.

After I locally modified the cinder driver code to do target sharing and then forced a disconnect error on specific Nova calls to os-brick, I was able to work it out.

I have a local patch that detects these issues and fixes them as best it can, but I wouldn't like to backport that, because the fixing part is a bit scary as a backport.

So I'll split the code into 2 patches:

- The backportable patch that detects and prevents the connection if a potential leak is detected. To fix this manual intervention will be necessary.

- Another patch that extends the previous code to try to fix things when possible.

Revision history for this message
melanie witt (melwitt) wrote :

> In any case, I strongly believe that nova should never proceed to delete the cinder attachment if detaching with os-brick fails, because that usually implies data loss.

> The exception would be when the cinder volume is going to be deleted after disconnecting it; in that case the disconnect call to os-brick should always be forced, since data loss is irrelevant.

> That would ensure that compute nodes are not left with leftover devices that could cause problems.

Understood. I guess that must mean that the reported bug scenario is a volume that is *not* delete_on_termination=True attached to an instance that is being deleted.

I think we could probably propose a patch in nova to not delete the attachment if it's instance delete + not delete_on_termination.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Melanie,

In my opinion there should be 2 code changes to prevent leaving devices behind:

- The instance deletion operation should fail, like a normal volume-detach call does, when the disconnect_volume call fails; even if that leaves the instance in a "weird" state, manual intervention is usually necessary to fix things anyway.
  This manual intervention does not necessarily mean doing something to the volume; it can mean fixing the network.

- Any Cinder volume with delete_on_termination=True should have the os-brick disconnect_volume call made with "force=True, ignore_errors=True" (see the sketch below).
  The tricky part here is that not all os-brick connectors support the force parameter, so when the call fails we have to decide whether to halt the operation and wait for human intervention, or just log it and continue as we do today.
  We could make an effort in os-brick to increase coverage of the force parameter.
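
A rough sketch of that second change as I understand it (hypothetical function and variable names, not actual Nova code):

    def disconnect_on_instance_delete(connector, conn_props, device_info, bdm):
        if bdm.delete_on_termination:
            # The volume will be deleted anyway, so data loss is irrelevant:
            # force removal of the devices even if flushing fails.
            connector.disconnect_volume(conn_props, device_info,
                                        force=True, ignore_errors=True)
        else:
            # The data matters: let flush/detach failures propagate instead
            # of unmapping in Cinder and leaving devices behind.
            connector.disconnect_volume(conn_props, device_info)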

Thanks,
Gorka.

Revision history for this message
Dan Smith (danms) wrote :

Our policy is that instance delete should never fail, and I think that's the experience users expect. Perhaps we need to still mark the instance deleted immediately and keep retrying the volume detach in a periodic task until it succeeds, but that's the only thing I can see working.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

Agree with Dan: we shouldn't raise an exception on instance delete, but rather possibly make some status available for knowing whether the volume was eventually detached.

For example, we accept deleting an instance if the compute goes down (as the user may not know that the underlying compute is in a bad state) and we only delete the instance when the compute is back.

That being said, I don't really see how we can easily fix this in a patch, as we should discuss this properly. Would a LOG statement warning that the volume connection is still present help?

Revision history for this message
melanie witt (melwitt) wrote :

We definitely should not allow a delete to fail from a user's perspective.

My suggestion of a patch to not delete an attachment when detach fails during instance delete if delete_on_termination=False is intended to be better than what we have today, not necessarily to be perfect.

We could consider doing a periodic task like Dan mentions. We already do something similar with our "cleanup running deleted instances" periodic. The volume attachment cleanup could be hooked into that if it doesn't already do it.

From what I can tell, our periodic is already capable of taking care of it, but it's not enabled [1][2]:

    elif action == 'reap':
        LOG.info("Destroying instance with name label "
                 "'%s' which is marked as "
                 "DELETED but still present on host.",
                 instance.name, instance=instance)
        bdms = objects.BlockDeviceMappingList.get_by_instance_uuid(
            context, instance.uuid, use_slave=True)
        self.instance_events.clear_events_for_instance(instance)
        try:
            self._shutdown_instance(context, instance, bdms,
                                    notify=False)
            self._cleanup_volumes(context, instance, bdms,
                                  detach=False)

    def _cleanup_volumes(self, context, instance, bdms, raise_exc=True,
                         detach=True):
        original_exception = None
        for bdm in bdms:
            if detach and bdm.volume_id:
                try:
                    LOG.debug("Detaching volume: %s", bdm.volume_id,
                              instance_uuid=instance.uuid)
                    destroy = bdm.delete_on_termination
                    self._detach_volume(context, bdm, instance,
                                        destroy_bdm=destroy)
                except Exception as exc:
                    original_exception = exc
                    LOG.warning('Failed to detach volume: %(volume_id)s '
                                'due to %(exc)s',
                                {'volume_id': bdm.volume_id, 'exc': exc})

            if bdm.volume_id and bdm.delete_on_termination:
                try:
                    LOG.debug("Deleting volume: %s", bdm.volume_id,
                              instance_uuid=instance.uuid)
                    self.volume_api.delete(context, bdm.volume_id)
                except Exception as exc:
                    original_exception = exc
                    LOG.warning('Failed to delete volume: %(volume_id)s '
                                'due to %(exc)s',
                                {'volume_id': bdm.volume_id, 'exc': exc})
        if original_exception is not None and raise_exc:
            raise original_exception

Currently we're calling _cleanup_volumes with detach=False. I'm not sure what the reason for that is, but if we determine there would be no problems with it, we can change it to detach=True, in combination with not deleting the attachment on instance delete if delete_on_termination=False.
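
That is, the hypothetical one-line change under discussion in the reap path quoted above (illustrative, not a tested patch):

    self._cleanup_volumes(context, instance, bdms, detach=True)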

[1] https://github.com/openstack/nova/blob/a2964417822bd1a4a83fa5c27282d2be1e18868a/nova/compute/manager.py#L10579
[2] https://github.com/openstack/nova/blob/a2964417822bd1a4a83f...


Revision history for this message
Gorka Eguileor (gorka) wrote :

What is the reason why Nova has the policy that deleting the instance should never fail?

I'm talking about the instance record, not the VM itself, because I agree that the VM should always be deleted to free resources.

From my perspective deleting the instance record would result in a very weird user experience and in users manually creating the same situation we are trying to avoid.

- User requests instance deletion
- Calls to disconnect_volume fails
- Nova removes everything it can and at the end even the instance record, while it keeps trying to disconnect the device in the background.
- User wants to use the volume again but sees that it's in-use in Cinder
- Looks for the instance in Nova thinking that something may have gone wrong, but not seeing it there, concludes it's a problem between cinder and nova.
- Runs the `cinder delete-attachment` command to return the volume to available state.

We end up in the same situation as we were before, with leftover devices.

Revision history for this message
Dan Smith (danms) wrote :

Because the user wants to delete a thing in our supposed "elastic infrastructure". They want their quota back, they want to stop being billed for it, they want the IP for use somewhere else, or whatever. They don't care that we can't delete it because of some backend failure - that's not their problem. That's why we have the ability to queue the delete even if the compute is down - that's how important it is.

It's also not at all about deleting the VM, it's about the instance going away from the perspective of the user (i.e. marking the instance record as deleted). The instance record is what determines if they're billed for it, if their quota is used, etc. We "charge" the user the same whether the VM is running or not. Further, even if we have stopped the VM, we cannot re-assign the resources committed to that VM until the deletion completes in the backend. Another scenario that infuriates operators is "I've deleted a thing, the compute node should be clear, but the scheduler tells me I can't boot something else there."

Your example workflow is exactly why I feel like the solution to this problem can't (entirely) be one of preventing a delete if we fail to detach. Because the admins will just force-delete/detach/reset-state/whatever until things free up (as I would expect to do myself). Especially if the user is demanding that they get their quota back, stop being billed, and/or attach the volume somewhere else.

It seems to me that there *must* be some way to ensure that we never attach a volume to the wrong place. Regardless of how we get there, there must be some positive affirmation that we're handing precious volume data to the right person.

Revision history for this message
Gorka Eguileor (gorka) wrote :

The quota/billing issue is a matter of Nova code. In cinder we resolve it by having a flag for resources (volume and snapshots) to reflect whether they consume quota or not.

The same thing could be done in Nova to reflect what resources are actually consumed by the instance (IPs, VMs, GPUs, etc) and therefore billable.

Users not caring about backend errors would be, in my opinion, naive thinking on their part, since they DO CARE about their persistent data being properly written and they want to avoid data loss, data corruption, and data leakage above all else.

I assume users would also want to have a consistent view of their resources, so if a volume says it's attached to an instance the instance should still exist, otherwise there is an invalid reference.

Data leak/corruption may be prevented in some cases with the code I'm working on for os-brick (although some drivers are missing the feature required), but that won't prevent data loss. For that Nova would need to do the sensible thing.

I'm going to do some additional testing today, because this report is about something that happens accidentally, but I believe there is a way to actually exploit this to gain access to other users' data. Though fixing that would require yet another bunch of code.

In other words, there are 3 different things to fix here:

- Nova doing the right thing to prevent data corruption/leak/loss.
- os-brick detection of the right volume to prevent data leak.
- Prevent intentional data leak.

Revision history for this message
Jeremy Stanley (fungi) wrote :

If there is indeed a way for a normal user (not an operator) of the environment to cause this information leak to happen and then take advantage of it, we should find a way to prevent at least that aspect before making this report public.

If it's not a condition that a normal user can intentionally cause to happen, then it's probably fine to fix this in public instead.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

Gorka, Nova doesn't even really know about the Cinder backends; it just uses os-brick.

So when Nova asks to attach a volume, only os-brick knows whether it's the right volume. That's why I think it's important for brick to be able to say 'no'.

Revision history for this message
Dan Smith (danms) wrote :

Right, we have to trust os-brick to give us a block device that is actually the thing we're supposed to attach to the guest.

I'm really concerned about what sounds like a very loose association between what we pass to brick from cinder and what we get back from brick in terms of a block device. Isn't there some way for brick to walk the multipath device and the backing iSCSI/FC devices to check WWNs or something to ensure that it's consistent and points to what we expect?
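
Something along those lines seems feasible; a rough sketch of the kind of check being discussed (hypothetical helper names; the scsi_id invocation is the same one the udev rules use):

    import subprocess

    def scsi_wwid(dev):
        # Ask the device itself for its WWID with a fresh SCSI inquiry,
        # bypassing whatever multipathd/sysfs currently believe about it.
        out = subprocess.check_output(
            ['/lib/udev/scsi_id', '--export', '--whitelisted', '-d', dev],
            text=True)
        for line in out.splitlines():
            if line.startswith('ID_SERIAL='):
                return line.split('=', 1)[1]
        return None

    def paths_match_volume(path_devices, expected_wwid):
        # Refuse the attach unless every path under the multipath map
        # reports the WWID of the volume we were asked to connect.
        return all(scsi_wwid('/dev/' + dev) == expected_wwid
                   for dev in path_devices)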

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

> If there is indeed a way for a normal user (not an operator) of the environment to cause this information leak to happen and then take advantage of it, we should find a way to prevent at least that aspect before making this report public.

Well, I'm trying hard to find a possible attack vector from a malicious user and I don't see any.
I don't disagree with the bug report, as it can potentially leak data to any instance, but I don't know how someone could benefit from this information.

Here I'm just one voice and I'll leave others to chime in, but I'm in favor of making this report public so we can discuss the potential solutions with the stakeholders and any operators having concerns about it.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Let me summarize things:

1. The source of the problem reported in this bug is that Nova has been doing something wrong since forever. I've been bringing this up for the past 7 years, and every single time we end up in the same place, nova giving priority to instance deletion over everything else.

2. There are some things that os-brick can do to try to detect when Nova doesn't do its job right, but this is equivalent to a taxi driver asking passengers to learn to fall because the car is not going to stop when they want to get off. It's a lot harder to do and it doesn't sound all that reasonable.

3. There is an attack vector that can be exploited and it's pretty easy to do (I've done it locally), but it's separate from the issue reported here and it hasn't existed for as long as that one. I would resolve this in a different way than the workaround mentioned in #2.

Seeing as we are back to the same conversation of the past 7 years, we'll probably end up in the same place, so I'll just do my best to resolve the attack vector and also introduce code to resolve Nova's mistakes.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Oh, I failed to clarify something. The user exploit case can be made secure (as far as I can tell), but for the scenario in this bug's description the only secure solution is fixing nova; the os-brick code I'm working on will only reduce the window where the data is leaked or can be corrupted.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

Gorka, I don't want to debate the projects' responsibilities; I'd rather focus on the data leakage, which is the subject of this security report.

The fact that a volume detach can leave residue if a flush error occurs is certainly not ideal, but this isn't a security problem *UNTIL* the remaining devices are reused.
To me, it appears that the data leak occurs on the attach and not on the detach, and I'd prefer to see os-brick avoiding this situation.

That being said, I think Melanie, Dan and I agreed on trying to find a way to asynchronously clean up the devices (see comments #24, #25 and #27) and that can be discussed publicly, but again, this won't help with the data leakage that occurs on the attach command.

Revision history for this message
Dan Smith (danms) wrote :

Okay Gorka and I just had a nice long chat about things and I think we made some progress on understanding the (several) ways we can get into this situation and came up with some action items. I'll try to summarize here and I'll look for Gorka to correct me if I get anything wrong.

I think that we're now on the same page that delete of a running instance is much more of a forceful act than some might think, and that we expect to try to be graceful with that, but with a limited amount of patience before we kill it with fire. That maps to us actually always calling force=True when we do the detachment. Even with force=True, brick *tries* to flush and disconnect gracefully, but if it can't, will cut things off at the knees. Thus, if we did force=True now, we wouldn't get into the situation the bug describes because we would *definitely* have cleaned up at that point.

It sounds like there are some robustification steps that can be made in brick to do more validation of the full chain from instance->multipathd->iscsi->volume when we're doing attachments to try to avoid getting into the situation described by this bug, so Gorka is going to work on that.

Gorka also described another way to get into this situation, which is much more exploitable by the user, and I'll let him describe it in more detail. But the short story is that cinder should not let users delete attachments for instances that nova says are running (i.e. not deleted).

Multipathd, while well-intentioned, also has some behavior that is counterproductive when recovering from various situations where paths to a device get disconnected. Enabling the recheck_wwid thing in multipathd should be a recommended flag to have enabled to reduce the likelihood of that happening. Especially in the case where nova has allowed a blind delete due to a downed compute node, we need multipathd to not "help" by reattaching things without extra checks.

So, the action items roughly are:

1. Nova should start passing force=True in our call to brick detach for instance delete
2. Recommend the recheck_wwid flag for multipathd, and get deployment tools to enable it
3. Robustification of brick's attach workflow to do some extra sanity checks
4. Cinder should refuse to allow users to delete an attachment for an active volume

Based on the cinder user-exploitable attack vector, it sounds to me like we should keep this bug private on that basis until we have at least the cinder/nova validation step in place. We could create another one for just that scenario, but publicizing the accidental scenario and discussion we have in this bug now might be enough of a suggestion that more people would figure out the user-oriented attack.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Sylvain, the data leak/corruption presented in this bug report is caused by the detach on the nova side.

It may manifest when we do the attach, but it is 100% caused by the detach problem, so focusing only on the attach part is not right, considering the RCA is the leftover devices from the detach.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

Gorka, I eventually understood all the problems we have, and what Dan wrote in comment #38 looks good to me as action items.

Yeah, we need to keep this bug private for a bit until we figure out a solid plan for fixing those 4 items, and yeah, we need to both force the detachment on delete and try to solidify the attachment calls.

Revision history for this message
melanie witt (melwitt) wrote :

I'm attaching a potential patch for nova to use force=True when calling os-brick disconnect_volume() when an instance is being deleted.

Only the libvirt and hyperv drivers call os-brick's disconnect_volume(), as far as I found, and it's part of the driver.destroy() path.

This change ended up being larger than expected ... I aimed to add basic test coverage for passing the force kwarg through and there are a lot of volume drivers.

If anyone wants something changed or otherwise finds issues in the patch, please let me know.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Melanie,

I have tried the patch and it works as expected, resolving the most common case of leftover devices on compute nodes. Thanks!!

Dan mentioned that deleting an instance is more in line with pulling the power on a computer than with a clean shutdown, and that's why using `force=True` makes sense: it will try to do things cleanly if possible, but data loss is possible.

I looked at the API docs [1] for the delete operation and I don't see this idea stated there. Should we update the docs to explicitly state that deleting an instance can result in data loss?

Cheers.

[1]: https://docs.openstack.org/api-ref/compute/?expanded=detach-a-volume-from-an-instance-detail,show-a-detail-of-a-volume-attachment-detail,show-server-action-details-detail,delete-server-detail#delete-server

Revision history for this message
melanie witt (melwitt) wrote :

Hi Gorka,

Thank you for trying out the patch!

I agree more detailed docs could be helpful and have proposed a doc update for review:

  https://review.opendev.org/c/openstack/nova/+/874188

Revision history for this message
Gorka Eguileor (gorka) wrote :

This is the patch I've prepared for Cinder to prevent users from exploiting the data leak issue, or even from unintentionally leaving leftover devices behind by deleting the cinder attachment record.

With the nova patch and this one we cover most of the scenarios, but not all, since I've been told that there are scenarios where an instance is deleted without contact with the actual compute node.

I still have to clean up the os-brick code, write the unit tests, and see how the "recheck_wwid" multipath config option interacts with it.

I also have to check whether the issue also happens with FC, in which case I would need to modify the os-brick patch and also write a new one to add support for the "force" parameter in the "disconnect_volume" method.

Since there are some calls to Nova I would appreciate reviews from the Nova team to confirm that I didn't miss anything.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I can't reproduce the issue using FC with an HPE 3PAR array. Debugging it, I found that the compute node receives a signal after the LUN has been remapped (this didn't happen in my iSCSI tests):

 Feb 17 13:05:20 localhost.localdomain kernel: sd 3:0:0:0: Power-on or device reset occurred
 Feb 17 13:05:20 localhost.localdomain kernel: sd 3:0:1:0: Power-on or device reset occurred

This is detected as a "change" in the block device:

  Feb 17 13:05:20 localhost.localdomain systemd-udevd[158430]: 3:0:1:0: /usr/lib/udev/rules.d/60-block.rules:8 ATTR '/sys/devices/pci0000:00/0000:00:05.0/host3/rport-3:0-4/target3:0:1/3:0:1:0/block/sdb/uevent' writing 'change'

Which triggers the code that uses an SCSI command to get the volume's WWID and then updates sysfs to reflect it.

  Feb 17 13:05:20 localhost.localdomain systemd-udevd[158430]: sdb: /usr/lib/udev/rules.d/60-persistent-storage.rules:66 Importing properties from results of 'scsi_id --export --whitelisted -d /dev/sdb'

After that rule another one for multipath is triggered to tell multipathd that it needs to check a device:

  Feb 17 13:05:20 localhost.localdomain systemd-udevd[158430]: sdb: /usr/lib/udev/rules.d/62-multipath.rules:36 Importing properties from results of '/sbin/multipath -u sdb'

Multipathd detects that the WWID has changed (because sysfs has been updated):

  Feb 17 13:05:20 localhost.localdomain multipathd[7007]: sdb: path wwid changed from '360002ac00000000000000b740000741c' to '360002ac00000000000000b750000741c'

And then reconfigures the old multipath device mapper to remove this device:

  Feb 17 13:05:20 localhost.localdomain multipathd[7007]: 360002ac00000000000000b740000741c: reload [0 2097152 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 3 1 8:0 1 8:48 1 8:32 1]
  Feb 17 13:05:20 localhost.localdomain multipathd[7007]: check_removed_paths: sdb: freeing path in removed state
  Feb 17 13:05:20 localhost.localdomain multipathd[7007]: 8:16: path removed from map 360002ac00000000000000b740000741c

And finally the new device mapper is formed:

  Feb 17 13:05:21 localhost.localdomain multipathd[7007]: sda [8:0]: path added to devmap 360002ac00000000000000b750000741c

I don't know if this is standard FCP behavior or if this is storage array specific and other storage arrays may not behave like this. I'm trying to get access to a different FC array to confirm.

Revision history for this message
Sean McGinnis (sean-mcginnis) wrote :

> I can't reproduce the issue using FC with an HPE 3PAR array, debugging it I found that the compute node receives a signal after the LUN has been remapped

This makes sense. On fibre channel fabrics, any time a LUN is added or removed an RSCN (https://en.wikipedia.org/wiki/Registered_state_change_notification) is sent out. That should signal to the HBA that it needs to recheck what it has access to, where in this case it will realize that the device it used to have access to is no longer present and trigger the cleanup of the device.

So in this case we are somewhat protected by the storage protocol itself.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Thanks Sean.

Those RSCNs should be the equivalent of the iSCSI AEN messages, which usually trigger the automatic scan of LUNs on the initiator side.

Those aren't happening in OpenStack iSCSI because I added a feature to Open-iSCSI, which we use in os-brick, to disable them and only allow manual scans; that way we don't get leftover devices on the compute node when there's a race condition: a volume mapping to that compute node happening on the Cinder side right after a 'disconnect_volume' on that same compute node.
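
For context, the Open-iSCSI setting involved is, if I recall correctly, the per-node scan mode (a sketch of the equivalent iscsid.conf entry; os-brick applies it per session):

    # /etc/iscsi/iscsid.conf: only explicit, targeted scans discover new LUNs
    node.session.scan = manual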

I'll have to check why we haven't seen that situation in FC, because if it's detecting new LUNs and acting on them then we should also get leftover devices.

The only explanation I can think of is that maybe in FC the scan is not for all LUNs but only for the LUNs currently present on the host.

Simon from Pure is looking to see if he can give me access to a system to double-check that it also behaves like that.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I did some additional testing related to my latest comment, and the results are:

- LUN change notifications do not trigger a rescan in FCP, which is good because then we cannot have race conditions between detach and attach. That had been our understanding so far.

- The message that prevents the leak with FCP, by triggering the udev rule, is the "Power-on Reset" SCSI sense code sent from the array, so I still need to check whether this is common practice. Tomorrow I'll check it on one of Pure's arrays.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Bad news: as I feared, the "Power-on Reset" that was "saving" us in FCP is not standard, and Pure storage arrays using FCP do not send it.

This means that we are not safe for FC and need to fix these issues in that os-brick connector as well. :-(

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

FWIW, I reviewed melanie's patch in comment #41 and I'm ~+2 with it.

Yeah, it was a larger than expected patch given we need to modify the signature for all the volume drivers :)

Gorka, do you want me to review your patch too, even if you found some issues with FC backends?

Revision history for this message
Gorka Eguileor (gorka) wrote :

Sylvain, yes please, I would appreciate your review, as the Cinder patch is agnostic to the protocol.

The FC issue was relevant for the leak-prevention os-brick patch that I've been working on.

The attached os-brick patch adds support for the "force" parameter to "disconnect_volume" on the FC connector. This is necessary for the Nova patch to also cover the FC cases.

Revision history for this message
Gorka Eguileor (gorka) wrote :

This patch is the os-brick leak prevention code that tries to detect and prevent data leaks and corruption. It applies on top of the previous os-brick FC patch.

As I see it we have multiple situations that can lead to leak/corruption:

- The CVE that any normal user can exploit: Addressed by the Cinder patch.

- Unintended issue caused when deleting an instance if the detach fails: Addressed by the Nova and os-brick FC patches.

- Other scenarios: Such as when an instance is destroyed without access to the compute node, and then access to the node is restored and we work with it without manually cleaning things up. This is covered by the large os-brick patch.

I would say that the current 4 patches cover 99% of the problematic cases. We can cover another 0.5% of the cases if we add "recheck_wwid yes" to multipath.conf when using the latest os-brick patch, but that's something we can work on in the open in TripleO.

This last os-brick patch is a big one, which together with what it does makes it a bit risky to backport, so it may be wise not to backport it right away.

In other words, in my opinion we should just backport the cinder, nova, and FC os-brick patches.

Revision history for this message
Nick Tait (nickthetait) wrote :

It is not apparent to me who is waiting on what right now.

Gorka, could you help me better understand what is required for an attacker to exploit this? I made a rough guess at CVSS score: https://www.first.org/cvss/calculator/3.1#CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
* Could this be executed remotely?
* What is the level of complexity to exploit?
* Could an attacker exploit this multiple times and eventually gain control of all images within the OpenStack deployment?
* Attacker would need at least a basic user account right?

Fungi, what are your thoughts on security classification? Possibly A or B1? Is it too early to pick a disclosure date?

Revision history for this message
Jeremy Stanley (fungi) wrote :

We have attached patches at this point for cinder, nova and (2 for) os-brick. It's not yet clear that there's consensus from the reviewers on this bug that the proposed fixes are sufficient and appropriate for backporting (at least to officially maintained stable branches, so as far back as stable/xena right now). Assuming the chosen fixes are suitable for backport, class A seems like the closest fit based on hints in comments #35 and #38 that there is an easily-exploitable condition for a normal user of the environment (though as yet I haven't seen the details explained here). Of course, before I can attempt to summarize this set of risks into an appropriate impact description, we'll need more information on that.

Following our current 90-day maximum embargo policy we have at most 8 weeks to figure this out, but of course it would be better to have it over and done with at the soonest opportunity. Basically if we can get consensus on the patches and a clearer explanation for the exploit scenarios and possible mitigations, then I'll apply for a CVE assignment from MITRE with that information. In parallel, we'll need clean patches for all of the above fixes backported at least as far as stable/xena. Once we have all that, we'll pick a disclosure date roughly a week out and send advance copies of the description and patches to downstream stakeholders so they can begin preparing their own packages.

Note that an additional wrinkle is the looming OpenStack 2023.1 coordinated release, which means that stable/2023.1 branches have already been created and we'll need backports from master to those as well (though I expect they'll be identical to the master branch patches in most cases). We'll also need to make sure to list the OpenStack 2023.1 release versions as affected since I highly doubt we'll publish in time to make one of the final RCs.

Revision history for this message
Dan Smith (danms) wrote :

I think this is *network* not *local* right? A user can trigger this via the API. They have to be authenticated, so they can't just be some random person, but they can cause the system to give them access to *other* users' data. Doesn't that also mean the "scope" is "changed"? Meaning, my guess is that it should have this scoring:

https://www.first.org/cvss/calculator/3.1#CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H

Gorka, I haven't tested your patch myself, but you and I did discuss it earlier. Looking at it now, I'm wondering: how can cinder redirect to or check with nova for a regular volume detach? If nova is the one doing the volume detach (via cinder), how does cinder know not to just redirect back to nova (creating a loop)? Is there some state cascade that we rely on to know that the detach has gone through nova at some point?

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Nick,

> It is not apparent to me who is waiting on what right now.

I'm waiting on reviews, though Rajat suggested to me that I do a video session to explain the whole issue to facilitate reviews and assessment.

> * Could this be executed remotely?

Yes, a normal user with normal credentials can exploit it.

> * What is the level of complexity to exploit?

Trivial.

Basically create a VM, attach one of your volumes to it, ask Cinder to delete the attachment record for the volume, then wait for another volume from any user to be attached to the same host and read the data.

This only works for iSCSI drivers that share targets, and some FC drivers.

> * Could an attacker exploit this multiple times and eventually gain control of all images within the OpenStack deployment?

The attacker would have access to volumes as long as they are present on the host.
So if the owner of the volume detaches it, or the instance is migrated to another host, then access to the volume is lost.

> * Attacker would need at least a basic user account right?

Yes

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Jeremy,

There are multiple cases/scenarios captured in this bug:

- User exploitable scenario.
- Unintentional scenarios that can happen after destroying a VM with an
  attached volume fails to cleanly detach the volume.
- Other scenarios.

The summary of the user exploitable vulnerability would be something like:

A normal user can gain access to other users'/projects' volumes that are
connected to the same compute host where they are running an instance.

This issue doesn't affect every OpenStack deployment, for the exploit to
work there needs to be the right combination of nova configuration,
storage transport protocol, cinder driver approach to mapping volumes,
and storage array behavior.

I don't have access to all storage types supported by OpenStack, so I've
only looked into: iSCSI, FCP, NVMe-oF, and RBD.

It is my belief that this only affects SCSI-based transport protocols
(iSCSI and FCP) and only under the following conditions:

- For iSCSI the Cinder driver needs to be using what we call shared
  targets: the same iSCSI target and portal tuple is used to present
  multiple volumes on a compute host.

- For FCP it depends on the storage array:
  - Pure: Affected.
  - 3PAR: Unaffected, because it sends the "Power-on Reset" message that
    triggers a udev rule that tells multipathd to make appropriate
    changes.

The way to reproduce the issue is very straightforward, it's all about
telling Cinder to delete an attachment record from a volume attached to
a VM instead of doing the detachment the right way via Nova. Then when
the next volume from that same backend is attached to the host our VM
will have access to it.

I'll give the steps using a devstack deployment, but the same would
happen on a TripleO deployment.

The only prerequisite is that Cinder is configured to use one of the
storage array and driver combinations affected by this; it happens with
both single-path and multipath attachments.

Steps for the demo user to gain access to a volume owned by the admin
user:

  $ . openrc demo demo
  $ nova boot --flavor cirros256 --image cirros-0.5.2-x86_64-disk --nic none myvm
  $ cinder create --name demo 1
  $ openstack server add volume myvm demo

  # The next 2 lines are the exploit which delete the attachment record
  $ attach_id=`openstack --os-volume-api-version=3.33 volume attachment list -c ID -f value`
  $ cinder --os-volume-api-version=3.27 attachment-delete $attach_id

  $ . openrc admin admin
  $ nova boot --flavor cirros256 --image cirros-0.5.2-x86_64-disk --nic none admin_vm
  $ cinder create --name admin 1
  $ openstack server add volume admin_vm admin

  # Both VMs use the same volume, so the demo VM can read the admin volume
  $ sudo virsh domblklist instance-00000001
  $ sudo virsh domblklist instance-00000002

The patches that have been submitted are related to the different
scenarios/cases described before:

- User exploitable scenario ==> Cinder patch
- Unintentional scenarios that can happen after destroying a VM with an
  attached volume fails to cleanly detach the volume ==> Nova and small
  os-brick patch
- Other scenarios ==> Huge os-brick patch

The "recheck_wwid yes...

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Dan,

> how cinder can redirect or check with nova for a regular volume detach?

The code is using the "service_token" field from the context to detect whether the request is coming from an OpenStack service (nova or glance), and if that's the case it processes the request.

If it's not coming from a service it does a couple of checks to allow manual cleanup requests. So it allows user attachment-delete calls under the following circumstances:

- If the attachment record doesn't have an instance id.
- If the attachment record doesn't have connection information.
- If it has an instance, but the instance doesn't exist in Nova.
- If the attachment ID recorded on the Nova instance is different from the one being deleted.
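
In rough pseudo-Python, that decision flow is something like the following (the field names, the nova lookup helper, and the exception are illustrative, not the literal patch):

  def attachment_deletion_allowed(ctxt, attachment):
      # Requests carrying a valid service token (nova/glance) are
      # processed as normal service requests.
      if ctxt.service_token:
          return

      # Otherwise only allow manual cleanup of clearly stale records.
      if not attachment.instance_uuid:
          return
      if not attachment.connection_info:
          return
      instance = nova_api.get_server(ctxt, attachment.instance_uuid)  # hypothetical helper
      if instance is None:
          return  # the instance no longer exists in Nova
      if instance.attachment_id != attachment.id:
          return  # Nova's record points at a different attachment

      # The attachment is actively used by a Nova instance: reject the
      # user call and require the detach to go through Nova instead.
      raise Conflict("attachment in use; detach through Nova")  # hypothetical exception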

Revision history for this message
melanie witt (melwitt) wrote (last edit ):

I did a little testing of the cinder patch with a local devstack, looking for any way I could delete the cinder attachment without going through nova.

Unfortunately I found it appears I can bypass the redirect by sending the X-Service-Token header with my regular token. So it looks like we need to do a little more to validate whether it's nova calling. Not sure if we can maybe pull nova's user_id from keystone and then verify that as well or instead? Or maybe there is some other better way?

(later) Update: I dug around and found out why it's possible to easily fake a service token and it's because [keystone_authtoken] option "service_token_roles_required" defaults to False since Ocata [1] and remains so today:

"""
Upgrade Notes

Set the service_token_roles to a list of roles that services may have. The likely list is service or admin. Any service_token_roles may apply to accept the service token. Ensure service users have one of these roles so interservice communication continues to work correctly. When verified, set the service_token_roles_required flag to True to enforce this behaviour. This will become the default setting in future releases.
"""

By default any authenticated user can send their valid token as a "X-Service-Token" and keystone will accept it as a valid service token.
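
Concretely, the bypass amounts to a request along these lines (endpoint and IDs are placeholders):

  $ TOKEN=$(openstack token issue -f value -c id)
  $ curl -X DELETE "http://<cinder-api>/v3/$PROJECT_ID/attachments/$ATTACHMENT_ID" \
      -H "X-Auth-Token: $TOKEN" \
      -H "X-Service-Token: $TOKEN" \
      -H "OpenStack-API-Version: volume 3.27"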

If I however set in cinder.conf:

[keystone_authtoken]
service_token_roles_required = True

My repro attempt below will be rejected with:

{"error": {"code": 401, "title": "Unauthorized", "message": "The request you have made requires authentication."}}

So either way we need a different way to verify whether it is nova calling DELETE /attachments/{attachment_id}.

[1] https://docs.openstack.org/releasenotes/keystonemiddleware/ocata.html#new-features

Repro steps:

Show that user "demo" does not have any service roles:

$ source openrc admin admin
$ openstack user list -f json
[
  {
    "ID": "a34218d9c4774df18a713ee8718eded7",
    "Name": "demo"
  }
]
$ openstack role assignment list --user a34218d9c4774df18a713ee8718eded7 --name -f json
[
  {
    "Role": "member",
    "User": "demo@Default",
    "Group": "",
    "Project": "invisible_to_admin@Default",
    "Domain": "",
    "System": "",
    "Inherited": false
  },
  {
    "Role": "anotherrole",
    "User": "demo@Default",
    "Group": "",
    "Project": "demo@Default",
    "Domain": "",
    "System": "",
    "Inherited": false
  },
  {
    "Role": "creator",
    "User": "demo@Default",
    "Group": "",
    "Project": "demo@Default",
    "Domain": "",
    "System": "",
    "Inherited": false
  },
  {
    "Role": "member",
    "User": "demo@Default",
    "Group": "",
    "Project": "demo@Default",
    "Domain": "",
    "System": "",
    "Inherited": false
  }
]

Begin repro:

$ source openrc demo demo
$ openstack volume create --size 1 test2004555 -f json
{
  "attachments": [],
  "availability_zone": "nova",
  "bootable": "false",
  "consistencygroup_id": null,
  "created_at": "2023-03-16T23:22:16.147130",
  "description": null,
  "encrypted": false,
  "id": "d66c2b17-a1ac-4bc8-a543-c36a829a9b7b",
  "multiattach": false,
  "name": "test2004555",
  "properties": {},
  "replication_status": null,
  "size": 1,...

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Melanie,

Thank you very much for testing the Cinder code, finding the loophole, and providing such detailed instructions.

I incorrectly assumed that keystonemiddleware would not only check that the service token in the header is valid, but also that it actually carried a service role.

I have changed the code to check that the roles from the service token (if a valid one is provided) actually include a service role.

I'll check on Monday whether the new approach also works on older releases (in case we need a different approach for the backports) and also with Glance using Cinder as a backend (in case glance is not sending the service token).

Cheers.

Revision history for this message
Nick Tait (nickthetait) wrote :

Dan, OK I agree with you on network exploitable.

The CVSS user guide gives a relevant example of scope change. See item 1 of section 3.5 on https://www.first.org/cvss/v3.1/user-guide. So in this case attackers might gain access to another user's images, but they do not gain influence over more components of openstack (for example keystone or glance).

Given this I'm leaning towards a score of 8.8 https://www.first.org/cvss/calculator/3.1#CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

Revision history for this message
melanie witt (melwitt) wrote :

Hi Gorka,

I tried the new version of the cinder patch and it's working well from the nova point of view.

The new check for the service role prevents the X-Service-Token header bypass and there should not be any way to fake the roles because the roles on the RequestContext are extracted only from a validated token response from keystone which will return the roles internally associated with the token. (I tried sending my own X-Service-Roles header and it [correctly] did not work).

Other than that, upon review I noticed there are a few typos in the unit tests in the patch, for example "mock_action_get.assesrt_called_once_with". Because the mocks are MagicMocks this will be a mocked function call that doesn't check anything and will not fail.

I also got one unit test failure when I ran them (test__redirect_detach_to_nova_if_needed*) locally, but it's possible that's something unrelated in my environment.

Thank you for fixing up the patch so fast!

Revision history for this message
Nick Tait (nickthetait) wrote :

Thanks Gorka and Melanie for your development & testing efforts!

Quick question: would it be possible for an administrator to disable deletion via cinder? This might serve as a mitigation.

I took a crack at further condensing the vuln details below.

Impact: An openstack user could gain control of volumes from other users/projects. However, the scope of exposed images is limited to the compute host where the instance is running. Only SCSI based transport protocols are believed to be affected, but not all storage types have been tested.

Affected storage types: iSCSI and FCP
Unaffected storage types: NVMe-oF and RBD

Preconditions:
- For iSCSI the Cinder driver needs to be using "shared targets" where the same iSCSI target and portal tuple is used to present multiple volumes on a compute host.

- For FCP it depends on the storage array:
  - Pure: Affected.
  - 3PAR: Unaffected.

Attack scenario:
Use cinder to delete an attachment record from a volume which has already been attached to a VM

Revision history for this message
Gorka Eguileor (gorka) wrote :

Thanks Melanie for catching those.

I had forgotten to update the tests and there were also some mistakes in the unit tests due to the misspellings.

I have deleted the old cinder patch and attached an updated one fixing the unit test issues.

The code works as expected with Glance using Cinder as a backend as well.
Now I'll see if this approach works with older releases, since I don't know when services started sending the service token to each other.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I just realized that the cinder patch needs improvements, because the presence of a service token in the request (and by extension the service roles in the context) depends on the deployment options, and some deployments may not have "send_service_user_token" configured.

I'll give the patch another thought and add code for that scenario.
My initial idea is to check current actions on the instance to determine if the request is coming from the service or not, though I'm not familiar with all the nova actions that can trigger a cinder detach action.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Nick,

I've been looking at possible mitigations without code changes and there is a way with configuration changes and policy changes. Steps would be:

1- Configure cinder and nova to use the "service_user" and to send the token ("send_service_user_token") [1]
2- Get the service uuid for the cinder and nova service users
3- If using Cinder as a glance backend, get the uuid for the "cinder_store_user_name" from the glance configuration and ensure that the user has the service role.
4- Write the /etc/cinder/policy.yaml file

Assuming that the user names for each of the services match the service name we can get their uuid with:
  $ openstack user show nova -f value -c id
  $ openstack user show cinder -f value -c id
  $ openstack user show glance -f value -c id

The policy I would recommend writing is:
  "is_nova_service": "service_user_id:<nova_service_uuid> or user_id:<nova_service_uuid>"
  "is_cinder_service": "service_user_id:<cinder_service_uuid> or user_id:<cinder_service_uuid>"
  "is_glance_service": "service_user_id:<cinder_store_user_name_uuid> or user_id:<cinder_store_user_name_uuid>"
  "is_service": "rule:is_nova_service or rule:is_glance_service or rule:is_cinder_service"
  "volume:attachment_delete": "rule:admin_api or (rule:admin_or_owner and rule:is_service) or role:service"

A much smaller policy is possible, but I like the one above and it is the one that I have tested. This one probably works as well, assuming everything has been configured as mentioned above:
  "volume:attachment_delete": "rule:admin_api or (rule:admin_or_owner and (service_user_id:<nova_service_uuid> or service_user_id:<cinder_service_uuid> or role:service))"

These policies don't prevent:
- Admins shooting themselves in the foot
- Unintentional issues like the one originally reported in this case.

They should prevent the user-induced vulnerability.
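
A quick way to verify the policy took effect is to repeat the exploit step from the earlier demo as the regular user; the attachment-delete call should now be rejected by policy (an authorization error instead of a successful delete):

  $ . openrc demo demo
  $ cinder --os-volume-api-version=3.27 attachment-delete $attach_id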

Cheers,
Gorka.

[1]: https://docs.openstack.org/cinder/latest/configuration/block-storage/service-token.html

Revision history for this message
Gorka Eguileor (gorka) wrote :

Hi Nick,

I like your vulnerability details, though there are a couple of small comments I'd like to make:

- "user could gain control of volumes" ==> It's more like they can gain read/write access to the volumes, but not control, because they cannot delete the volumes, take snapshots, etc.

- "the scope of exposed images" ==> This may be misleading, because when I hear the word "images" in the context of OpenStack I think of Glance images, not Cinder volumes.

- I feel like we are singling out Pure as the only affected FCP driver just because that's the one I could get my hands on. Maybe we can rephrase it:
  - Drivers using FCP will be affected unless the array sends the "Power-on Reset" SCSI Sense code when mapping the volume. In our limited testing only a 3PAR array sent it, but this doesn't mean that all 3PARs will.

Cheers,
Gorka.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

I quite like Gorka's policy workaround using the service_user tokens. It would let operators just modify their configurations, without needing to upgrade to some z-release, so that the exploit is no longer possible.

I also looked at https://bugs.launchpad.net/nova/+bug/2004555/+attachment/5656303/+files/cinder-2004555.patch and I'm quite OK with it, but I have a concern: if we want to backport it, we could only go down to Xena, as microversion 2.89 only exists starting in that release.
https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#microversion-2-89

For this specific reason, we either need to change the fix to use older Nova APIs (though honestly, I don't really know which ones), or explain in the vulnerability details that operators need to use the policy workarounds if they're older than Xena.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

@Gorka: nice work finding the policy-based workaround!

The service_* properties have been exposed in oslo.context since 2.12.0 (Ocata) (commit 2eafb0eb6b0898), which, coincidentally, is when the Attachments API that allows the exploit was introduced.

oslo.policy has supported a yaml policy file since 1.10.0 (Newton) (commit 83d209e9ed1a1f7f70), so we'd only need to provide an example yaml file.

One thing we should mention is that for safety, the policy file should be explicitly mentioned in the configuration file for each service as the value of the [oslo_policy] policy_file option. That's because since Queens, if a policy_file isn't found, the policies defined in code are used, and until Wallaby or Xena, the default value for policy_file in most services was policy.json (which would mean that a policy.yaml file would be ignored in the default configuration). Likewise, in recent releases, a policy.json file is ignored in the default configuration, so it's safest to configure this explicitly.
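
Concretely, that means setting something like the following in each service's configuration file (the path is deployment-specific):

  [oslo_policy]
  policy_file = /etc/cinder/policy.yaml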

Revision history for this message
melanie witt (melwitt) wrote :

> I just realized that the cinder-patch needs improvements, because the presence of a service token in the request (and by extension the service roles in the context) depends on the deployment options, and some deployments may not have the "send_service_user_token" configured.

Hm. I wonder if we could instead only check whether the user requesting has the "service" role (if "service" in RequestContext.roles) or is a member of the "service" project? And leave the service_token part out of it.

Technically a deployment could give any project or role to their service users (and omit any) ... so I'm not sure whether it's reasonable to assume any of the project names or role names or user names.

I just can't think of another real way to verify the identity of the caller other than openstack credentials. There has to be a source of truth for verifying the identity of any caller.

> I'll give the patch another thought and add code for that scenario.
> My initial idea is to check current actions on the instance to determine if the request is coming from the service or not, though I'm not familiar with all the nova actions that can trigger a cinder detach action.

I'm not sure how nova actions could be a reliable way to know if nova called the detach API. There isn't a unique identifier sent to cinder that cinder could use to validate a request matches a server action. Each server action contains the request_id that performed it, but that wouldn't get sent to cinder unless it's sent as the global_request_id. Nova will send the request_id as the global_request_id only if there is not a global_request_id already in the RequestContext. So that wouldn't work if anyone sent a global_request_id when they called nova.

Other than that, you could only try to correlate the request based on server action timestamp unless I'm missing something.

Revision history for this message
Dan Smith (danms) wrote :

I definitely think that relying on server actions for something as important as this is a bad idea. We could easily change, break, or reorder code in that path without having any idea of the security implications...

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

following melwitt, comment #70

> Hm. I wonder if we could instead only check whether the user
> requesting has the "service" role (if "service" in
> RequestContext.roles) or is a member of the "service" project? And
> leave the service_token part out of it.

I'm afraid that if we try to figure out the source of the request
ourselves somehow, we'll be subject to some kind of request forgery
exploit. I think sticking with the service_token is the safest course
of action. The upside is that all the concerned services use the
keystone middleware that supports send_service_token, so other than
configuring each service correctly, there's no software upgrade or
anything involved.

> Technically a deployment could give any project or role to their
> service users (and omit any) ... so I'm not sure whether it's
> reasonable to assume any of the project names or role names or user
> names.

I agree with you completely here, and for the reasons you state, we
won't be able to provide a script to do this automatically. We'll
have to provide clear documentation of how to configure this correctly.
But the plus side is that while the send_user_token stuff may not be
configured at a site, there must be some kind of service user configured
for each service (at least I think so?), and we can refer to the config
options by name in explaining what to do to configure send_user_token
and make the policy file changes.

Revision history for this message
melanie witt (melwitt) wrote :

> I'm afraid that if we try to figure out the source of the request
> ourselves somehow, we'll be subject to some kind of request forgery
> exploit. I think sticking with the service_token is the safest course
> of action. The upside is that all the concerned services use the
> keystone middleware that supports send_service_token, so other than
> configuring each service correctly, there's no software upgrade or
> anything involved.

Yeah sorry, I was responding to the idea that we would have to: 1) accommodate the scenario where the deployer has *not* configured send_service_user_token and 2) accommodate it *without* requiring any config change by the deployer. If we can't require a config change, then I was saying maybe we could check for the "service" role (plain role).

I would much rather be able to use the service_token and require deployers to have send_service_user_token configured in order to be protected from this vulnerability. But it was not clear to me how much action we can require from deployers when they install the update containing the exploit mitigation. If we require:

  [service_user]
  send_service_user_token = True

what will we do if it's False (the default)? Make nova services exit if it's not set to True with an error logged to say it's now required? If we don't do anything, when nova calls the detach API it would create a loop, as Dan mentioned in an earlier comment.

> I agree with you completely here, and for the reasons you state, we
> won't be able to provide a script to do this automatically. We'll
> have to provide clear documentation of how to configure this correctly.
> But the plus side is that while the send_user_token stuff may not be
> configured at a site, the must be some kind of service user configured
> for each service (at least I think so?), and we can refer to the config
> options by name in explaining what to do to configure send_user_token
> and make the policy file changes.

I would expect that even if a deployment has left [service_user]send_service_user_token = False, they would have some form of service user set up in keystone. But I am not 100% sure whether it's possible to run openstack without any dedicated service users today.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Just a reminder, our embargo policy promises a maximum of 90 days from initial report of a suspected vulnerability, and per the preamble in the bug description, that's... "This embargo shall not extend past 2023-05-03 and will be made public by or on that date even if no fix is identified."

That's four weeks from yesterday, so ideally we'll have fixes and an advisory ready to provide advance copies to downstream stakeholders at least a full week prior to that, which basically gives us only three weeks to wrap up the debate over patches and prepare all relevant backports (at least as far back as stable/yoga since stable/xena will be transitioning to extended maintenance before then, but also backporting to stable/xena if possible would be nicer to our users).

Revision history for this message
melanie witt (melwitt) wrote :

Thanks Jeremy.

IMHO there's not a clearly great solution here that will work for every deployment configuration. So I think we'll have to choose the least bad option, unfortunately.

Dan and I chatted about this bug today and I will try to summarize what we talked about to try and move things forward. We don't have much time ...

Of the options we have:

1) Redirect all non-service user DELETE /attachments requests to Nova

Problems with it:

* Requires non-default deployment configuration [1]

a) There must be a 'service' role in keystone and it must be assigned to the Nova and Glance users

b) The Cinder service must be configured to enforce service token roles:
[keystone_authtoken]
service_token_roles_required = true
service_token_roles = service (this is the default)

c) The Nova service must be configured to send service tokens:
[service_user]
send_service_user_token = true
(plus username, password, project, etc.; see the config sketch at the end of this comment)

* Consequence of not having the non-default configuration:

There would be a forever loop between Nova and Cinder when Nova attempts any DELETE /attachments calls.

2) Reject all non-service user DELETE /attachments requests

Problems with it:

a-c) Same as option 1)

* Consequence of not having the non-default configuration:

All DELETE /attachments requests will be rejected by Cinder until the deployment is configured as required.

3) Do not accept DELETE /attachments requests on the public API endpoint

Problems with it:

a) Nova would need to be configured to call the private API endpoint for DELETE /attachments

* Consequence of not having the non-default configuration:

All DELETE /attachments requests will be rejected/ignored by Cinder until the deployment is configured as required.

4) Change default Cinder API policy to admin-only for DELETE /attachments

a) The Nova and Glance users must be configured as admin users

* Consequence of not having the non-default configuration:

All DELETE /attachments requests will be rejected/ignored by Cinder until the deployment is configured as required.

5) Other ideas?

Please feel free to correct me if I've got anything wrong here.

[1] https://docs.openstack.org/keystone/latest/admin/manage-services.html#service-users
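
For completeness, the configuration in option 1c) would look roughly like this in nova.conf (a sketch; credential values and the auth_url are deployment-specific):

  [service_user]
  send_service_user_token = true
  auth_url = http://<keystone-host>/identity
  auth_type = password
  username = nova
  password = <nova service user password>
  user_domain_name = Default
  project_name = service
  project_domain_name = Default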

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

As a security workaround, I'd recommend option #4 for operators wanting to be quickly safe until we find a better solution.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I could be wrong, but option #4 shouldn't work, because the requests from Nova come with the user's credentials, not with the nova or glance service users' credentials.

Revision history for this message
Gorka Eguileor (gorka) wrote :

The new Cinder patch changes our approach to reject the dangerous requests with a 409 error, and also protects the volume action REST API endpoint, which has 2 operations that could be used for the attack.

The commit message has more details.

Revision history for this message
melanie witt (melwitt) wrote :

> I could be wrong, but option #4 shouldn't work, because the requests from Nova come with the user credentials, not with the nova or glance users.

No, you are right, sorry. For some reason I had been thinking Nova called the attachment delete API with an elevated RequestContext but it doesn't.

So option #4 (if I've not made another mistake!) would have to be instead:

4) Change default Cinder API policy (in the code) to admin-only for DELETE /attachments and terminate_connection APIs and also change the Nova code to use elevated RequestContext when calling the terminate_connection and attachment_delete APIs.

I'm probably missing something but with this option a configuration change would not be needed. It would however obviously allow admins to delete attachments without going through Nova.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Forgot to update the release notes in my previous Cinder patch. Updated it now with upgrade and critical section notes.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Forgot to remove 2 methods that were no longer being used in the cinder patch.

Revision history for this message
Nick Tait (nickthetait) wrote :

Spoke with Dan Smith today and finally understood just how urgent this issue is. This revised my scoring to a 9.1 https://www.first.org/cvss/calculator/3.1#CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:L/A:L

Tentatively reserved CVE-2023-2088. Jeremy, if you still want to get a CVE directly from MITRE I'll reject mine, no big deal.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Reviewed the cinder patch (5fe7d14c097260). Code and tests look good. Just a few minor things:

api-ref/source/v3/attachments.inc
nit: s/cinder api/Block Storage API/

cinder/exception.py
nit: s/through nova/using the Compute API/

releasenote:
1. in 'critical': s/token services/service tokens/
2. in 'security': s/other/another/
3. in 'upgrade': s/service it's/service if it's/
4. in 'upgrade': the role, option, and section names should be in double-backticks (they're in single backticks, which will render as italics instead of monospace font)
actually, forget 3 & 4 and maybe rewrite the upgrade section slightly:

upgrade:
  - |
    Nova must be `configured to send service tokens
    <https://docs.openstack.org/cinder/latest/configuration/block-storage/service-token.html>`_
    **and** cinder must be configured to recognize at least one of the roles
    that the nova service user has been assigned in keystone. By default,
    cinder will recognize the ``service`` role, so if the nova service user
    is assigned a differently named role in your cloud, you must adjust your
    cinder configuration file (``service_token_roles`` configuration option
    in the ``keystone_authtoken`` section). If nova and cinder are not
    configured correctly in this regard, detaching volumes will no longer
    work (`Bug #2004555 <https://bugs.launchpad.net/cinder/+bug/2004555>`_).

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Another comment about the cinder patch: I looked through the tempest and cinder-tempest-plugin tests, and the only one I found that could be affected is test_attach_detach_volume_to_instance:
https://opendev.org/openstack/tempest/src/commit/3c7eebaaf35c9e8a3f00c76cd1741457bdec9fab/tempest/api/volume/test_volumes_actions.py#L39-L55

This test should now raise a 409 when detach is called. I'm not sure what the best way to handle this is. Possibly talk to the QA team and merge a test skip for bug #2004555 now, and then fix the test as soon as the cinder patch lands?

Revision history for this message
Dan Smith (danms) wrote :

I dunno that referencing an embargoed bug is really the best plan before disclosure. However, I suspect we could convince them to just do it without a strong justification if we explain (privately) what's going on.

However, I think that the race is really to disclosure and getting patches up and not necessarily a race to land them, right? If we had a patch ready to go to do the skip (or just fix the test), we could pre-arrange with them to get it +2+W on the same timeline as everything else. With proper Depends-On linkage, that should be okay right?

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Yeah, yeah, my point was that we need a skip test with some kind of acceptable notation. We can't just fix the test because the cinder patch can't pass tempest with the current test, and tempest with a fixed test will be broken until the cinder patch lands.

So we'll need (I can post these patches):
1. tempest skip patch: cinder patch goes green for tempest with depends-on this patch
2. tempest fix patch: should be green with depends-on(cinder patch)

We post 1, 2, and cinder patch simultaneously to show that everything works, and then the merge order will be 1, cinder patch, 2.

If that sounds OK, I'll attach the patches and then we can add a tempest core to this bug.

Revision history for this message
Dan Smith (danms) wrote :

Yeah, I think that's the best plan.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Latest changes to the cinder patch:

- Updated the exception message
- Rewrote the api-ref section for the delete attachment
- Added missing api-ref text for the terminate connection and the force detach actions
- Added a docstring to the `is_service` method
- Amended the commit message that had a second Change-Id
- Updated the release notes as per comment #85
- Added an issues section to the release notes

The only code change in this patch should be the error message returned with the 409 error.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Brian, as far as I know the mentioned test should not fail, because devstack configures Nova to send the service token.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Resolve conflict with latest master code

Revision history for this message
Dan Smith (danms) wrote :

Gorka, I think the test *will* fail because we're not actually using nova there. We're creating an attachment with a server_id directly and then trying to detach it as a user. It's basically testing the seam and scenario that we're changing here.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Quick update.

The test wouldn't have failed if it were creating the attachment directly in Cinder and then deleting the attachment, even with the instance uuid set, because Cinder would see that there is no nova instance, that the instance doesn't have the volume attached, or that it's not using the attachment record.

Unfortunately we've found, looking at that tempest test, that there is yet another way to detach volumes in Cinder: using the "os-detach" volume action. So I need to update the cinder patch to also protect that endpoint.

We have also determined that Glance could expose the wrong image contents because it's not passing `force=True` on the os-brick `disconnect_volume`. This should be a small patch, and Brian will be working on it.

Revision history for this message
Dan Smith (danms) wrote (last edit ):

Okay, but it *does* create a server in nova and uses that uuid for the attachment. So if cinder does check nova, it will find an instance. I haven't looked deeply at the cinder patch, but you're saying because nova doesn't think the instance is actually attached to the volume, it will allow the delete? If so, then cool.

Presumably we want to also have a tempest test added to ensure that if we create/attach through nova and try to delete the attachment as a user, we get the expected 409. Not critical before disclosure I suppose, but I think we probably want that eventually.

Revision history for this message
Gorka Eguileor (gorka) wrote :

+1 to adding tempest tests to confirm that dangerous calls are not allowed (failing by getting 409, 401, or 403 errors) depending on the configuration options.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Adding glance_store patch.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Updated Cinder patch that also covers the `detach` volume action.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Adding glance_store patch.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since the writeup for this is going to be extremely involved, it doesn't make much sense to draft and review it in bug comments. Let's use https://etherpad.opendev.org/p/LqauNAF4pXDKBChwh8fVAFYNTEiYxFLycA8RuvWEztYsumj4 to assemble our thoughts in preparation for future publication.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Reviewed the os-brick FC force disconnect support patch. Release note reads well and generates correctly. Code and tests LGTM.

The only thing I noticed was in os_brick/initiator/linuxscsi.py ... this could be a follow-up or you could just ignore this comment: in the multipath_del_map() method you added, if mpath isn't deleted until the last iteration of the loop, we won't log the debug message at line 774. That could be misleading when troubleshooting, because sometimes on success we get a log message and sometimes we don't.

Revision history for this message
melanie witt (melwitt) wrote :

Added release note, docs, and upgrade status check to the nova patch.

Not sure if the above should be a separate patch, I can split the patch if so.

Revision history for this message
Dan Smith (danms) wrote :

Melanie, the updated nova patch looks pretty good to me and thanks for adding the nova-status check. I was thinking we'd do that after, but it's definitely nice to have it right away. I agree the patch is pretty massive right now, and under normal circumstances I'd split it up of course. However, I imagine some would argue the backporting will be easier and faster as a monolith.

One other thing, I think we should add the service user stuff to the install guide sections as well. I think that's where people likely get started for the bare minimum required config, so I think they'd probably find it weird to have a required chunk of config tucked at the bottom of the admin guide (which is linked from the top under "maintenance"). What do you think?

Revision history for this message
Jeremy Stanley (fungi) wrote :

It's not just the backporting that's easier with fewer patches. Keep in mind that we're going to be distributing advance copies of these to downstream stakeholders (like cloud operators and the private linux-distros mailing list) as file attachments in E-mail, so the more patches those recipients need to juggle and sequence, the greater the risk something goes wrong for them.

Revision history for this message
melanie witt (melwitt) wrote :

Ack Dan and Jeremy, that was kind of my thinking too, that normally we would split it up but that keeping the number of patches to a minimum may be the right move for this.

Dan, initially I wasn't sure where required config should go in the docs so I just picked something. I agree the install guide would be better, so I'll move it there.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Additional tempest negative tests to verify that the detach, force detach, terminate connection, and attachment delete operations are protected.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Added doc changes and modified the patch so that tempest runs without changes (at least tox -eintegrated-storage)

Revision history for this message
Ghanshyam Mann (ghanshyammann) wrote :

The tempest test 'test_attach_detach_volume_to_instance' and similar tests in that file do not seem very valid, even though they pass. I tested the behavior when the attachment is done via cinder and nova does not know about it, even though the test creates and passes a valid/existing server id. In that case it is clear that the user cannot use that attachment (list, show, write to the volume, etc.) from the nova perspective.

These are very old tests and should have been written from the cinder standalone service perspective, where no server creation/passing is needed. Irrespective of this bug we should modify these tests to be closer to real user operations.

For any other failing test, we can check what operation the test verifies and, based on that, go the skip-test route. Below is the process for a Tempest test skip/modification needed to land a service-side bug fix:
- https://docs.openstack.org/tempest/latest/HACKING.html#bug-fix-on-core-project-needing-tempest-changes

Revision history for this message
melanie witt (melwitt) wrote :

Added service user token configuration instructions to the install guides.

Revision history for this message
melanie witt (melwitt) wrote :

nova-2004555.patch applies cleanly to Bobcat, Antelope, Zed, Yoga
nova-2004555-xena.patch applies cleanly to Xena, Wallaby

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

reviewed cinder patch cc20649efa7383f495

Primary issue is that (at least on my reading) the api-ref and the commit message/release note conflict over whether users are allowed to make the 3 action API calls. The api-ref says that they can, with success dependent on satisfying a safe delete (as for the Attachments API delete call), but the commit message/relnote say the action calls are service-only. The code looks like it's implementing what the api-ref says.

docs: changes are good, read well, render correctly in HTML, links all work (only discuss configuration, so not affected by the above)

api-ref: nit: os-detach, os-force_detach are missing the 409 in the response codes list (only mentioning it because you have it for the os-terminate_connection action)

cinder/volume/api.py
nit: line 2583 (reason=) s/atachment/attachment/

cinder/tests/unit/api/contrib/test_admin_actions.py
nit: line 1040: mock_deletion_allowed returns True, but the real function either raises or returns None; I think it would be better to return None from the mock

cinder/tests/unit/api/v3/test_attachments.py
nit: test_attachment_deletion_allowed_service_call() relies on the default keystone_auth.service_token_roles containing 'service' (which is fine), but since there was talk at some point of hard-coding a 'service' check, it would be good to have a test that uses a non-default value for the config opt

cinder/tests/unit/policies/test_volume_actions.py
in test_owner_can_attach_detach_volume, line 973: I think you deleted 'body = {"os-detach":{}}' by mistake, and now the test is checking 2 attach calls

release note: if you revise, the single backticks produce italics; you need double backticks for monospace font

The code in volume/api.py looks fine and the tests are thorough

Revision history for this message
Gorka Eguileor (gorka) wrote :

Ghanshyam, test "test_attach_detach_volume_to_instance" is now a valid test, because it's something we want to confirm will work, as it is contemplated as one of the cases where a user deleting an attachment record is acceptable.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Updated patch to support force disconnect on FC driver.
Changes:
- Always display the log message
- Easier to read (using the retry decorator)
- Exponential backoff between retries
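
A sketch of what that revised shape might look like (illustrative only; os-brick's actual retry helper, command runner, and exception types differ):

  import tenacity

  @tenacity.retry(wait=tenacity.wait_exponential(multiplier=0.5, max=8),
                  stop=tenacity.stop_after_attempt(5),
                  reraise=True)
  def multipath_del_map(mpath_name):
      _run('multipathd', 'del', 'map', mpath_name)  # hypothetical command runner
      if _map_exists(mpath_name):                   # hypothetical existence check
          raise RuntimeError('map %s still present' % mpath_name)
      LOG.debug('Deleted multipath device map %s', mpath_name)  # now always logged on success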

Revision history for this message
Gorka Eguileor (gorka) wrote :

Brian thanks for the review, I have updated the patch with your suggestions (comment #111).

I may have changed the phrasing in a later patch than the one you reviewed, but I believe that in the latest one the api-ref, commit message, comments in code, and release note all say the same thing; I just used different wording since the audiences are different.

For example the release note is very brief and it reads: "cinder now rejects user attachment delete requests for attachments that are being used by nova instances".
Being used by a nova instance means that the instance exists, that it has the volume attached, and that the volume attachment in the instance is using that particular attachment.

Good call on the missing 409 response codes in the api-ref, the unintentionally deleted line in the test, etc.

Revision history for this message
Rajat Dhasmana (whoami-rajat) wrote :

Hi Brian,

One comment regarding the glance_store patch: we also have another disconnect_volume call in the attachment_state_manager [1], so we will need to pass the force flag there as well.
This file is for handling multiattach volumes, where we only disconnect from os-brick if we are on the last attachment.

I also checked the backport patches by Gorka for Zed, Yoga and Xena and they do handle this case so we shouldn't require revised backports.

[1] https://github.com/openstack/glance_store/blob/6741951591ca7d6144f6089678df8cee4f0a7030/glance_store/common/attachment_state_manager.py#L233
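
In other words, that second call site would need to pass the flag through to os-brick as well, roughly like this (a sketch; the surrounding glance_store code is simplified):

  conn.connector.disconnect_volume(conn.connection_info, device,
                                   force=True, ignore_errors=True)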

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

I should note that the latest cinder-master patch (f6f4b77b21393539d0e45b6d4d8df31ee4f7f0ef) LGTM and passes all the usual tests locally (pep8, docs, releasenotes, api-ref; and unit, functional in py39 and py310).

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

@Rajat: some refactoring by an excellent software engineer (i.e., you) in 2023.1 restructured the code so that the multiattach manager is actually calling the method that now contains the force, so it only needs to be changed in that one place.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Verified that glance_store-2004555.patch applies cleanly to stable/2023.1

Revision history for this message
Gorka Eguileor (gorka) wrote :
Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

Revisions to osbrick-fc patch LGTM. Nice restructuring of multipath_del_map(). Reviewed 570df49db9de30 (master to zed).

Revision history for this message
Jeremy Stanley (fungi) wrote :

We're 5 weekdays away from our self-imposed publication deadline, so unless we've got the text and backported patches ready to distribute downstream tomorrow, we should probably push that out. Our vulnerability management policy[*] states, "Embargoes for privately-submitted reports of suspected vulnerabilities shall not last more than 90 days, except under unusual circumstances." While we didn't start making headway on this report as early as I would have preferred, the cross-project nature of the problem and broad impact does qualify as an "unusual circumstance" in my opinion so I'm proposing we extend the deadline by a week to Wednesday, May 10 in order to complete proper due diligence of review and testing of the proposed solutions. Are there any objections?

As for where we are now... It appears we have consensus and no new concerns raised on patches for the cinder, glance_store, nova, os-brick and tempest repositories, with backports as far as the stable/xena branch (even though we assume stable/yoga will be the oldest non-EM branch by the time we publish, since stable/xena was supposed to reach EM a week ago). For the document, we seem to have most of the details filled in but are still working to finalize the prose for accuracy and clarity. I think once everyone following is happy enough with what's there, we'll be ready to pick a publication date and distribute advance copies of the document and patches to our downstream stakeholders.

Revision history for this message
Jeremy Stanley (fungi) wrote :

(Sorry, I forgot to footnote the relevant policy URL.)

[*] https://security.openstack.org/repos-overseen.html#requirements

Revision history for this message
Jeremy Stanley (fungi) wrote :

And just a reminder to anyone who missed the link in comment #100, we're using https://etherpad.opendev.org/p/LqauNAF4pXDKBChwh8fVAFYNTEiYxFLycA8RuvWEztYsumj4 to brainstorm and refine messaging for the upcoming publication.

Revision history for this message
Nick Tait (nickthetait) wrote :

No complaints from me on delaying the disclosure date.

Revision history for this message
Nick Tait (nickthetait) wrote :

FYI, launchpad seems to display comment numbers out of order. I believe the content is correctly ordered and dated, but the numberings are wrong. At one point I saw 68, 69, 70, 16, 17, 18 ... 38, 39, 40, 71, 72, 73 but currently it shows me 13, 14, 41, 42 ... 93, 94, 16, 17 ... 39, 40, 95, 96

Revision history for this message
melanie witt (melwitt) wrote :

Added cherry-pick and conflicts lines to commit message.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

I think the text in https://etherpad.opendev.org/p/LqauNAF4pXDKBChwh8fVAFYNTEiYxFLycA8RuvWEztYsumj4 is ready to go. Everyone might want to take one last read through to make sure we haven't missed anything.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Seems like we have consensus on the draft text in the etherpad sufficient for me to assemble downstream advance notice and publication, and agreement on the approach in the supplied patches and backports. On the assumption that the testing being performed by involved parties has turned up no additional problems, I propose that we schedule publication for 15:00 UTC on Wednesday, May 10 with 5 business day advance notification to downstream stakeholders on Wednesday, May 3. Are there any objections?

description: updated
summary: - [ussuri] Wrong volume attachment - volumes overlapping when connected
- through iscsi on host
+ Unauthorized volume access through deleted volume attachments
+ (CVE-2023-2088)
Changed in ossa:
status: Incomplete → In Progress
importance: Undecided → High
assignee: nobody → Jeremy Stanley (fungi)
Revision history for this message
Dan Smith (danms) wrote : Re: Unauthorized volume access through deleted volume attachments (CVE-2023-2088)

No objection from me.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

I was barely able to look at this bug report given the long discussions, but I'm OK with the etherpad now, since it explains the problem, a workaround, and the fix.
Operators can then choose between upgrading their services or modifying their existing environments.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

The description of the short-term mitigation strategy via policy/config change is clear and has been tested, so I think we're ready to go.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Please consider the contents of the draft etherpad we were using effectively frozen as of now, since I'm working on incorporating it into the advance notification for downstream stakeholders in preparation for sending later today. If you notice any significant problems with the information or text, please raise it here in the bug report since we'll have to consider whether we treat further corrections as errata. Same goes for any adjustments to the patches currently attached to the bug report. Thanks!

Revision history for this message
Jeremy Stanley (fungi) wrote :

Just to confirm: the osbrick-leak-2004555-master.patch attachment in comment #123 is not mentioned in the writeup, so I'm assuming it was supplanted by the osbrick-fc-2004555-master_to_zed.patch from comment #130, but please let me know if that's a mistaken assumption on my part.

Revision history for this message
Alan Bishop (alan-bishop) wrote :

I wish I could offer more details and a definitive answer, but Gorka is on holiday this week, so I have to take a stab at answering this one. I believe the osbrick-fc-2004555-master_to_zed.patch from comment #130 may not supersede the osbrick-leak-2004555-master.patch attachment in comment #123. I think the former (the -fc patch) is an additional patch that adds "force detach" support to FC connections; that is, it only applies to FC connections. The osbrick-leak-2004555-master.patch is also part of the solution and is distinct from the FC-only patch. Hopefully Brian and/or Rajat can confirm or deny my response.

Revision history for this message
Brian Rosmaita (brian-rosmaita) wrote :

@Jeremy: the osbrick-leak patch addresses some possible corner cases, but it is too risky to backport as it may cause regressions. Our current thinking is that it should be worked on in public as a patch to master after this issue has been made public, so it can go through the normal review and CI process. (To answer your question, it wasn't supplanted by the osbrick-fc patch; it is a child of that patch.)

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks, Alan and Brian, for the clarification on the leak patch. I didn't attach it to the downstream notification, which seems to have been the right call. It makes sense to treat it as a master-branch-only hardening fix after publication.

As for the downstream notification, it took a little longer than I intended to massage it into the shape of our templated communications and map/copy the patches to the branch-specific name format we've standardized, but it was sent to our private embargo-notice ML and the private linux-distros ML a little before 01:00 UTC.

Revision history for this message
Jeremy Stanley (fungi) wrote :

A quick reminder: We're scheduled to make this information public at 15:00 UTC tomorrow (Wednesday, May 10). I'll be switching the bug report to Public Security a few minutes before that, so the devs involved can start pushing patches/backports into Gerrit at that time (I'll also comment on the bug letting everyone know to start). Once everything has been pushed to Gerrit, so that we know what the change URLs are for all of them, I can publish the advisory and accompanying security note to the security.openstack.org site and relevant mailing lists. Timely attention to getting those changes pushed is, therefore, greatly appreciated.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since we have a lot of patches to get pushed for this, I've gone ahead and opened the bug up about 30 minutes early. Please begin pushing the fixes/backports to Gerrit at your earliest opportunity so I can include the links for them in our advisory publication. Thanks!

description: updated
information type: Private Security → Public Security
Changed in ossn:
assignee: nobody → Jeremy Stanley (fungi)
importance: Undecided → High
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance_store (master)
Changed in glance-store:
status: New → In Progress
Changed in cinder:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/882835

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/cinder/+/882836

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/cinder/+/882837

Jeremy Stanley (fungi)
summary: - Unauthorized volume access through deleted volume attachments
- (CVE-2023-2088)
+ [OSSA-2023-003] Unauthorized volume access through deleted volume
+ attachments (CVE-2023-2088)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/cinder/+/882838

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/cinder/+/882839

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/882840

Changed in os-brick:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/882841

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/os-brick/+/882843

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/os-brick/+/882844

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/os-brick/+/882846

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/882847

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/os-brick/+/882848

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance_store (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/glance_store/+/882851

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/882852

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance_store (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/glance_store/+/882853

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance_store (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/glance_store/+/882854

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance_store (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/glance_store/+/882855

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/nova/+/882858

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/nova/+/882859

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/882860

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/882861

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/nova/+/882863

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/nova/+/882864

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/882867

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/882868

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/882869

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/882870

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ossa (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ossa/+/882879

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ossa (master)

Reviewed: https://review.opendev.org/c/openstack/ossa/+/882879
Committed: https://opendev.org/openstack/ossa/commit/d62fe374e42538e11abc9b34f5c38258e8279f40
Submitter: "Zuul (22348)"
Branch: master

commit d62fe374e42538e11abc9b34f5c38258e8279f40
Author: Jeremy Stanley <email address hidden>
Date: Wed May 10 14:39:22 2023 +0000

    Add OSSA-2023-003 (CVE-2023-2088)

    Change-Id: Iab9cca074c2928dbecbe512f813fe421a744c592
    Closes-Bug: #2004555

Changed in ossa:
status: In Progress → Fix Released
Jeremy Stanley (fungi)
Changed in ossn:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance_store (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882851
Committed: https://opendev.org/openstack/glance_store/commit/a7eed0263e436f841a3c277e051bdc6d6e07447d
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit a7eed0263e436f841a3c277e051bdc6d6e07447d
Author: Brian Rosmaita <email address hidden>
Date: Tue Apr 18 11:22:27 2023 -0400

    Add force to os-brick disconnect

    In order to be sure that devices are being removed from the host,
    we should be using the 'force' parameter with os-brick's
    disconnect_volume() method.

    Closes-bug: #2004555
    Change-Id: I63d09ad9ef465bc154c85a9ea125449c039d1b90
    (cherry picked from commit 1d8033e54e009bbc4408f6e16aec4f6c01687c91)
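
For illustration, a minimal sketch of what the 'force' flag means at an os-brick call site, assuming the public InitiatorConnector interface; the connection properties and device path are placeholder values, not taken from any real deployment.

    # Hedged sketch: the force/ignore_errors kwargs on disconnect_volume().
    from os_brick.initiator import connector

    conn = connector.InitiatorConnector.factory(
        'ISCSI', 'sudo', use_multipath=True)

    connection_properties = {            # illustrative values only
        'target_portal': '192.0.2.10:3260',
        'target_iqn': 'iqn.2004-04.example:volume-0001',
        'target_lun': 1,
    }
    device_info = {'path': '/dev/sdb'}   # illustrative

    # Without force, a flush failure can abort the disconnect and leave the
    # device attached to the host; with force=True os-brick still flushes
    # gracefully first, but removes the device even if flushing fails.
    conn.disconnect_volume(connection_properties, device_info,
                           force=True, ignore_errors=True)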

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance_store (master)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882834
Committed: https://opendev.org/openstack/glance_store/commit/1d8033e54e009bbc4408f6e16aec4f6c01687c91
Submitter: "Zuul (22348)"
Branch: master

commit 1d8033e54e009bbc4408f6e16aec4f6c01687c91
Author: Brian Rosmaita <email address hidden>
Date: Tue Apr 18 11:22:27 2023 -0400

    Add force to os-brick disconnect

    In order to be sure that devices are being removed from the host,
    we should be using the 'force' parameter with os-brick's
    disconnect_volume() method.

    Closes-bug: #2004555
    Change-Id: I63d09ad9ef465bc154c85a9ea125449c039d1b90

Changed in glance-store:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to glance_store (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/glance_store/+/882892

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (master)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/882840
Committed: https://opendev.org/openstack/os-brick/commit/570df49db9de3030e658619138588b836c007f8c
Submitter: "Zuul (22348)"
Branch: master

commit 570df49db9de3030e658619138588b836c007f8c
Author: Gorka Eguileor <email address hidden>
Date: Wed Mar 1 13:08:16 2023 +0100

    Support force disconnect for FC

    This patch adds support for the force and ignore_errors on the
    disconnect_volume of the FC connector like we have in the iSCSI
    connector.

    Related-Bug: #2004555
    Change-Id: Ia74ecfba03ba23de9d30eb33706245a7f85e1d66

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/882843
Committed: https://opendev.org/openstack/os-brick/commit/ffb76e10bca1a2b76dd48780e8b4402f02dc1775
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit ffb76e10bca1a2b76dd48780e8b4402f02dc1775
Author: Gorka Eguileor <email address hidden>
Date: Wed Mar 1 13:08:16 2023 +0100

    Support force disconnect for FC

    This patch adds support for the force and ignore_errors on the
    disconnect_volume of the FC connector like we have in the iSCSI
    connector.

    Related-Bug: #2004555
    Change-Id: Ia74ecfba03ba23de9d30eb33706245a7f85e1d66
    (cherry picked from commit 570df49db9de3030e658619138588b836c007f8c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance_store (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882853
Committed: https://opendev.org/openstack/glance_store/commit/e9d2509926445fd95c9bba9e1cacacb85a5e58af
Submitter: "Zuul (22348)"
Branch: stable/zed

commit e9d2509926445fd95c9bba9e1cacacb85a5e58af
Author: Brian Rosmaita <email address hidden>
Date: Tue Apr 18 11:22:27 2023 -0400

    Add force to os-brick disconnect

    In order to be sure that devices are being removed from the host,
    we should be using the 'force' parameter with os-brick's
    disconnect_volume() method.

    Closes-bug: #2004555
    Change-Id: I63d09ad9ef465bc154c85a9ea125449c039d1b90
    (cherry picked from commit 1d8033e54e009bbc4408f6e16aec4f6c01687c91)
    (cherry picked from commit a7eed0263e436f841a3c277e051bdc6d6e07447d)
    Conflicts:
            glance_store/_drivers/cinder/base.py
            glance_store/tests/unit/cinder/test_base.py

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance_store (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882854
Committed: https://opendev.org/openstack/glance_store/commit/28301829777d4b1d2d7bca59fda108158d2ad6ca
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 28301829777d4b1d2d7bca59fda108158d2ad6ca
Author: Brian Rosmaita <email address hidden>
Date: Tue Apr 18 11:22:27 2023 -0400

    Add force to os-brick disconnect

    In order to be sure that devices are being removed from the host,
    we should be using the 'force' parameter with os-brick's
    disconnect_volume() method.

    Closes-bug: #2004555
    Change-Id: I63d09ad9ef465bc154c85a9ea125449c039d1b90
    (cherry picked from commit 1d8033e54e009bbc4408f6e16aec4f6c01687c91)
    (cherry picked from commit a7eed0263e436f841a3c277e051bdc6d6e07447d)
    Conflicts:
            glance_store/_drivers/cinder/base.py
            glance_store/tests/unit/cinder/test_base.py
    (cherry picked from commit e9d2509926445fd95c9bba9e1cacacb85a5e58af)
    Conflicts:
            glance_store/tests/unit/test_cinder_base.py

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance_store (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882855
Committed: https://opendev.org/openstack/glance_store/commit/1f447bc184500e070cbfcada76b0ea51104919b1
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 1f447bc184500e070cbfcada76b0ea51104919b1
Author: Brian Rosmaita <email address hidden>
Date: Tue Apr 18 11:22:27 2023 -0400

    Add force to os-brick disconnect

    In order to be sure that devices are being removed from the host,
    we should be using the 'force' parameter with os-brick's
    disconnect_volume() method.

    Closes-bug: #2004555
    Change-Id: I63d09ad9ef465bc154c85a9ea125449c039d1b90
    (cherry picked from commit 1d8033e54e009bbc4408f6e16aec4f6c01687c91)
    (cherry picked from commit a7eed0263e436f841a3c277e051bdc6d6e07447d)
    Conflicts:
            glance_store/_drivers/cinder/base.py
            glance_store/tests/unit/cinder/test_base.py
    (cherry picked from commit e9d2509926445fd95c9bba9e1cacacb85a5e58af)
    Conflicts:
            glance_store/tests/unit/test_cinder_base.py
    (cherry picked from commit 28301829777d4b1d2d7bca59fda108158d2ad6ca)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/882846
Committed: https://opendev.org/openstack/os-brick/commit/111b3931a2db1d5be4ebe704bf26c34fa9408483
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 111b3931a2db1d5be4ebe704bf26c34fa9408483
Author: Gorka Eguileor <email address hidden>
Date: Wed Mar 1 13:08:16 2023 +0100

    Support force disconnect for FC

    This patch adds support for the force and ignore_errors on the
    disconnect_volume of the FC connector like we have in the iSCSI
    connector.

    Related-Bug: #2004555
    Change-Id: Ia74ecfba03ba23de9d30eb33706245a7f85e1d66
    (cherry picked from commit 570df49db9de3030e658619138588b836c007f8c)
    Conflicts:
            os_brick/initiator/connectors/fibre_channel.py

tags: added: in-stable-yoga
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882847
Committed: https://opendev.org/openstack/nova/commit/db455548a12beac1153ce04eca5e728d7b773901
Submitter: "Zuul (22348)"
Branch: master

commit db455548a12beac1153ce04eca5e728d7b773901
Author: melanie witt <email address hidden>
Date: Wed Feb 15 22:37:40 2023 +0000

    Use force=True for os-brick disconnect during delete

    The 'force' parameter of os-brick's disconnect_volume() method allows
    callers to ignore flushing errors and ensure that devices are being
    removed from the host.

    We should use force=True when we are going to delete an instance to
    avoid leaving leftover devices connected to the compute host which
    could then potentially be reused to map to volumes to an instance that
    should not have access to those volumes.

    We can use force=True even when disconnecting a volume that will not be
    deleted on termination because os-brick will always attempt to flush
    and disconnect gracefully before forcefully removing devices.

    Closes-Bug: #2004555

    Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882852
Committed: https://opendev.org/openstack/nova/commit/41c64b94b0af333845e998f6cc195e72ca5ab6bc
Submitter: "Zuul (22348)"
Branch: master

commit 41c64b94b0af333845e998f6cc195e72ca5ab6bc
Author: melanie witt <email address hidden>
Date: Tue May 9 03:11:25 2023 +0000

    Enable use of service user token with admin context

    When the [service_user] section is configured in nova.conf, nova will
    have the ability to send a service user token alongside the user's
    token. The service user token is sent when nova calls other services'
    REST APIs to authenticate as a service, and service calls can sometimes
    have elevated privileges.

    Currently, nova does not however have the ability to send a service user
    token with an admin context. This means that when nova makes REST API
    calls to other services with an anonymous admin RequestContext (such as
    in nova-manage or periodic tasks), it will not be authenticated as a
    service.

    This adds a keyword argument to service_auth.get_auth_plugin() to
    enable callers to provide a user_auth object instead of attempting to
    extract the user_auth from the RequestContext.

    The cinder and neutron client modules are also adjusted to make use of
    the new user_auth keyword argument so that nova calls made with
    anonymous admin request contexts can authenticate as a service when
    configured.

    Related-Bug: #2004555

    Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
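
For context, a minimal sketch of the mechanism this change builds on, assuming keystoneauth1's service-token wrapper (which underlies nova's service_auth module); the endpoint and credentials are illustrative placeholders.

    # Hedged sketch: sending a service token alongside a user token.
    from keystoneauth1 import service_token
    from keystoneauth1 import session
    from keystoneauth1.identity import v3

    # Credentials like those in nova.conf's [service_user] section
    # (values illustrative).
    service_auth = v3.Password(
        auth_url='http://keystone.example:5000/v3',
        username='nova', password='secret', project_name='service',
        user_domain_name='Default', project_domain_name='Default')

    # Ordinarily the user's auth plugin is extracted from the
    # RequestContext; the change described above lets callers with an
    # anonymous admin context supply a user_auth object directly instead.
    user_auth = service_auth

    # The wrapper sends the user token as X-Auth-Token and the service
    # token as X-Service-Token on each request.
    sess = session.Session(auth=service_token.ServiceTokenAuthWrapper(
        user_auth=user_auth, service_auth=service_auth))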

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/882844
Committed: https://opendev.org/openstack/os-brick/commit/e00d3ca753db6f60d58a5c0d4b6675b5bea8fc72
Submitter: "Zuul (22348)"
Branch: stable/zed

commit e00d3ca753db6f60d58a5c0d4b6675b5bea8fc72
Author: Gorka Eguileor <email address hidden>
Date: Wed Mar 1 13:08:16 2023 +0100

    Support force disconnect for FC

    This patch adds support for the force and ignore_errors on the
    disconnect_volume of the FC connector like we have in the iSCSI
    connector.

    Related-Bug: #2004555
    Change-Id: Ia74ecfba03ba23de9d30eb33706245a7f85e1d66
    (cherry picked from commit 570df49db9de3030e658619138588b836c007f8c)

tags: added: in-stable-zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to glance_store (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/glance_store/+/882907

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to glance_store (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/glance_store/+/882908

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882858
Committed: https://opendev.org/openstack/nova/commit/efb01985db88d6333897018174649b425feaa1b4
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit efb01985db88d6333897018174649b425feaa1b4
Author: melanie witt <email address hidden>
Date: Wed Feb 15 22:37:40 2023 +0000

    Use force=True for os-brick disconnect during delete

    The 'force' parameter of os-brick's disconnect_volume() method allows
    callers to ignore flushing errors and ensure that devices are being
    removed from the host.

    We should use force=True when we are going to delete an instance to
    avoid leaving leftover devices connected to the compute host which
    could then potentially be reused to map to volumes to an instance that
    should not have access to those volumes.

    We can use force=True even when disconnecting a volume that will not be
    deleted on termination because os-brick will always attempt to flush
    and disconnect gracefully before forcefully removing devices.

    Closes-Bug: #2004555

    Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
    (cherry picked from commit db455548a12beac1153ce04eca5e728d7b773901)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/882835
Committed: https://opendev.org/openstack/cinder/commit/6df1839bdf288107c600b3e53dff7593a6d4c161
Submitter: "Zuul (22348)"
Branch: master

commit 6df1839bdf288107c600b3e53dff7593a6d4c161
Author: Gorka Eguileor <email address hidden>
Date: Thu Feb 16 15:57:15 2023 +0100

    Reject unsafe delete attachment calls

    Due to how the Linux SCSI kernel driver works there are some storage
    systems, such as iSCSI with shared targets, where a normal user can
    access other projects' volume data connected to the same compute host
    using the attachments REST API.

    This affects both single and multi-pathed connections.

    To prevent users from doing this, unintentionally or maliciously,
    cinder-api will now reject some delete attachment requests that are
    deemed unsafe.

    Cinder will process the delete attachment request normally in the
    following cases:

    - The request comes from an OpenStack service that is sending the
      service token that has one of the roles in `service_token_roles`.
    - Attachment doesn't have an instance_uuid value
    - The instance for the attachment doesn't exist in Nova
    - According to Nova the volume is not connected to the instance
    - Nova is not using this attachment record

    There are 3 operations in the actions REST API endpoint that can be used
    for an attack:

    - `os-terminate_connection`: Terminate volume attachment
    - `os-detach`: Detach a volume
    - `os-force_detach`: Force detach a volume

    In this endpoint we just won't allow most requests not coming from a
    service. The rules we apply are the same as for attachment delete
    explained earlier, but in this case we may not have the attachment id
    and be more restrictive. This should not be a problem for normal
    operations because:

    - Cinder backup doesn't use the REST API but RPC calls via RabbitMQ
    - Glance doesn't use this interface anymore

    Checking whether it's a service or not is done at the cinder-api level
    by checking that the service user that made the call has at least one of
    the roles in the `service_token_roles` configuration. These roles are
    retrieved from keystone by the keystone middleware using the value of
    the "X-Service-Token" header.

    If Cinder is configured with `service_token_roles_required = true` and
    an attacker provides non-service valid credentials the service will
    return a 401 error, otherwise it'll return 409 as if a normal user had
    made the call without the service token.

    Closes-Bug: #2004555
    Change-Id: I612905a1bf4a1706cce913c0d8a6df7a240d599a
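
To make the acceptance rules above concrete, here is a hedged, pseudocode-style sketch of the decision; the helper names are hypothetical stand-ins rather than cinder's actual internals, and only the rule ordering mirrors the commit message.

    # Hypothetical helpers: context_has_service_roles(), nova_get_instance()
    # and nova_instance_uses_attachment() stand in for cinder's real code.
    def attachment_delete_allowed(context, attachment):
        # Trusted service call: the X-Service-Token carries at least one
        # of the roles listed in service_token_roles.
        if context_has_service_roles(context):
            return True
        # Attachment isn't bound to an instance.
        if not attachment.instance_uuid:
            return True
        # The instance no longer exists in Nova.
        instance = nova_get_instance(context, attachment.instance_uuid)
        if instance is None:
            return True
        # Nova reports the volume isn't connected to the instance, or Nova
        # isn't using this attachment record.
        if not nova_instance_uses_attachment(instance, attachment):
            return True
        # Anything else is deemed unsafe: 409 for a normal user, or 401
        # when service_token_roles_required = true and the token fails.
        return False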

Changed in cinder:
status: In Progress → Fix Released
Revision history for this message
Maksim Malchuk (mmalchuk) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/882893

Changed in kolla-ansible:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/882941

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla-ansible (master)

Change abandoned by "Sven Kieske <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/882941
Reason: duplicate of https://review.opendev.org/c/openstack/kolla-ansible/+/882893

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/882836
Committed: https://opendev.org/openstack/cinder/commit/dd6010a9f7bf8cbe0189992f0848515321781747
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit dd6010a9f7bf8cbe0189992f0848515321781747
Author: Gorka Eguileor <email address hidden>
Date: Thu Feb 16 15:57:15 2023 +0100

    Reject unsafe delete attachment calls

    Due to how the Linux SCSI kernel driver works there are some storage
    systems, such as iSCSI with shared targets, where a normal user can
    access other projects' volume data connected to the same compute host
    using the attachments REST API.

    This affects both single and multi-pathed connections.

    To prevent users from doing this, unintentionally or maliciously,
    cinder-api will now reject some delete attachment requests that are
    deemed unsafe.

    Cinder will process the delete attachment request normally in the
    following cases:

    - The request comes from an OpenStack service that is sending the
      service token that has one of the roles in `service_token_roles`.
    - Attachment doesn't have an instance_uuid value
    - The instance for the attachment doesn't exist in Nova
    - According to Nova the volume is not connected to the instance
    - Nova is not using this attachment record

    There are 3 operations in the actions REST API endpoint that can be used
    for an attack:

    - `os-terminate_connection`: Terminate volume attachment
    - `os-detach`: Detach a volume
    - `os-force_detach`: Force detach a volume

    In this endpoint we just won't allow most requests not coming from a
    service. The rules we apply are the same as for attachment delete
    explained earlier, but in this case we may not have the attachment id
    and be more restrictive. This should not be a problem for normal
    operations because:

    - Cinder backup doesn't use the REST API but RPC calls via RabbitMQ
    - Glance doesn't use this interface

    Checking whether it's a service or not is done at the cinder-api level
    by checking that the service user that made the call has at least one of
    the roles in the `service_token_roles` configuration. These roles are
    retrieved from keystone by the keystone middleware using the value of
    the "X-Service-Token" header.

    If Cinder is configured with `service_token_roles_required = true` and
    an attacker provides non-service valid credentials the service will
    return a 401 error, otherwise it'll return 409 as if a normal user had
    made the call without the service token.

    Closes-Bug: #2004555
    Change-Id: I612905a1bf4a1706cce913c0d8a6df7a240d599a
    (cherry picked from commit 6df1839bdf288107c600b3e53dff7593a6d4c161)
    Conflicts:
            cinder/exception.py

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882859
Committed: https://opendev.org/openstack/nova/commit/1f781423ee4224c0871ab4aafec191bb2f7ef0e4
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 1f781423ee4224c0871ab4aafec191bb2f7ef0e4
Author: melanie witt <email address hidden>
Date: Tue May 9 03:11:25 2023 +0000

    Enable use of service user token with admin context

    When the [service_user] section is configured in nova.conf, nova will
    have the ability to send a service user token alongside the user's
    token. The service user token is sent when nova calls other services'
    REST APIs to authenticate as a service, and service calls can sometimes
    have elevated privileges.

    Currently, nova does not however have the ability to send a service user
    token with an admin context. This means that when nova makes REST API
    calls to other services with an anonymous admin RequestContext (such as
    in nova-manage or periodic tasks), it will not be authenticated as a
    service.

    This adds a keyword argument to service_auth.get_auth_plugin() to
    enable callers to provide a user_auth object instead of attempting to
    extract the user_auth from the RequestContext.

    The cinder and neutron client modules are also adjusted to make use of
    the new user_auth keyword argument so that nova calls made with
    anonymous admin request contexts can authenticate as a service when
    configured.

    Related-Bug: #2004555

    Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
    (cherry picked from commit 41c64b94b0af333845e998f6cc195e72ca5ab6bc)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to glance_store (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/glance_store/+/882980

Revision history for this message
Jeremy Stanley (fungi) wrote :

I was contacted privately by an operator who read the advisory and was unable to reproduce the failure in their iSCSI-based deployment. They suspect that not relying on multipathd is protecting them from the vulnerability. Is anyone able to confirm whether this affects iSCSI environments without multipathd? If it doesn't, I'll look into issuing an errata update further clarifying the scope of the vulnerability.

Revision history for this message
Dan Smith (danms) wrote :

I think it does *not* depend on multipathd, but as noted in the text, it doesn't apply to *all* iSCSI deployments for various reasons.

Revision history for this message
Gorka Eguileor (gorka) wrote :

I just realized that among iSCSI-based systems, only those using "shared targets" are affected, as I mentioned in comment #57. We forgot to mention it in the final errata.

Regarding multipathing: with multipathing there are additional issues that can lead to leaks, and we expect all production environments to use multipathing.

It could also be that they haven't properly checked, or that their storage system issues the "Power-on or device reset" Unit Attention event that prevents the issue from happening, as I observed on an HPE 3PAR FC system.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/882837
Committed: https://opendev.org/openstack/cinder/commit/cb4682fb836912225c5da1536108a0d05fd5c46e
Submitter: "Zuul (22348)"
Branch: stable/zed

commit cb4682fb836912225c5da1536108a0d05fd5c46e
Author: Gorka Eguileor <email address hidden>
Date: Thu Feb 16 15:57:15 2023 +0100

    Reject unsafe delete attachment calls

    Due to how the Linux SCSI kernel driver works there are some storage
    systems, such as iSCSI with shared targets, where a normal user can
    access other projects' volume data connected to the same compute host
    using the attachments REST API.

    This affects both single and multi-pathed connections.

    To prevent users from doing this, unintentionally or maliciously,
    cinder-api will now reject some delete attachment requests that are
    deemed unsafe.

    Cinder will process the delete attachment request normally in the
    following cases:

    - The request comes from an OpenStack service that is sending the
      service token that has one of the roles in `service_token_roles`.
    - Attachment doesn't have an instance_uuid value
    - The instance for the attachment doesn't exist in Nova
    - According to Nova the volume is not connected to the instance
    - Nova is not using this attachment record

    There are 3 operations in the actions REST API endpoint that can be used
    for an attack:

    - `os-terminate_connection`: Terminate volume attachment
    - `os-detach`: Detach a volume
    - `os-force_detach`: Force detach a volume

    In this endpoint we just won't allow anything that is not coming from a
    service. This should not be a problem because:

    - Cinder backup doesn't use the REST API but RPC calls via RabbitMQ
    - Glance doesn't use this interface

    Checking whether it's a service or not is done at the cinder-api level
    by checking that the service user that made the call has at least one of
    the roles in the `service_token_roles` configuration. These roles are
    retrieved from keystone by the keystone middleware using the value of
    the "X-Service-Token" header.

    If Cinder is configured with `service_token_roles_required = true` and
    an attacker provides non-service valid credentials the service will
    return a 401 error, otherwise it'll return 409 as if a normal user had
    made the call without the service token.

    Closes-Bug: #2004555
    Change-Id: I612905a1bf4a1706cce913c0d8a6df7a240d599a
    (cherry picked from commit 6df1839bdf288107c600b3e53dff7593a6d4c161)
    Conflicts:
            cinder/exception.py
    (cherry picked from commit dd6010a9f7bf8cbe0189992f0848515321781747)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/882838
Committed: https://opendev.org/openstack/cinder/commit/a66f4afa22fc5a0a85d5224a6b63dd766fef47b1
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit a66f4afa22fc5a0a85d5224a6b63dd766fef47b1
Author: Gorka Eguileor <email address hidden>
Date: Thu Feb 16 15:57:15 2023 +0100

    Reject unsafe delete attachment calls

    Due to how the Linux SCSI kernel driver works there are some storage
    systems, such as iSCSI with shared targets, where a normal user can
    access other projects' volume data connected to the same compute host
    using the attachments REST API.

    This affects both single and multi-pathed connections.

    To prevent users from doing this, unintentionally or maliciously,
    cinder-api will now reject some delete attachment requests that are
    deemed unsafe.

    Cinder will process the delete attachment request normally in the
    following cases:

    - The request comes from an OpenStack service that is sending the
      service token that has one of the roles in `service_token_roles`.
    - Attachment doesn't have an instance_uuid value
    - The instance for the attachment doesn't exist in Nova
    - According to Nova the volume is not connected to the instance
    - Nova is not using this attachment record

    There are 3 operations in the actions REST API endpoint that can be used
    for an attack:

    - `os-terminate_connection`: Terminate volume attachment
    - `os-detach`: Detach a volume
    - `os-force_detach`: Force detach a volume

    In this endpoint we just won't allow most requests not coming from a
    service. The rules we apply are the same as for attachment delete
    explained earlier, but in this case we may not have the attachment id
    and be more restrictive. This should not be a problem for normal
    operations because:

    - Cinder backup doesn't use the REST API but RPC calls via RabbitMQ
    - Glance doesn't use this interface

    Checking whether it's a service or not is done at the cinder-api level
    by checking that the service user that made the call has at least one of
    the roles in the `service_token_roles` configuration. These roles are
    retrieved from keystone by the keystone middleware using the value of
    the "X-Service-Token" header.

    If Cinder is configured with `service_token_roles_required = true` and
    an attacker provides non-service valid credentials the service will
    return a 401 error, otherwise it'll return 409 as if a normal user had
    made the call without the service token.

    Closes-Bug: #2004555
    Change-Id: I612905a1bf4a1706cce913c0d8a6df7a240d599a
    (cherry picked from commit 6df1839bdf288107c600b3e53dff7593a6d4c161)
    Conflicts:
            cinder/exception.py
    (cherry picked from commit dd6010a9f7bf8cbe0189992f0848515321781747)
    (cherry picked from commit cb4682fb836912225c5da1536108a0d05fd5c46e)
    Conflicts:
            cinder/exception.py

Revision history for this message
Jeremy Stanley (fungi) wrote :

Can someone in Red Hat Security please switch the assigned CVE to published status? A downside to the VMT not getting CVE assignments directly through MITRE is that MITRE apparently also refuses to process our requests to switch them to public once we publish our advisories. It would be very nice for this to no longer be in "reserved" state, as we're now 48 hours past the original publication.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/glance_store 4.4.0

This issue was fixed in the openstack/glance_store 4.4.0 release.

Revision history for this message
Nick Tait (nickthetait) wrote :

I did submit the record to MITRE yesterday; it's waiting on them to be reviewed and posted.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks, Nick. I notified MITRE about the publication on Wednesday when we posted it (per our process, this normally works when we were the ones who originally requested the assignment from them), but they responded today telling me to talk to you, so I suppose it's in limbo for the time being.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/883017

Revision history for this message
Zakhar Kirpichenko (kzakhar) wrote :

The following packages were updated on Wallaby compute nodes:

python3-nova:amd64 (3:23.2.2-0ubuntu1~cloud1, 3:23.2.2-0ubuntu1~cloud2),
python3-os-brick:amd64 (4.3.3-0ubuntu1~cloud0, 4.3.3-0ubuntu1~cloud1),
nova-compute-libvirt:amd64 (3:23.2.2-0ubuntu1~cloud1, 3:23.2.2-0ubuntu1~cloud2),
nova-common:amd64 (3:23.2.2-0ubuntu1~cloud1, 3:23.2.2-0ubuntu1~cloud2),
os-brick-common:amd64 (4.3.3-0ubuntu1~cloud0, 4.3.3-0ubuntu1~cloud1),
nova-compute-kvm:amd64 (3:23.2.2-0ubuntu1~cloud1, 3:23.2.2-0ubuntu1~cloud2),
nova-compute:amd64 (3:23.2.2-0ubuntu1~cloud1, 3:23.2.2-0ubuntu1~cloud2)

nova-compute is now unable to detach volumes from instances:

2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server [req-470d3e0e-e59c-40c5-9597-6649c08add16 046191f8ebfd4695b3387a5ead3a9a55 85945271df8b4a6f9d37c37e4e52958d - default default] Exception during message handling: TypeError: disconnect_volume() got an unexpected keyword argument 'force'
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/nova/exception_wrapper.py", line 71, in wrapped
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server _emit_versioned_exception_notification(
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server self.force_reraise()
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server raise self.value
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/nova/exception_wrapper.py", line 63, in wrapped
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/nova/compute/utils.py", line 1434, in decorated_function
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2023-05-13 05:53:00.128 3219193 ERROR oslo_messaging.rpc.server...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882860
Committed: https://opendev.org/openstack/nova/commit/8b4b99149a35663fc11d7d163082747b1b210b4d
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 8b4b99149a35663fc11d7d163082747b1b210b4d
Author: melanie witt <email address hidden>
Date: Wed Feb 15 22:37:40 2023 +0000

    Use force=True for os-brick disconnect during delete

    The 'force' parameter of os-brick's disconnect_volume() method allows
    callers to ignore flushing errors and ensure that devices are being
    removed from the host.

    We should use force=True when we are going to delete an instance to
    avoid leaving leftover devices connected to the compute host which
    could then potentially be reused to map to volumes to an instance that
    should not have access to those volumes.

    We can use force=True even when disconnecting a volume that will not be
    deleted on termination because os-brick will always attempt to flush
    and disconnect gracefully before forcefully removing devices.

    Closes-Bug: #2004555

    Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
    (cherry picked from commit db455548a12beac1153ce04eca5e728d7b773901)
    (cherry picked from commit efb01985db88d6333897018174649b425feaa1b4)

Revision history for this message
Jeremy Stanley (fungi) wrote :

Zakhar: Make sure your packages include the nova patch for OSSA-2023-003 errata #1: https://review.opendev.org/q/I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882861
Committed: https://opendev.org/openstack/nova/commit/0d6dd6c67f56c9d4ed36246d14f119da6bca0a5a
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 0d6dd6c67f56c9d4ed36246d14f119da6bca0a5a
Author: melanie witt <email address hidden>
Date: Tue May 9 03:11:25 2023 +0000

    Enable use of service user token with admin context

    When the [service_user] section is configured in nova.conf, nova will
    have the ability to send a service user token alongside the user's
    token. The service user token is sent when nova calls other services'
    REST APIs to authenticate as a service, and service calls can sometimes
    have elevated privileges.

    Currently, nova does not however have the ability to send a service user
    token with an admin context. This means that when nova makes REST API
    calls to other services with an anonymous admin RequestContext (such as
    in nova-manage or periodic tasks), it will not be authenticated as a
    service.

    This adds a keyword argument to service_auth.get_auth_plugin() to
    enable callers to provide a user_auth object instead of attempting to
    extract the user_auth from the RequestContext.

    The cinder and neutron client modules are also adjusted to make use of
    the new user_auth keyword argument so that nova calls made with
    anonymous admin request contexts can authenticate as a service when
    configured.

    Related-Bug: #2004555

    Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
    (cherry picked from commit 41c64b94b0af333845e998f6cc195e72ca5ab6bc)
    (cherry picked from commit 1f781423ee4224c0871ab4aafec191bb2f7ef0e4)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/883110

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/883017
Committed: https://opendev.org/openstack/kolla-ansible/commit/a77ea13ef1991543df29b7eea14b1f91ef26f858
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit a77ea13ef1991543df29b7eea14b1f91ef26f858
Author: Sean Mooney <email address hidden>
Date: Wed May 10 20:58:47 2023 +0100

    always add service_user section to nova.conf

    As of I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8 nova
    now requires the service_user section to be configured
    to address CVE-2023-2088. This change adds
    the service user section to the nova.conf template in
    the nova and nova-cell roles.

    Related-Bug: #2004555
    Signed-off-by: Sven Kieske <email address hidden>
    Change-Id: I2189dafca070accfd8efcd4b8cc4221c6decdc9f

tags: added: in-stable-wallaby
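
For operators applying the same change by hand rather than through kolla-ansible, the section being templated looks roughly like the following; every value here is illustrative and must match the service credentials actually registered in keystone.

    [service_user]
    send_service_user_token = true
    auth_type = password
    auth_url = http://keystone.example:5000/v3
    username = nova
    password = secret
    user_domain_name = Default
    project_name = service
    project_domain_name = Default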
Revision history for this message
Dan Smith (danms) wrote :

Zakhar, which volume driver are you using?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/883110
Committed: https://opendev.org/openstack/kolla-ansible/commit/03c12abbcc107bfec451f4558bc97d14facae01c
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 03c12abbcc107bfec451f4558bc97d14facae01c
Author: Sean Mooney <email address hidden>
Date: Wed May 10 20:58:47 2023 +0100

    always add service_user section to nova.conf

    As of I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8 nova
    now requires the service_user section to be configured
    to address CVE-2023-2088. This change adds
    the service user section to the nova.conf template in
    the nova and nova-cell roles.

    Related-Bug: #2004555
    Signed-off-by: Sven Kieske <email address hidden>
    Change-Id: I2189dafca070accfd8efcd4b8cc4221c6decdc9f
    (cherry picked from commit a77ea13ef1991543df29b7eea14b1f91ef26f858)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/883113

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882863
Committed: https://opendev.org/openstack/nova/commit/4d8efa2d196f72fdde33136a0b50c4ee8da3c941
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 4d8efa2d196f72fdde33136a0b50c4ee8da3c941
Author: melanie witt <email address hidden>
Date: Wed Feb 15 22:37:40 2023 +0000

    Use force=True for os-brick disconnect during delete

    The 'force' parameter of os-brick's disconnect_volume() method allows
    callers to ignore flushing errors and ensure that devices are being
    removed from the host.

    We should use force=True when we are going to delete an instance to
    avoid leaving leftover devices connected to the compute host which
    could then potentially be reused to map to volumes to an instance that
    should not have access to those volumes.

    We can use force=True even when disconnecting a volume that will not be
    deleted on termination because os-brick will always attempt to flush
    and disconnect gracefully before forcefully removing devices.

    Closes-Bug: #2004555

    Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
    (cherry picked from commit db455548a12beac1153ce04eca5e728d7b773901)
    (cherry picked from commit efb01985db88d6333897018174649b425feaa1b4)
    (cherry picked from commit 8b4b99149a35663fc11d7d163082747b1b210b4d)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882864
Committed: https://opendev.org/openstack/nova/commit/98c3e3707c08a07f7ca5996086b165512f604ad6
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 98c3e3707c08a07f7ca5996086b165512f604ad6
Author: melanie witt <email address hidden>
Date: Tue May 9 03:11:25 2023 +0000

    Enable use of service user token with admin context

    When the [service_user] section is configured in nova.conf, nova will
    have the ability to send a service user token alongside the user's
    token. The service user token is sent when nova calls other services'
    REST APIs to authenticate as a service, and service calls can sometimes
    have elevated privileges.

    Currently, nova does not however have the ability to send a service user
    token with an admin context. This means that when nova makes REST API
    calls to other services with an anonymous admin RequestContext (such as
    in nova-manage or periodic tasks), it will not be authenticated as a
    service.

    This adds a keyword argument to service_auth.get_auth_plugin() to
    enable callers to provide a user_auth object instead of attempting to
    extract the user_auth from the RequestContext.

    The cinder and neutron client modules are also adjusted to make use of
    the new user_auth keyword argument so that nova calls made with
    anonymous admin request contexts can authenticate as a service when
    configured.

    Related-Bug: #2004555

    Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
    (cherry picked from commit 41c64b94b0af333845e998f6cc195e72ca5ab6bc)
    (cherry picked from commit 1f781423ee4224c0871ab4aafec191bb2f7ef0e4)
    (cherry picked from commit 0d6dd6c67f56c9d4ed36246d14f119da6bca0a5a)

Revision history for this message
melanie witt (melwitt) wrote :

> Looks like it doesn't know about the "force" keyword that's being passed.

Hi Zakhar,

I checked through and found one missing kwarg for the LibvirtNetVolumeDriver -- I assume that is the driver you are using.

I had incorrectly thought the Xena and Wallaby patches were identical, but there is a slight difference. Apologies for that.

I have updated the gerrit patch review with the change:

  https://review.opendev.org/c/openstack/nova/+/882869/2
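
For anyone comparing downstream packages against the traceback above, the shape of the fix is roughly as follows; the class and body are hedged stand-ins for nova's actual libvirt volume driver code, not the exact patch.

    # Hedged sketch of the signature change behind the TypeError.
    class LibvirtNetVolumeDriver:
        # Before the errata the method had no 'force' parameter, so the
        # compute manager's disconnect_volume(..., force=True) call raised
        # "TypeError: disconnect_volume() got an unexpected keyword
        # argument 'force'".
        #
        # After: accept the kwarg (and tolerate it even where forcing has
        # no meaning for the backend).
        def disconnect_volume(self, connection_info, instance, force=False):
            pass  # the real driver tears down the host block device here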

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/883113
Committed: https://opendev.org/openstack/kolla-ansible/commit/cb105dc293ff1cdb11ab63fa3e3bf39fd17e0ee0
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit cb105dc293ff1cdb11ab63fa3e3bf39fd17e0ee0
Author: Sean Mooney <email address hidden>
Date: Wed May 10 20:58:47 2023 +0100

    always add service_user section to nova.conf

    As of I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8 nova
    now requires the service_user section to be configured
    to address CVE-2023-2088. This change adds
    the service user section to the nova.conf template in
    the nova and nova-cell roles.

    Related-Bug: #2004555
    Signed-off-by: Sven Kieske <email address hidden>
    Change-Id: I2189dafca070accfd8efcd4b8cc4221c6decdc9f
    (cherry picked from commit a77ea13ef1991543df29b7eea14b1f91ef26f858)
    (cherry picked from commit 03c12abbcc107bfec451f4558bc97d14facae01c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/883114

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/883114
Committed: https://opendev.org/openstack/kolla-ansible/commit/efe6650d09441b02cf93738a94a59723d84c5b19
Submitter: "Zuul (22348)"
Branch: stable/zed

commit efe6650d09441b02cf93738a94a59723d84c5b19
Author: Sean Mooney <email address hidden>
Date: Wed May 10 20:58:47 2023 +0100

    always add service_user section to nova.conf

    As of I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8 nova
    now requires the service_user section to be configured
    to address CVE-2023-2088. This change adds
    the service user section to the nova.conf template in
    the nova and nova-cell roles.

    Related-Bug: #2004555
    Signed-off-by: Sven Kieske <email address hidden>
    Change-Id: I2189dafca070accfd8efcd4b8cc4221c6decdc9f
    (cherry picked from commit a77ea13ef1991543df29b7eea14b1f91ef26f858)
    (cherry picked from commit 03c12abbcc107bfec451f4558bc97d14facae01c)
    (cherry picked from commit cb105dc293ff1cdb11ab63fa3e3bf39fd17e0ee0)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/882893
Committed: https://opendev.org/openstack/kolla-ansible/commit/ddadaa282e72cc437470859766ac963ac757a26a
Submitter: "Zuul (22348)"
Branch: master

commit ddadaa282e72cc437470859766ac963ac757a26a
Author: Sean Mooney <email address hidden>
Date: Wed May 10 20:58:47 2023 +0100

    always add service_user section to nova.conf

    As of I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8 nova
    now requires the service_user section to be configured
    to address CVE-2023-2088. This change adds
    the service user section to the nova.conf template in
    the nova and nova-cell roles.

    Related-Bug: #2004555
    Signed-off-by: Sven Kieske <email address hidden>
    Change-Id: I2189dafca070accfd8efcd4b8cc4221c6decdc9f
    (cherry picked from commit a77ea13ef1991543df29b7eea14b1f91ef26f858)
    (cherry picked from commit 03c12abbcc107bfec451f4558bc97d14facae01c)
    (cherry picked from commit cb105dc293ff1cdb11ab63fa3e3bf39fd17e0ee0)
    (cherry picked from commit efe6650d09441b02cf93738a94a59723d84c5b19)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to glance_store (master)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882980
Committed: https://opendev.org/openstack/glance_store/commit/ce86bf38239e3962db880bc9bfbaa9f6364a2d14
Submitter: "Zuul (22348)"
Branch: master

commit ce86bf38239e3962db880bc9bfbaa9f6364a2d14
Author: Brian Rosmaita <email address hidden>
Date: Thu May 11 12:12:51 2023 -0400

    Update 'extras' for cinder driver

    Raise the min version of os-brick to include the fix for
    CVE-2023-2088.

    Change-Id: If3dba01d5cbb3a3deacdf23ab5290d7bcab4b5c7
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to glance_store (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882892
Committed: https://opendev.org/openstack/glance_store/commit/4f4de2348f38a623523c37c31c91fdcf18bbcbf6
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 4f4de2348f38a623523c37c31c91fdcf18bbcbf6
Author: Brian Rosmaita <email address hidden>
Date: Wed May 10 15:49:52 2023 -0400

    Update 'extras' for cinder driver

    Raise the min version of os-brick to include the fix for
    CVE-2023-2088.

    Change-Id: I4433df9414129ab2acec772791e05a17e3bf78ed
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to glance_store (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882907
Committed: https://opendev.org/openstack/glance_store/commit/02ab740fbf2a2fb12d2459b4e52e0200aa5e8f20
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 02ab740fbf2a2fb12d2459b4e52e0200aa5e8f20
Author: Brian Rosmaita <email address hidden>
Date: Wed May 10 20:13:57 2023 -0400

    Update 'extras' for cinder driver

    Raise the min version of os-brick to include the fix for
    CVE-2023-2088.

    Change-Id: I6c55fc943d26a8a0fdffc028b123ef4e6ff68cb2
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to glance_store (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/glance_store/+/882908
Committed: https://opendev.org/openstack/glance_store/commit/712eb6df3b79009b49c0cf075675d75f14281914
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 712eb6df3b79009b49c0cf075675d75f14281914
Author: Brian Rosmaita <email address hidden>
Date: Wed May 10 20:17:36 2023 -0400

    Update 'extras' for cinder driver

    Raise the min version of os-brick to include the fix for
    CVE-2023-2088.

    Change-Id: Ic8bc4d7ae7e38eca65be01184add7ae1ca377a22
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882867
Committed: https://opendev.org/openstack/nova/commit/b574901500d936488cdedf9fda90c4d36eeddd97
Submitter: "Zuul (22348)"
Branch: stable/xena

commit b574901500d936488cdedf9fda90c4d36eeddd97
Author: melanie witt <email address hidden>
Date: Wed Feb 15 22:37:40 2023 +0000

    Use force=True for os-brick disconnect during delete

    The 'force' parameter of os-brick's disconnect_volume() method allows
    callers to ignore flushing errors and ensure that devices are being
    removed from the host.

    We should use force=True when we are going to delete an instance to
    avoid leaving leftover devices connected to the compute host, which
    could then potentially be reused to map volumes to an instance that
    should not have access to those volumes.

    We can use force=True even when disconnecting a volume that will not be
    deleted on termination because os-brick will always attempt to flush
    and disconnect gracefully before forcefully removing devices.

    Conflicts:
        nova/tests/unit/virt/libvirt/volume/test_lightos.py
        nova/virt/libvirt/volume/lightos.py

    NOTE(melwitt): The conflicts are because change
    Ic314b26695d9681d31a18adcec0794c2ff41fe71 (Lightbits LightOS driver) is
    not in Xena.

    Closes-Bug: #2004555

    Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
    (cherry picked from commit db455548a12beac1153ce04eca5e728d7b773901)
    (cherry picked from commit efb01985db88d6333897018174649b425feaa1b4)
    (cherry picked from commit 8b4b99149a35663fc11d7d163082747b1b210b4d)
    (cherry picked from commit 4d8efa2d196f72fdde33136a0b50c4ee8da3c941)
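
The os-brick call involved looks roughly like the sketch below (illustrative, not nova's actual driver code; connection_properties and device_info are the dicts produced during the original connect/attach flow):

    from os_brick.initiator import connector

    conn = connector.InitiatorConnector.factory(
        'ISCSI', root_helper='sudo', use_multipath=True)

    def disconnect_for_delete(connection_properties, device_info):
        # force=True means a failed flush no longer aborts the
        # disconnect: the device is removed from the host regardless,
        # so it cannot linger and later be matched to a volume mapped
        # for a different instance. A graceful flush and disconnect
        # is still attempted first.
        conn.disconnect_volume(connection_properties, device_info,
                               force=True)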

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882868
Committed: https://opendev.org/openstack/nova/commit/6cc4e7fb9ac49606c598e72fcd3d6cf02efac4f1
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 6cc4e7fb9ac49606c598e72fcd3d6cf02efac4f1
Author: melanie witt <email address hidden>
Date: Tue May 9 03:11:25 2023 +0000

    Enable use of service user token with admin context

    When the [service_user] section is configured in nova.conf, nova will
    have the ability to send a service user token alongside the user's
    token. The service user token is sent when nova calls other services'
    REST APIs to authenticate as a service, and service calls can sometimes
    have elevated privileges.

    Currently, however, nova does not have the ability to send a service
    user token with an admin context. This means that when nova makes REST API
    calls to other services with an anonymous admin RequestContext (such as
    in nova-manage or periodic tasks), it will not be authenticated as a
    service.

    This adds a keyword argument to service_auth.get_auth_plugin() to
    enable callers to provide a user_auth object instead of attempting to
    extract the user_auth from the RequestContext.

    The cinder and neutron client modules are also adjusted to make use of
    the new user_auth keyword argument so that nova calls made with
    anonymous admin request contexts can authenticate as a service when
    configured.

    Related-Bug: #2004555

    Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
    (cherry picked from commit 41c64b94b0af333845e998f6cc195e72ca5ab6bc)
    (cherry picked from commit 1f781423ee4224c0871ab4aafec191bb2f7ef0e4)
    (cherry picked from commit 0d6dd6c67f56c9d4ed36246d14f119da6bca0a5a)
    (cherry picked from commit 98c3e3707c08a07f7ca5996086b165512f604ad6)

Revision history for this message
Zakhar Kirpichenko (kzakhar) wrote :

I apologize for the late response. My volumes are Ceph RBD; I'm not sure which driver Nova uses internally.

Thanks for your feedback and fixes, everyone!

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/glance_store 4.3.1

This issue was fixed in the openstack/glance_store 4.3.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/glance_store 3.0.1

This issue was fixed in the openstack/glance_store 3.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/glance_store 4.1.1

This issue was fixed in the openstack/glance_store 4.1.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 25.2.0

This issue was fixed in the openstack/nova 25.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 26.2.0

This issue was fixed in the openstack/nova 26.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 22.1.0

This issue was fixed in the openstack/cinder 22.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 27.1.0

This issue was fixed in the openstack/nova 27.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 20.3.0

This issue was fixed in the openstack/cinder 20.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 21.3.0

This issue was fixed in the openstack/cinder 21.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/883360

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882869
Committed: https://opendev.org/openstack/nova/commit/5b4cb92aa8adab2bd3d7905e0b76eceab680ab28
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 5b4cb92aa8adab2bd3d7905e0b76eceab680ab28
Author: melanie witt <email address hidden>
Date: Wed Feb 15 22:37:40 2023 +0000

    Use force=True for os-brick disconnect during delete

    The 'force' parameter of os-brick's disconnect_volume() method allows
    callers to ignore flushing errors and ensure that devices are being
    removed from the host.

    We should use force=True when we are going to delete an instance to
    avoid leaving leftover devices connected to the compute host, which
    could then potentially be reused to map volumes to an instance that
    should not have access to those volumes.

    We can use force=True even when disconnecting a volume that will not be
    deleted on termination because os-brick will always attempt to flush
    and disconnect gracefully before forcefully removing devices.

    Conflicts:
        nova/tests/unit/virt/libvirt/volume/test_lightos.py
        nova/virt/libvirt/volume/lightos.py

    NOTE(melwitt): The conflicts are because change
    Ic314b26695d9681d31a18adcec0794c2ff41fe71 (Lightbits LightOS driver) is
    not in Xena.

    NOTE(melwitt): The difference from the cherry picked change is because
    of the following additional affected volume driver in Wallaby:
        * nova/virt/libvirt/volume/net.py

    Closes-Bug: #2004555

    Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
    (cherry picked from commit db455548a12beac1153ce04eca5e728d7b773901)
    (cherry picked from commit efb01985db88d6333897018174649b425feaa1b4)
    (cherry picked from commit 8b4b99149a35663fc11d7d163082747b1b210b4d)
    (cherry picked from commit 4d8efa2d196f72fdde33136a0b50c4ee8da3c941)
    (cherry picked from commit b574901500d936488cdedf9fda90c4d36eeddd97)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to ossa (master)

Reviewed: https://review.opendev.org/c/openstack/ossa/+/883202
Committed: https://opendev.org/openstack/ossa/commit/136b24c5ddfaff6f4957af9bc9b84fa1b7deb6e3
Submitter: "Zuul (22348)"
Branch: master

commit 136b24c5ddfaff6f4957af9bc9b84fa1b7deb6e3
Author: Jeremy Stanley <email address hidden>
Date: Mon May 15 18:52:55 2023 +0000

    Add errata 3 for OSSA-2023-003

    Since this only impacts the fix for stable/wallaby, which is not
    under normal maintenance, we'll dispense with the usual errata
    announcements.

    Change-Id: Ibd0d1d796012fb5d34d48925ce34f6f1c300b54e
    Related-Bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/882870
Committed: https://opendev.org/openstack/nova/commit/48150a6fbab7e2a7b9fbeaa39110d0e6f7f37aaf
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 48150a6fbab7e2a7b9fbeaa39110d0e6f7f37aaf
Author: melanie witt <email address hidden>
Date: Tue May 9 03:11:25 2023 +0000

    Enable use of service user token with admin context

    When the [service_user] section is configured in nova.conf, nova will
    have the ability to send a service user token alongside the user's
    token. The service user token is sent when nova calls other services'
    REST APIs to authenticate as a service, and service calls can sometimes
    have elevated privileges.

    Currently, however, nova does not have the ability to send a service
    user token with an admin context. This means that when nova makes REST API
    calls to other services with an anonymous admin RequestContext (such as
    in nova-manage or periodic tasks), it will not be authenticated as a
    service.

    This adds a keyword argument to service_auth.get_auth_plugin() to
    enable callers to provide a user_auth object instead of attempting to
    extract the user_auth from the RequestContext.

    The cinder and neutron client modules are also adjusted to make use of
    the new user_auth keyword argument so that nova calls made with
    anonymous admin request contexts can authenticate as a service when
    configured.

    Related-Bug: #2004555

    Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
    (cherry picked from commit 41c64b94b0af333845e998f6cc195e72ca5ab6bc)
    (cherry picked from commit 1f781423ee4224c0871ab4aafec191bb2f7ef0e4)
    (cherry picked from commit 0d6dd6c67f56c9d4ed36246d14f119da6bca0a5a)
    (cherry picked from commit 98c3e3707c08a07f7ca5996086b165512f604ad6)
    (cherry picked from commit 6cc4e7fb9ac49606c598e72fcd3d6cf02efac4f1)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/883951

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/882839
Committed: https://opendev.org/openstack/cinder/commit/68fdc323369943f494541a3510e71290b091359f
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 68fdc323369943f494541a3510e71290b091359f
Author: Gorka Eguileor <email address hidden>
Date: Thu Feb 16 15:57:15 2023 +0100

    Reject unsafe delete attachment calls

    Due to how the Linux SCSI kernel driver works there are some storage
    systems, such as iSCSI with shared targets, where a normal user can
    access other projects' volume data connected to the same compute host
    using the attachments REST API.

    This affects both single and multi-pathed connections.

    To prevent users from doing this, unintentionally or maliciously,
    cinder-api will now reject some delete attachment requests that are
    deemed unsafe.

    Cinder will process the delete attachment request normally in the
    following cases:

    - The request comes from an OpenStack service that is sending the
      service token that has one of the roles in `service_token_roles`.
    - Attachment doesn't have an instance_uuid value
    - The instance for the attachment doesn't exist in Nova
    - According to Nova the volume is not connected to the instance
    - Nova is not using this attachment record

    There are 3 operations in the actions REST API endpoint that can be used
    for an attack:

    - `os-terminate_connection`: Terminate volume attachment
    - `os-detach`: Detach a volume
    - `os-force_detach`: Force detach a volume

    For these actions we simply reject most requests that do not come
    from a service. The rules we apply are the same as for attachment
    delete, explained earlier, but in this case we may not have the
    attachment id, so we must be more restrictive. This should not be a
    problem for normal operations because:

    - Cinder backup doesn't use the REST API but RPC calls via RabbitMQ
    - Glance doesn't use this interface

    Whether a call comes from a service is determined at the cinder-api
    level by checking that the service user that made the call has at
    least one of the roles in the `service_token_roles` configuration.
    These roles are retrieved from keystone by the keystone middleware
    using the value of the "X-Service-Token" header.

    If Cinder is configured with `service_token_roles_required = true` and
    an attacker provides non-service valid credentials the service will
    return a 401 error, otherwise it'll return 409 as if a normal user had
    made the call without the service token.

    Closes-Bug: #2004555
    Change-Id: I612905a1bf4a1706cce913c0d8a6df7a240d599a
    (cherry picked from commit 6df1839bdf288107c600b3e53dff7593a6d4c161)
    Conflicts:
            cinder/exception.py
    (cherry picked from commit dd6010a9f7bf8cbe0189992f0848515321781747)
    (cherry picked from commit cb4682fb836912225c5da1536108a0d05fd5c46e)
    Conflicts:
            cinder/exception.py
    (cherry picked from commit a66f4afa22fc5a0a85d5224a6b63dd766fef47b1)
    Conflicts:
            cinder/compute/nova.py
            cinder/tests/unit/attach...

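Restated as a Python-style sketch, the acceptance rules above look roughly like this (hypothetical helper names such as nova_api.get_server and attachment_in_use are for illustration; cinder's real implementation in the attachments API differs in structure):

    from oslo_config import cfg

    CONF = cfg.CONF

    def attachment_delete_allowed(context, attachment, nova_api):
        # Service-authenticated calls are always processed: the caller
        # sent a valid X-Service-Token whose user has at least one of
        # the roles in [keystone_authtoken] service_token_roles.
        # context.service_roles is populated by the keystone middleware
        # when it validates that header.
        service_roles = set(context.service_roles or [])
        if service_roles & set(
                CONF.keystone_authtoken.service_token_roles):
            return True
        # No instance is associated with the attachment.
        if not attachment.instance_uuid:
            return True
        server = nova_api.get_server(context, attachment.instance_uuid)
        # The instance no longer exists in Nova.
        if server is None:
            return True
        # Nova reports the volume as not connected to the instance, or
        # Nova is not using this particular attachment record.
        return not nova_api.attachment_in_use(server, attachment)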

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (master)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/883360
Committed: https://opendev.org/openstack/cinder/commit/1101402b8fda7423b41b2f2e078f8f5a1d2bb4bd
Submitter: "Zuul (22348)"
Branch: master

commit 1101402b8fda7423b41b2f2e078f8f5a1d2bb4bd
Author: Gorka Eguileor <email address hidden>
Date: Wed May 17 13:42:41 2023 +0200

    Doc: Improve service token

    This patch extends the documentation for the service token
    configuration, since there have been complaints about its clarity
    and completeness.

    Related-Bug: #2004555
    Change-Id: Id89497d068c1644e4615fc0fb85c4d1a139ecc19
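
For reference, the enforcement side of that configuration boils down to two keystonemiddleware options in cinder.conf (real option names; 'service' is also the default role list):

    [keystone_authtoken]
    service_token_roles = service
    service_token_roles_required = true

With service_token_roles_required left at its default of false, a service token that fails the role check is merely logged rather than rejected, which is one reason the documentation needed to spell this out.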

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/884571

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/882848
Committed: https://opendev.org/openstack/os-brick/commit/70493735d2f99523c4a23ecbeed15969b2e81f6b
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 70493735d2f99523c4a23ecbeed15969b2e81f6b
Author: Gorka Eguileor <email address hidden>
Date: Wed Mar 1 13:08:16 2023 +0100

    Support force disconnect for FC

    This patch adds support for the force and ignore_errors on the
    disconnect_volume of the FC connector like we have in the iSCSI
    connector.

    Related-Bug: #2004555
    Change-Id: Ia74ecfba03ba23de9d30eb33706245a7f85e1d66
    (cherry picked from commit 570df49db9de3030e658619138588b836c007f8c)
    Conflicts:
            os_brick/initiator/connectors/fibre_channel.py
    (cherry picked from commit 111b3931a2db1d5be4ebe704bf26c34fa9408483)
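
After this backport, FC disconnects can be forced the same way as iSCSI ones. A sketch of the usage (the two dict arguments come from the original connect flow):

    from os_brick.initiator import connector

    def force_disconnect_fc(connection_properties, device_info):
        fc = connector.InitiatorConnector.factory(
            'FIBRE_CHANNEL', root_helper='sudo', use_multipath=True)
        # force=True removes the devices even if flushing fails;
        # ignore_errors=True additionally suppresses the resulting
        # exceptions so cleanup can proceed.
        fc.disconnect_volume(connection_properties, device_info,
                             force=True, ignore_errors=True)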

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/cinder/+/885553

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/cinder/+/885554

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (stable/ussuri)

Related fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/cinder/+/885555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/cinder/+/885556

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/os-brick/+/885558

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/os-brick/+/885559

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/ussuri)

Related fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/os-brick/+/885560

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-brick (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/os-brick/+/885561

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/885558
Committed: https://opendev.org/openstack/os-brick/commit/5dcda6b961fa765c817f94a782a6fff48295c89a
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 5dcda6b961fa765c817f94a782a6fff48295c89a
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:29:20 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/wallaby, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the os-brick code.

    Change-Id: I6345a5a3a7c08c88233b47806c28284fa2dd87d3
    Related-bug: #2004555

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/885560
Committed: https://opendev.org/openstack/os-brick/commit/2845871c87fc4e6384bd16d81832cc71e2fb0d61
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 2845871c87fc4e6384bd16d81832cc71e2fb0d61
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:29:20 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/ussuri, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the os-brick code.

    Change-Id: Ie54cfc6697b4e54d37fd66dbad2ff20971399c00
    Related-bug: #2004555

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/885559
Committed: https://opendev.org/openstack/os-brick/commit/78a0ea24a586139343c98821f9914901f1b5ec5b
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 78a0ea24a586139343c98821f9914901f1b5ec5b
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:29:20 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/victoria, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the os-brick code.

    Change-Id: I37da3be26c7099307b46ae6b6320a3de7658e106
    Related-bug: #2004555

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-brick (stable/train)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/885561
Committed: https://opendev.org/openstack/os-brick/commit/0cc7019eec2b58f507905d52370a74eb80613b99
Submitter: "Zuul (22348)"
Branch: stable/train

commit 0cc7019eec2b58f507905d52370a74eb80613b99
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:29:20 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/train, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the os-brick code.

    Change-Id: I6d04c164521b72538665f53ab62250b14b2710fe
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (stable/train)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/885556
Committed: https://opendev.org/openstack/cinder/commit/299553a4fe281cde9b14da34a470dcdb3ed17cc0
Submitter: "Zuul (22348)"
Branch: stable/train

commit 299553a4fe281cde9b14da34a470dcdb3ed17cc0
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:01:12 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/train, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the cinder code.

    Change-Id: I1621e3d3d9272a7a25b2d9d9e6710efb6b637a89
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/885554
Committed: https://opendev.org/openstack/cinder/commit/63d7848a9548180d283a833beb7c5718e0ad0bdb
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 63d7848a9548180d283a833beb7c5718e0ad0bdb
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:01:12 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/victoria, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the cinder code.

    Change-Id: I2866b0ca1511a53b096b73bbe51a74588cdd8947
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/885555
Committed: https://opendev.org/openstack/cinder/commit/60f705d722fc6b7c434194a9f3b11595294d6aa0
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 60f705d722fc6b7c434194a9f3b11595294d6aa0
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:01:12 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/ussuri, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the cinder code.

    Change-Id: I5c55ab7ca6c85d23c5ab7d2d383a18226735aaf2
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to cinder (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/cinder/+/885553
Committed: https://opendev.org/openstack/cinder/commit/2fef6c41fa8c5ea772cde227a119dcf22ce7a07d
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 2fef6c41fa8c5ea772cde227a119dcf22ce7a07d
Author: Brian Rosmaita <email address hidden>
Date: Wed Jun 7 18:01:12 2023 -0400

    [stable-em-only] Add CVE-2023-2088 warning

    The Cinder project team does not intend to backport a fix for
    CVE-2023-2088 to stable/wallaby, so add a warning to the README
    so that consumers are aware of the vulnerability of this branch
    of the cinder code.

    Change-Id: I83b5232076250553650b8b97409cbf72e90c15b9
    Related-bug: #2004555

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 23.0.0.0rc1

This issue was fixed in the openstack/cinder 23.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 28.0.0.0rc1

This issue was fixed in the openstack/nova 28.0.0.0rc1 release candidate.
