oslo notifications sending sensitive tokens

Bug #2030976 reported by Scott Solkhon
264
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ironic
In Progress
Medium
Jay Faulkner
OpenStack Compute (nova)
Confirmed
Undecided
Unassigned
OpenStack Security Advisory
Confirmed
Medium
Jeremy Stanley
oslo.messaging
In Progress
Undecided
Jay Faulkner

Bug Description

Hi,

I have configured an OpenStack deployment to send Ironic service notifications using oslo_messaging_notifications[1] and noticed that Keystone tokens are being sent in the ['oslo.message']['_context_auth_token'] field of the message payload.

- I have confirmed that auth token is leaked using both a Kafka and RabbitMQ backed
- I have also confirmed that both messaging and messagingv2 options under oslo_messaging_notifications.driver are impacted[2]
- I am using the Victoria version of Openstack and I have not confirmed if this has been patched on newer versions

1) https://docs.openstack.org/ironic/latest/admin/notifications.html
2) https://docs.openstack.org/ironic/victoria/configuration/sample-config.html

Revision history for this message
Jeremy Stanley (fungi) wrote :

I've added a "won't fix" security advisory task for now as a formality, since reports of suspected vulnerabilities are not officially overseen by the OpenStack Vulnerability Management Team: https://security.openstack.org/repos-overseen.html

I'm still happy to provide guidance on behalf of the VMT on a best-effort basis.

Changed in ossa:
status: New → Won't Fix
Changed in ironic:
assignee: nobody → Jay Faulkner (jason-oldos)
Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

At first pass of Ironic's code, I'm not seeing where Ironic would extract the auth_token and embed it in the message payload. It looks like the code pattern means we generate payloads of just the node itself and the related state data through the decorator. That being said, still investigating. Is there any chance we could get more information or example payload so we can pin down the exact path which generates this so we can validate it is doing what we expect, or not expect as a sanity check?

Thanks!

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

A redacted payload was shared outside of this bug; it was not released for public consumption.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :
Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Okay, I think we've figured out what is going on.

It looks like we save _context on to the object in the ironic/objects/base.py in order to allow a free standing object and be able to transform/update/change and keep it handy.

It then also looks like we convert the entire object to a dict which captures the field in the notification as _context, which gets sent as the payload of the notification.

Changed in nova:
status: New → Confirmed
Changed in ironic:
status: New → Confirmed
Jeremy Stanley (fungi)
Changed in ossa:
status: Won't Fix → Incomplete
Changed in ironic:
importance: Undecided → Critical
Jeremy Stanley (fungi)
description: updated
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Confirmed in IRC via Dan Smith that Nova is impacted with a similar bug.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security
reviewers for the affected project or projects confirm the bug and
discuss the scope of any vulnerability along with potential
solutions.

Revision history for this message
Dan Smith (danms) wrote (last edit ):

Jeremy, nova is affected, but it seems less than critical to me since this is only visible to operators which can already see tokens if they want. Is there any guidance on what this should be classified as?

Revision history for this message
Jeremy Stanley (fungi) wrote :

Once there is a template for a working fix in Ironic and/or Nova, I recommend we switch this to public and get other project teams to audit their codebases for similar issues. Once we think the bulk of them are covered (at least any for which vulnerability reports are overseen by the VMT), we can publish a single advisory covering the lot.

Coordinating an effort like that privately across many projects gets very complicated, and increases the number of people who know about the problem to the point where it may as well be public anyway.

Given the nature of this defect is that Keystone tokens are being included in notifications, untrusted parties with an interest in exploiting their access to that information will likely find it regardless of whether we work on this bug privately or in public.

Revision history for this message
Dan Smith (danms) wrote :

Ack, sounds reasonable to me. I haven't started looking at the fix, because it seems like we're only sending that information when the notifications are going over MQ and not to other destinations like driver=log. I think I'll mark this as "high" severity for nova regardless.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Just to confirm, a possible short-term mitigation in deployments who can't patch services immediately would be to simply disable notifications in configuration, correct?

Also, obviously, operators will need to purge or redact any copies of old notifications in external systems once disabled or patched.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Dan: From what I've been told, some deployments configure notifications to serve as a sort of "event stream" logging to external databases for posterity and later activities such as incident root cause analysis. With that in mind, it seems like this could be seen as similar to leaking credentials in (non-debug) service logs.

I agree it's probably low impact for most deployments, and so another reason not to need to review fixes in private, but still warrants an advisory to let operators know they might need to clean up external copies of notifications after patching or audit for the possibility of unauthorized token use depending on how and to whom their operational privileges are delegated.

Revision history for this message
Dan Smith (danms) wrote :

Sure, but notifications contain much more information than users should see as it is, so if they're exposing them to untrusted people they already have a problem. There's no tenant isolation in the notification bus, for example. But yeah if it leads to persisting the whole notification somewhere else, it does increase the surface by which someone could potentially see those things. Disabling would work, as would using one of the other drivers, as far as I can tell.

Revision history for this message
Jeremy Stanley (fungi) wrote :

It sounds like this could also benefit from a larger post-mortem discussion about whether OpenStack wants to discourage treating notifications as a low-risk data stream, maybe with some bold disclaimers/admonitions in documentation and configuration comment blocks. If the community collectively comes to that conclusion, we could consider future such defects as security hardening opportunities rather than privileged information leaks.

Revision history for this message
Dan Smith (danms) wrote :
Download full text (7.0 KiB)

Here's an example I captured from the rabbit bus from a nova example operation showing the problem. The context is *not* part of the o.vo serialization routine in the payload but rather top-level in the message, part of the oslo.messaging marshalling.

{'_context_auth_token': 'gAAAAABk1RkXSSLrJGj3jVFU-9Woyy4cxxSIJt_4JCCjy2ieh0T_HuZE5oOUuVXSvh8WKvvnx9Qe-0AtsRLj5uJDomyqWTifc7bzHjyZHlNTX2GK0sOJSTQcnzrrg5EP_jKFtD51msoH7ZP07rUXHkMBMlJQU5FwRl0bp5qTWP51HJTrgKXxBGc',
 '_context_domain': None,
 '_context_global_request_id': None,
 '_context_is_admin': False,
 '_context_is_admin_project': True,
 '_context_project': 'a45c2f77b50b4ff08a31d26a94346bbb',
 '_context_project_domain': 'default',
 '_context_project_id': 'a45c2f77b50b4ff08a31d26a94346bbb',
 '_context_project_name': 'demo',
 '_context_quota_class': None,
 '_context_read_deleted': 'no',
 '_context_read_only': False,
 '_context_remote_address': '192.168.122.222',
 '_context_request_id': 'req-b1f5ad71-d5ee-4779-8425-888dc52be82b',
 '_context_resource_uuid': None,
 '_context_roles': ['anotherrole', 'member', 'reader'],
 '_context_service_catalog': [{'endpoints': [{'publicURL': 'https://192.168.122.222:9696/networking',
                                              'region': 'RegionOne'}],
                               'name': 'neutron',
                               'type': 'network'},
                              {'endpoints': [{'publicURL': 'https://192.168.122.222/placement',
                                              'region': 'RegionOne'}],
                               'name': 'placement',
                               'type': 'placement'},
                              {'endpoints': [{'publicURL': 'https://192.168.122.222/volume/v3/a45c2f77b50b4ff08a31d26a94346bbb',
                                              'region': 'RegionOne'}],
                               'name': 'cinderv3',
                               'type': 'volumev3'},
                              {'endpoints': [{'publicURL': 'https://192.168.122.222/volume/v3/a45c2f77b50b4ff08a31d26a94346bbb',
                                              'region': 'RegionOne'}],
                               'name': 'cinder',
                               'type': 'block-storage'},
                              {'endpoints': [{'publicURL': 'https://192.168.122.222/image',
                                              'region': 'RegionOne'}],
                               'name': 'glance',
                               'type': 'image'}],
 '_context_show_deleted': False,
 '_context_system_scope': None,
 '_context_timestamp': '2023-08-10T17:06:32.802080',
 '_context_user': '7be5208c3a2c4643a0c2c746acb2716c',
 '_context_user_domain': 'default',
 '_context_user_id': '7be5208c3a2c4643a0c2c746acb2716c',
 '_context_user_identity': '7be5208c3a2c4643a0c2c746acb2716c '
                           'a45c2f77b50b4ff08a31d26a94346bbb - default default',
 '_context_user_name': 'demo',
 '_unique_id': 'c73164b9d6264787a15736264860a91e',
 'event_type': 'instance.create.start',
 'message_id': 'ff466c49-cea1-48d5-9176-ddc7bed75e2f',
 'payload': {'nova_object.data': {'action_initiator_project': 'a45c2f77b50b4ff08a31d26a94346bbb',
                    ...

Read more...

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Julia, Dan and I agreed in a video call the embargo should be liftable. Anyone who has been leaking creds have been for a long time, and letting them know is more important than keeping quiet while we're fixing it.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks, I've switched this to public security now. Please make sure to refer to this bug number in any fixes you push to Gerrit so they'll get tracked in here, and we can start seeing what other projects may have this problem and need similar treatment.

I'll work on the CVE assignment request now.

description: updated
information type: Private Security → Public Security
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

This lays out, in comments, the fix needed.

Changed in oslo.messaging:
status: New → Confirmed
assignee: nobody → Jay Faulkner (jason-oldos)
Jeremy Stanley (fungi)
Changed in ossa:
status: Incomplete → Confirmed
importance: Undecided → Medium
assignee: nobody → Jeremy Stanley (fungi)
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

So to summarize the digging done by myself/Julia/Dan:
- oslo.messaging is serializing context into the notification objects explicitly in drivers; what oslo_messaging.notify.notifier.Notifier._notify calls serialize_context
- Depending on what driver, that does different things:
-- Logging driver does not end up serializing items from context.
-- AMQP/Kafka driver *does* end up serializing items from context, including security sensitive items such as auth_token

This means any project that supports notifications via oslo.messaging, with a populated context being passed in, will be impacted by this bug.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Please take a look at the proposed impact description below. After a round of corrections I'll request a CVE assignment based on this additional information. We can amend the information later to include other affected projects (and reporters) if any are identified.

title: Authentication tokens included in notification payloads

reporters:
  - name: Scott Solkhon
    affiliation: G-Research
    reported: 'CVE-TBD'
  - name: Dan Smith
    affiliation: Red Hat
    reported: 'CVE-TBD'

affected-products:
  - product: Ironic
    version: '<20.1.2, >=20.2.0 <21.1.1, >=21.2.0 <21.4.1'
  - product: Nova
    version: '<25.2.1, >=26.0.0 <26.2.1, >=27.0.0 <27.1.1'

description: >
  Scott Solkhon with G-Research and Dan Smith with Red Hat reported
  related vulnerabilities in Ironic and Nova. Some service
  notification payloads may unnecessarily embed raw copies of
  authentication tokens, revealing those credentials to systems
  administrators who are allowed access to copies of notifications,
  allowing them to impersonate the affected accounts. Only
  deployments with notifications enabled using the AMQP or Kafka
  drivers are affected.

Revision history for this message
Dan Smith (danms) wrote :

Jeremy, that's fine as it is. However, it might be good to use a different word than "payload" for clarity. The whole thing is a payload, but there's also a payload field, which is what nova (et al) asked to be sent as the notification. The distinction may be important because nova itself has no control over anything *but* the actual payload field, and the context is included in the top-level message by oslo.messaging.

I would just s/payloads/messages/ perhaps if you agree the disambiguation is worthwhile. Otherwise, looks fine to me.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks, I agree that's a worthwhile distinction. After an overdue lunch I can see a few other spots worth polishing in my hastily-scrawled prose. How's this...

title: Authentication tokens included in notification messages

reporters:
  - name: Scott Solkhon
    affiliation: G-Research
    reported: 'CVE-TBD'
  - name: Dan Smith
    affiliation: Red Hat
    reported: 'CVE-TBD'

affected-products:
  - product: Ironic
    version: '<20.1.2, >=20.2.0 <21.1.1, >=21.2.0 <21.4.1'
  - product: Nova
    version: '<25.2.1, >=26.0.0 <26.2.1, >=27.0.0 <27.1.1'

description: >
  Scott Solkhon with G-Research and Dan Smith with Red Hat reported
  related vulnerabilities in Ironic and Nova. Some service
  notifications may unnecessarily embed serialized authentication
  tokens, revealing those credentials to systems administrators who
  have access to copies of notifications and allowing them to
  impersonate the affected accounts. Only deployments with
  notifications enabled using the AMQP or Kafka drivers are
  affected.

Revision history for this message
Dan Smith (danms) wrote :

Yep, LGTM.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

I assume oslo.messaging isn't setup to notify lp; https://review.opendev.org/c/openstack/oslo.messaging/+/891096 has been proposed to resolve this in master.

Changed in oslo.messaging:
status: Confirmed → In Progress
Changed in ironic:
status: Confirmed → In Progress
Revision history for this message
Jeremy Stanley (fungi) wrote (last edit ):

The commit message footer was incorrectly formatted, which is why the bug did not get updated. See comment on the change for further details.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

draft notice LGTM

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891610

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to oslo.messaging (master)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/891096
Committed: https://opendev.org/openstack/oslo.messaging/commit/1b315615e7dc61dbf845bd663560fc8d5a18fa09
Submitter: "Zuul (22348)"
Branch: master

commit 1b315615e7dc61dbf845bd663560fc8d5a18fa09
Author: Jay Faulkner <email address hidden>
Date: Thu Aug 10 11:28:32 2023 -0700

    Only allow safe context fields in notifications

    Publishing a fully hydrated context object in a notification would give
    someone with access to that notification the ability to impersonate the
    original actor through inclusion of sensitive fields.

    Now, instead, we pare down the context object to the bare minimum before
    passing it for serialization in notification workflows.

    Related-bug: 2030976
    Change-Id: Ic94323658c89df1c1ff32f511ca23502317d0f00

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891742

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891743

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891744

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891745

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891746

Revision history for this message
Jeremy Stanley (fungi) wrote :

Before moving forward with communication around this issue, it would be good to get some clarification with regard to the proposed fixes.

The current impact description is with regard to defects in the codebases of Ironic and Nova, but I have yet to see patches like those originally attached get pushed into Gerrit. Work seems to be proceeding with backports of a change in oslo.messaging that say they are "related" to this bug report, but not indicating that they're fixing it (at least as far as the commit messages use a "Related-Bug" footer rather than "Closes-Bug"). If the fix, and so corresponding defect, has been determined to lie within oslo.messaging instead then we need to reframe the impact description before requesting a CVE with it.

Can someone update the bug with the current plan and thinking around how this is proceeding?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/892372

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

We failed to properly update this bug with context related to the fix.

What we found is that oslo.messaging's notification support only filters sensitive information (almost the entire request context is filtered, in fact) when notifications are logged. When those same notifications are sent to a message bug (amqp or kafka), they contain secrets including an auth token.

The first-stage fix, https://review.opendev.org/c/openstack/oslo.messaging/+/891096/11/oslo_messaging/notify/notifier.py#174, filters the context for secrets before passing the context onto the messaging bus. This should be backported as far back in oslo.messaging as possible. (this is where we are now; this patch is landed in master and pending review as far back as victoria).

Once that fix is in, we need to update requirements or the services in question to ensure they will install the newer oslo.messaging. I am unsure of the correct mechanism for this change.

[at this point the bug is resolved]

Once that is done, we will move on to stage 2, which is mostly cleaning up tech debt caused by the coarse, backportable approach taken by the initial patch. This involves updating, and releasing, oslo.context to include a method to get a sanitized copy of the context. The follow-on oslo.messaging fix will use that context method instead of filtering manually.

The only changes that will be needed in Ironic/Nova or other impacted OpenStack services will be something to ensure that only fixed versions of oslo.messaging will be installed with them.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks, taking the revised view of this vulnerability into account, here's a rewritten impact statement for use in the CVE request and subsequent security advisory publication (note that the fix to master was included in 14.4.0, its stable branch backports have not merged yet as of the time of writing):

title: Authentication tokens included in notification messages

reporters:
  - name: Scott Solkhon
    affiliation: G-Research
    reported: 'CVE-TBD'

affected-products:
  - product: oslo.messaging
    version: '<12.13.2, >=12.14.0 <14.0.2, >=14.1.0 <14.2.2, >=14.3.0 <14.4.0'

description: >
  Scott Solkhon with G-Research reported a vulnerability in
  oslo.messaging's notifier. Some service notifications may include
  context with embedded authentication tokens, which become
  serialized within the message revealing those credentials to
  systems administrators who have access to copies of notifications,
  potentially allowing them to impersonate the affected accounts.
  Only deployments with notifications enabled using the AMQP or
  Kafka drivers are affected.

Changed in ironic:
importance: Critical → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/891745
Committed: https://opendev.org/openstack/oslo.messaging/commit/44d112eb9d822e7432995bacdc494a0685798dd5
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 44d112eb9d822e7432995bacdc494a0685798dd5
Author: Jay Faulkner <email address hidden>
Date: Thu Aug 10 11:28:32 2023 -0700

    Only allow safe context fields in notifications

    Publishing a fully hydrated context object in a notification would give
    someone with access to that notification the ability to impersonate the
    original actor through inclusion of sensitive fields.

    Now, instead, we pare down the context object to the bare minimum before
    passing it for serialization in notification workflows.

    Closes-bug: 2030976
    Change-Id: Ic94323658c89df1c1ff32f511ca23502317d0f00
    (cherry picked from commit 1b315615e7dc61dbf845bd663560fc8d5a18fa09)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/891742
Committed: https://opendev.org/openstack/oslo.messaging/commit/6aa3c6fd389289507f40d22a83253100ddbb169d
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 6aa3c6fd389289507f40d22a83253100ddbb169d
Author: Jay Faulkner <email address hidden>
Date: Thu Aug 10 11:28:32 2023 -0700

    Only allow safe context fields in notifications

    Publishing a fully hydrated context object in a notification would give
    someone with access to that notification the ability to impersonate the
    original actor through inclusion of sensitive fields.

    Now, instead, we pare down the context object to the bare minimum before
    passing it for serialization in notification workflows.

    Closes-bug: 2030976
    Change-Id: Ic94323658c89df1c1ff32f511ca23502317d0f00
    (cherry picked from commit 1b315615e7dc61dbf845bd663560fc8d5a18fa09)

tags: added: in-stable-zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/891743
Committed: https://opendev.org/openstack/oslo.messaging/commit/bf0710d60fda74ec9d5afc31266edcee717c00eb
Submitter: "Zuul (22348)"
Branch: stable/zed

commit bf0710d60fda74ec9d5afc31266edcee717c00eb
Author: Jay Faulkner <email address hidden>
Date: Thu Aug 10 11:28:32 2023 -0700

    Only allow safe context fields in notifications

    Publishing a fully hydrated context object in a notification would give
    someone with access to that notification the ability to impersonate the
    original actor through inclusion of sensitive fields.

    Now, instead, we pare down the context object to the bare minimum before
    passing it for serialization in notification workflows.

    Closes-bug: 2030976
    Change-Id: Ic94323658c89df1c1ff32f511ca23502317d0f00
    (cherry picked from commit 1b315615e7dc61dbf845bd663560fc8d5a18fa09)

tags: added: in-stable-yoga
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/891744
Committed: https://opendev.org/openstack/oslo.messaging/commit/8ee7638a860c48785d94bfda8d336b6239da5013
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 8ee7638a860c48785d94bfda8d336b6239da5013
Author: Jay Faulkner <email address hidden>
Date: Thu Aug 10 11:28:32 2023 -0700

    Only allow safe context fields in notifications

    Publishing a fully hydrated context object in a notification would give
    someone with access to that notification the ability to impersonate the
    original actor through inclusion of sensitive fields.

    Now, instead, we pare down the context object to the bare minimum before
    passing it for serialization in notification workflows.

    Closes-bug: 2030976
    Change-Id: Ic94323658c89df1c1ff32f511ca23502317d0f00
    (cherry picked from commit 1b315615e7dc61dbf845bd663560fc8d5a18fa09)

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :
Download full text (3.3 KiB)

this patch breaks cinder and potentially others when notifications are enabled.

https://codesearch.openstack.org/?q=def+_set_read_deleted&i=nope&literal=nope&files=context&excludeFiles=&repos=

the Context subclass in those projects has a property "read_deleted" that can not be None, only 'yes', 'no', or 'only'. But the safe_context is swallowing as it is not in 'safe list'. And fromDict in Cinder/Manila does not have default for it
https://opendev.org/openstack/cinder/src/commit/ddcf394ae277692d7cab2e03b4b6ae67ba7f1cc3/cinder/context.py#L193

so when it tries to create the class from dict it fails because it is passed None now.

2023-08-30T11:40:37.864665396Z 2023-08-30 11:40:37.864 1 ERROR cinder.scheduler.filter_scheduler [req-d940a4a6-f6fd-43f9-9ec6-f512168402b0 c298a2f40ba045de9bd4c675d3792920 dbe66253674c4173903b1c6eba4f9dd3 - - -] Error scheduling bde485dc-75ea-4c0a-bda9-e784676b4cdc from last vol-service: cinder-ceph-cluster@volumes-hdd#volumes-hdd : ['Traceback (most recent call last):\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task\n result = task.execute(**arguments)\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/cinder/volume/flows/manager/create_volume.py", line 371, in execute\n volume_utils.notify_about_volume_usage(context, volume,\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/cinder/utils.py", line 960, in wrapped\n return f(*args, **kwargs)\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/cinder/volume/volume_utils.py", line 177, in notify_about_volume_usage\n rpc.get_notifier("volume", host).info(context, \'volume.%s\' % event_suffix,\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/oslo_messaging/notify/notifier.py", line 405, in info\n self._notify(ctxt, event_type, payload, \'INFO\')\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/oslo_messaging/notify/notifier.py", line 494, in _notify\n super(_SubNotifier, self)._notify(ctxt, event_type, payload, priority)\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/oslo_messaging/notify/notifier.py", line 344, in _notify\n safe_ctxt = _sanitize_context(ctxt)\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/oslo_messaging/notify/notifier.py", line 212, in _sanitize_context\n return ctxt.__class__.from_dict(safe_dict)\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/cinder/context.py", line 187, in from_dict\n return cls(user_id=values.get(\'user_id\'),\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/cinder/context.py", line 115, in __init__\n self.read_deleted = read_deleted\n', ' File "/var/lib/openstack/lib/python3.10/site-packages/cinder/context.py", line 158, in _set_read_deleted\n raise ValueError(_("read_deleted can only be one of \'no\', "\n', "ValueError: read_deleted can only be one of 'no', 'yes' or 'only', not None\n"]

Seems like the real problem is that defaults in __init__ of cinder.context.RequestContext are different than when using fromDict

in __init__, read_deleted="no",
in dromDict, read_deleted=values.get("read_deleted") which is...

Read more...

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :
Revision history for this message
Jeremy Stanley (fungi) wrote :

If we have to separately patch individual consumers of oslo.messaging, that's an indication that the change introduces a backward-incompatible regression and should not have been backported (but slipped through due to insufficient test coverage). Can this be fixed on the oslo.messaging side with additional backportable changes to address the regression?

Revision history for this message
Jeremy Stanley (fungi) wrote :

Oh, also, any feedback on my rewritten impact statement in comment #37? I'd like to move forward with the CVE request submission to MITRE if it looks accurate.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

First of all; the impact statement is good.

Secondly, I do think it's still backportable, and that Cinder/Manila need to backport their bugfixes first. One perspective on this is that we exposed a bug in Cinder/Manila where from_dict() required defaults that __init__ for the same class didn't need. I obviously feel personally bad that the change broke those projects; but I think we have a path forward: backported fixes for Cinder/Manila, then backport the oslo.messaging fix.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Noting this is already landed in...

stable/2023.1: https://review.opendev.org/c/openstack/oslo.messaging/+/891742
stable/zed: https://review.opendev.org/c/openstack/oslo.messaging/+/891743
stable/yoga: https://review.opendev.org/c/openstack/oslo.messaging/+/891744
stable/xena: https://review.opendev.org/c/openstack/oslo.messaging/+/891745

So I think backporting the fix for cinder/manila is the best path forward if they are able to; if not we can potentially write another oslo.messaging patch that adds read_deleted to the safe list.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.messaging 14.0.2

This issue was fixed in the openstack/oslo.messaging 14.0.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.messaging 14.2.2

This issue was fixed in the openstack/oslo.messaging 14.2.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.messaging 12.13.2

This issue was fixed in the openstack/oslo.messaging 12.13.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on oslo.messaging (stable/wallaby)

Change abandoned by "Jay Faulkner <email address hidden>" on branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/891746
Reason: Going to stop backporting at Xena. This patch should still be OK for people running Wallaby with no Cinder/Manila.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on oslo.messaging (stable/victoria)

Change abandoned by "Jay Faulkner <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/892372
Reason: Going to stop backporting at Xena. This patch should still be OK for people running Victoria with no Cinder/Manila.

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :

btw, masakari also seems to be affected - it completely lacks its own 'from_dict' in its RequestContext subclass, and relies on FROM_DICT_EXTRA_KEYS list - which of course also has an effective 'default of None

Revision history for this message
Jeremy Stanley (fungi) wrote :

Any feedback on my rewritten impact statement in comment #37? I'd like to move forward with the CVE request submission to MITRE if it looks accurate.

Revision history for this message
Dan Smith (danms) wrote :

Jeremy, the impact statement looks good to me. If it were me, I'd probably add some sort of "it's been this way since the beginning of time" to make it clear it's not new, but I'll defer to you on whether anything like that is appropriate there. Otherwise looks fine to me.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks Dan. And yeah, the versions list is meant to indicate that all versions of oslo.messaging older than 12.13.2 behave this way. We try not to give minimum affected versions unless someone identifies for certain what commit introduced the behavior in question.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

I've had a lot of trouble getting fixes landed and reviewed for Cinder and Manila. I have not tried to fix Masakari yet.

I believe we should rework the patch to unbreak these, at least for 2023.2 and older. Moving forward we have a better version of this patch, adding a method to RequestContext to return a redacted copy, which shouldn't break anything.

My suggestion: make this line https://github.com/openstack/oslo.messaging/commit/3485301b1838353c16c6dd1c1cfbcab866df2e5d#diff-a57c1c0917590017268d8351fbe7692bfafa4cba4d7ba3b28e2c034b666c0e2fR212 call RequestContext.from_dict() -- we are already effectively removing any fields added by the projects.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Pushed https://review.opendev.org/c/openstack/masakari/+/894393 for masakari, as well as a CI fix as their CI is busted.

Revision history for this message
Jeremy Stanley (fungi) wrote :

I still haven't requested a CVE assignment for this because it's not yet clear to me what the end state is supposed to look like. Currently the fixed oslo.messaging versions are still not used in upper constraints, indicating there will either be further changes to oslo.messaging or to all services the fix is currently causing regressions in.

What's the current plan for a path forward here?

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

I pushed a fix up for https://bugs.launchpad.net/oslo.messaging/+bug/2037312 today; which I believe was the last unit-test blocker to updating requirements.

My plan:
- Get this landed in oslo.messaging master
- Backport thru stable/xena
- Release updated oslo.messaging branches as bugfix releases
- Get an RFE to bump oslo.messaging version in stable/2023.2 (not sure exactly how this works as it's something we'd update in a backport anyway)
- Release 2023.1-xena versions of updated oslo.messaging

I'm hoping I can get to the end of this path this week, but I'm not sure. I am prioritizing getting this across the finish line.

To post a comment you must log in.
This report contains Public Security information  
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.