auth_url for [ironic] section should be rendered with /v3 for yoga

Bug #1995778 reported by Przemyslaw Hausman
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Nova Cloud Controller Charm | Fix Committed | High | Unassigned | |
| OpenStack Nova Compute Charm | Status tracked in Trunk | | | |
| 2023.1 | Fix Committed | Undecided | Unassigned | |
| Trunk | Fix Released | Undecided | Unassigned | |
| Yoga | Fix Committed | Undecided | Unassigned | |
| Zed | Fix Committed | Undecided | Unassigned | |

Bug Description

OpenStack Yoga on Focal
Charm channel yoga/stable
Charm revision 606
nova version 25.0.1

auth_url for [ironic] section is incorrectly rendered in nova.conf with suffix /v2.0 instead of /v3, e.g.:

```
[ironic]
auth_url = https://keystone-internal.example.com:35357/v2.0
```

As a result, an Ironic bare metal node cannot be created and the following error is reported in nova-compute.log (on the nova-compute unit configured with "virt-type: ironic"):

```
2022-11-05 22:42:04.211 25369 ERROR nova.virt.ironic.driver [req-edd97caf-2eb3-42bf-b3c8-9e376e4643a8 - - - - -] An unknown error has occurred when trying to get the list of nodes from the Ironic inventory. Error: Cannot use v2 authentication with domain scope: keystoneauth1.exceptions.discovery.DiscoveryFailure: Cannot use v2 authentication with domain scope
2022-11-05 22:42:04.211 25369 WARNING nova.compute.manager [req-edd97caf-2eb3-42bf-b3c8-9e376e4643a8 - - - - -] Virt driver is not ready.: nova.exception.VirtDriverNotReady: Virt driver is not ready.
```
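
The error above arises when a v2.0 auth_url is combined with domain options. As a rough diagnostic, the following sketch checks a nova.conf for that combination. The function name `v2_with_domain_scope` is hypothetical, and the check is only a heuristic based on the error message; it does not reproduce keystoneauth1's own discovery logic.

```python
import configparser


def v2_with_domain_scope(nova_conf_text: str, section: str = "ironic") -> bool:
    """Return True if `section` pairs a /v2.0 auth_url with domain options."""
    # interpolation=None: nova.conf values may contain literal '%' characters.
    # strict=False: tolerate duplicate options in hand-edited files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    if not parser.has_section(section):
        return False
    opts = parser[section]
    auth_url = opts.get("auth_url", "").rstrip("/")
    has_domain = any(
        opts.get(key) for key in ("project_domain_name", "user_domain_name")
    )
    return auth_url.endswith("/v2.0") and has_domain


# Sample built from the values quoted in this report.
SAMPLE = """\
[ironic]
auth_type = password
auth_url = https://keystone-internal.example.com:35357/v2.0
project_domain_name = service_domain
user_domain_name = service_domain
"""

if __name__ == "__main__":
    print(v2_with_domain_scope(SAMPLE))
```

On a unit, the text of /etc/nova/nova.conf could be passed in instead of `SAMPLE`.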

The bare metal node cannot be created because the nova resource tracker was unable to report bare metal nodes to the Placement service. As a result, Placement could not find any nodes with the resource class of the bare metal node, i.e. the following command returns no results (but it should):

```
openstack allocation candidate list --resource CUSTOM_BAREMETAL_LARGE='1'
```

Replacing /v2.0 with /v3 in nova.conf (on the nova-compute unit configured with "virt-type: ironic") and restarting the nova-compute service fixes the problem, but only temporarily: the change is lost the next time the charm re-renders nova.conf.

Alex Kavanagh (ajkavanagh) wrote :

Thanks for the bug report. It's a curious one, this, as it really ought to render as v3 and not v2.0, so it may be that something's up elsewhere in the cloud.

If possible, please could you add the supporting information from https://docs.openstack.org/charm-guide/latest/community/software-bug.html#essential-information (e.g. juju show-unit for the nova-compute unit would be really useful, but it does contain passwords so may need sanitising!)

The config is rendered from this part:

```
{% if virt_type == 'ironic' and auth_host and ironic_api_ready -%}
{% if api_version and api_version == "3" -%}
{% set auth_ver = "v3" -%}
{% else -%}
{% set auth_ver = "v2.0" -%}
{% endif -%}
[ironic]
auth_type = password
auth_url = {{auth_protocol}}://{{auth_host}}:{{auth_port}}/{{auth_ver}}
project_name = {{ admin_tenant_name }}
username = {{ admin_user }}
password = {{ admin_password }}
project_domain_name = {{ admin_domain_name }}
user_domain_name = {{ admin_domain_name }}
{% endif -%}
```

So api_version is most likely *not* '3' when rendering this section; it would be useful to see what the nova.conf is being rendered as, particularly whether the auth_url is correct elsewhere.
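
To make the branch in that template easier to reason about, here is a minimal Python sketch of the same selection logic. The function name `render_ironic_auth_url` is hypothetical; it only mirrors the quoted Jinja condition and is not code from the charm.

```python
def render_ironic_auth_url(auth_protocol, auth_host, auth_port, api_version=None):
    """Mimic the auth_ver selection of the quoted [ironic] template part."""
    # Same condition as the template: only the exact string "3" selects v3;
    # a missing value or anything else falls through to v2.0.
    auth_ver = "v3" if api_version and api_version == "3" else "v2.0"
    return f"{auth_protocol}://{auth_host}:{auth_port}/{auth_ver}"


if __name__ == "__main__":
    for version in ("3", "2.0", None):
        url = render_ironic_auth_url(
            "https", "keystone-internal.example.com", "35357", version
        )
        print(repr(version), "->", url)
```

With the values from this report, only `api_version == "3"` yields a URL ending in /v3; both `"2.0"` and a missing value produce the /v2.0 suffix seen in the broken nova.conf.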

Changed in charm-nova-compute:
status: New → Incomplete
Przemyslaw Hausman (phausman) wrote :

Alex, thank you for looking into this bug. Please see attached nova.conf from the nova-ironic unit. Interestingly, for other sections, e.g. [neutron], keystone-internal endpoint is rendered without specifying the API version at all:

```
[neutron]
auth_url = https://keystone-internal.engineering-cloud.example.com:35357
```

It looks like auth_ver is only rendered in [ironic] section:

```
root@juju-4bb50e-1-lxd-6:/var/lib/juju/agents/unit-nova-ironic-0/charm/templates# grep -RP auth_url
kilo/nova.conf:admin_auth_url = {{ network_manager_config.auth_protocol }}://{{ network_manager_config.keystone_host }}:{{ network_manager_config.auth_port }}/v2.0
rocky/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
parts/section-ironic:auth_url = {{auth_protocol}}://{{auth_host}}:{{auth_port}}/{{auth_ver}}
parts/section-placement:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
train/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
ocata/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
stein/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
newton/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
queens/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
mitaka/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
pike/nova.conf:auth_url = {{ auth_protocol }}://{{ auth_host }}:{{ auth_port }}
```

Relation data between nova-cloud-controller and nova-ironic does indeed specify `api_version: "2.0"`:

```
$ juju run --unit nova-cloud-controller/0 'relation-ids cloud-compute'
cloud-compute:350
cloud-compute:351

$ juju run --unit nova-cloud-controller/0 'relation-list -r cloud-compute:350'
nova-compute/0
nova-compute/1
nova-compute/2
nova-compute/3
nova-compute/4
nova-compute/5
nova-compute/6
nova-compute/7
nova-compute/8

$ juju run --unit nova-cloud-controller/0 'relation-list -r cloud-compute:351'
nova-ironic/0

$ juju run --unit nova-cloud-controller/0 'relation-get -r cloud-compute:351 - nova-cloud-controller/0'
admin_domain_name: service_domain
allow_resize_to_same_host: "False"
api_version: "2.0"
auth_host: keystone-internal.engineering-cloud.example.com
auth_port: "35357"
auth_protocol: https
ca_cert: REDACTED
console_access_protocol: spice
console_keymap: en-us
console_proxy_spice_address: https://nova.engineering-cloud.example.com:6082/spice_auto.html
console_proxy_spice_host: nova.engineering-cloud.example.com
console_proxy_spice_port: "6082"
cpu_allocation_ratio: "16"
cross_az_attach: "True"
disk_allocation_ratio: "1"
dns_domain: engineering-cloud.de.example.lan.
ec2_host: 10.9.186.7
egress-subnets: 192.168.4.90/32
enable_serial_console: "false"
ingress-address: 192.168.4.90
keystone_host: keystone-internal.engineering-cloud.example.com
network_manager: neutron
private-address: 192.168.4.90
quantum_host: neutron-internal.engineering-cloud.example.com
quantum_plugin: ovs
quantum_port: "9696"
quantum_security_groups: "yes"
quantum_url: https://neutron-internal.engineering-cloud.example.com:9696
ram_allocation_ratio: "1"
region: ABT102
restart_trigger: eab4201a-9bf0-...

```

Alex Kavanagh (ajkavanagh) wrote :

Hi Przemyslaw

Well, that was a convoluted bug, but I've worked out what's going on. Basically, the bug is in nova-cloud-controller and not nova-compute, specifically in the bit where it sets the api_version on the relation between nova-cloud-controller and nova-compute.

In nova-cloud-controller, in hooks/nova_cc_hooks.py, `def auth_token_config()` reads the api_version from the /etc/nova/api-paste.ini file. It is called from `def _auth_config()`, which is called from `def keystone_compute_settings()`, and the result is ultimately set on the relation in `def cloud_compute_relation_changed()`.

All would be good, but sadly the 'api_version' key doesn't appear in the relevant section of the /etc/nova/api-paste.ini file which is in `templates/mitaka/api-paste.ini`.

The fix is probably going to be to just add `api_version` to the ./api-paste.ini template, but it needs to be checked that this is a valid field, etc.
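
As an illustration of why a missing key matters, the sketch below reads `api_version` from the `[filter:authtoken]` section of an api-paste.ini and falls back to a default when the key is absent. The function name `read_api_version` and the use of `configparser` are assumptions for this example; the charm's own `auth_token_config()` may be implemented differently.

```python
import configparser


def read_api_version(paste_ini_text: str, default: str = "2.0") -> str:
    """Return api_version from [filter:authtoken], or `default` if unset."""
    # interpolation=None keeps any literal '%' in the file from being expanded.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(paste_ini_text)
    # fallback covers both a missing section and a missing option.
    return parser.get("filter:authtoken", "api_version", fallback=default)


# Section without the key, as described in this comment.
WITHOUT_KEY = """\
[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
"""

# Same section with the key added.
WITH_KEY = WITHOUT_KEY + "api_version = 3\n"

if __name__ == "__main__":
    print(read_api_version(WITHOUT_KEY))
    print(read_api_version(WITH_KEY))
```

Without the key, the fallback value is what would end up on the relation, which matches the `api_version: "2.0"` seen in the relation data above.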

I'll change the bug to nova-cloud-controller, and remove it from nova-compute.

Changed in charm-nova-compute:
status: Incomplete → Invalid
Changed in charm-nova-cloud-controller:
status: New → Triaged
importance: Undecided → High
Przemyslaw Hausman (phausman) wrote :

Alex, thanks for the analysis! I came to the same conclusion regarding api-paste.ini but you were much quicker to post the results of your investigation. Good stuff!

I tested it by adding the following to the templates/mitaka/api-paste.ini template on the nova-cloud-controller unit.
```
api_version = {{ api_version }}
```

The resulting /etc/nova/api-paste.ini ended up with the correct configuration rendered:

```
$ sudo grep api_version /etc/nova/api-paste.ini
api_version = 3
```

Then, after I executed the config-changed hook on the nova-ironic unit, the configuration in nova.conf was correctly rendered with /v3:

```
$ sudo grep -A2 '\[ironic\]' /etc/nova/nova.conf
[ironic]
auth_type = password
auth_url = https://keystone-internal.engineering-cloud.example.com:35357/v3
```

Przemyslaw Hausman (phausman) wrote :
Alex Kavanagh (ajkavanagh) wrote :

Hi Przemyslaw

Always useful to get to the same point! Okay, I suspect that to make this a proper fix we should probably make a templates/queens/api-paste.ini and add the option there. The ironic charms only support bionic-train onwards, but it feels more appropriate to fix it from queens (as the charms only actually support bionic onwards now anyway).

Cheers
Alex.

Notes:

 * make fix in master.
 * backport to stable/train

Natalia Litvinova (natalytvinova) wrote :

We're hitting that again in our current deployment and don't think that editing nova.conf manually is a good workaround, as it may be overwritten by juju at any time. Subscribing field-critical.

Alex Kavanagh (ajkavanagh) wrote :

Hi Natalia. Are you sure it's correctly categorised as field-critical? A viable workaround that isn't overwritten by juju would be to edit the templates/mitaka/api-paste.ini file with:

api_version = 3

as indicated by comment #4. This would only be over-written due to a charm upgrade/refresh which would be an operator intervention. This seems more like a field high.

We'll try to schedule a fix for this soon, thanks.

Natalia Litvinova (natalytvinova) wrote :

Hi Alex,

Can you specify where I should find the templates/mitaka/api-paste.ini file?
If that workaround works then field-high is fine

Alex Kavanagh (ajkavanagh) wrote :

Hi Natalia

> Can you specify where I should find the templates/mitaka/api-paste.ini file?
> If that workaround works then field-high is fine

Sure, yes. The most recent api-paste template is in the {charm_dir}/templates/mitaka/api-paste.ini

The easiest thing to do is to edit this file and make the top of the [filter:authtoken] section (at the end of the file) look like:

```
[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
api_version = 3
```

Then force a config-changed by toggling a value (usually debug); that will cause the template to be re-written.

This fix would be overwritten if the charm is upgraded/refreshed, but should result in the auth_url being written with a /v3 at the end.

You'll need to do this for each nova-cloud-controller unit.
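
For anyone scripting this workaround across several units, the edit could be sketched as below. This is only an illustration: the function name `ensure_api_version` is hypothetical, and it operates on the template text as a string rather than touching any file, so reading and writing {charm_dir}/templates/mitaka/api-paste.ini is left to the operator.

```python
def ensure_api_version(template_text: str, version: str = "3") -> str:
    """Add `api_version = <version>` to [filter:authtoken] if it is missing."""
    lines = template_text.splitlines()
    try:
        start = lines.index("[filter:authtoken]")
    except ValueError:
        return template_text  # section not found: leave the text untouched

    # The section ends at the next "[...]" header or at the end of the file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("["):
            end = i
            break

    # Do nothing if the option is already present (keeps the edit idempotent).
    for line in lines[start + 1:end]:
        if line.split("=")[0].strip() == "api_version":
            return template_text

    # Insert right after paste.filter_factory when present, else after the header.
    insert_at = start + 1
    for i in range(start + 1, end):
        if lines[i].startswith("paste.filter_factory"):
            insert_at = i + 1
            break
    lines.insert(insert_at, f"api_version = {version}")
    return "\n".join(lines) + "\n"


# Top of the section as quoted in this comment, before the edit.
SAMPLE = (
    "[filter:authtoken]\n"
    "paste.filter_factory = keystonemiddleware.auth_token:filter_factory\n"
)

if __name__ == "__main__":
    print(ensure_api_version(SAMPLE), end="")
```

As with the manual edit, the result would be overwritten by a charm upgrade/refresh, and a config-changed is still needed afterwards.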

Hope that helps.

Billy Olsen (billy-olsen) wrote :

I think the section for Ironic should just drop support for auth_url v2.0. It doesn't work anyway and it's causing problems here. For example, project_domain_name and user_domain_name are set from {{ admin_domain_name }} in the template, which should be default for auth_url v2.0. Additionally, all cloud deployments with supported charm versions only use v3 anyway.

Alex Kavanagh (ajkavanagh) wrote :

@billy-olsen; that's a reasonable point. I was going for 'formally correct', but this is more pragmatic!

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (master)
Changed in charm-nova-compute:
status: Invalid → In Progress
Jadon Naas (jadonn) wrote :

I submitted a fix awaiting review at https://review.opendev.org/c/openstack/charm-nova-compute/+/886775 implementing what Billy described to only use Keystone API v3. The fix changes the template for Ironic to:

```
{% if virt_type == 'ironic' and auth_host and ironic_api_ready -%}
[ironic]
auth_type = password
auth_url = {{auth_protocol}}://{{auth_host}}:{{auth_port}}/v3
project_name = {{ admin_tenant_name }}
username = {{ admin_user }}
password = {{ admin_password }}
project_domain_name = {{ admin_domain_name }}
user_domain_name = {{ admin_domain_name }}
{% endif -%}
```

I hope this will resolve the issue, but if there is a better approach or implementation I'm happy to make the necessary adjustments to resolve this bug.

Alex Kavanagh (ajkavanagh) wrote :

The status on nova-compute was changed to 'in progress' due to an erroneous gerrit patch. Setting it back to invalid; the patch will need to be done on the nova-cloud-controller charm.

Changed in charm-nova-compute:
status: In Progress → Invalid
Felipe Reyes (freyes)
no longer affects: charm-nova-cloud-controller/trunk
Felipe Reyes (freyes) wrote :

I had a conversation with Alex about this bug. There is an issue here where, for some reason, nova-cloud-controller didn't set the api_version correctly in the relation databag, and that would be the reason why nova.conf was rendered incorrectly in nova-compute. This is what the nova-cloud-controller bug task is about.

On the other side of the relation we have nova-compute which is using auth_ver to render the [ironic] section, although this is something other templates are not doing (e.g. templates/parts/section-placement), so we agreed to re-open the proposed patch to align that template to the rest of the templates carried in the nova-compute charm.

Changed in charm-nova-compute:
status: Invalid → Triaged
no longer affects: charm-nova-cloud-controller/2023.1
no longer affects: charm-nova-cloud-controller/train
no longer affects: charm-nova-cloud-controller/ussuri
no longer affects: charm-nova-cloud-controller/victoria
no longer affects: charm-nova-cloud-controller/wallaby
no longer affects: charm-nova-cloud-controller/xena
no longer affects: charm-nova-cloud-controller/yoga
no longer affects: charm-nova-cloud-controller/zed
Jadon Naas (jadonn) wrote :

The patch I previously submitted was picked back up and merged to trunk for charm-nova-compute (https://review.opendev.org/c/openstack/charm-nova-compute/+/886775).

nova-cloud-controller still produces the wrong information for the api_version, but this patch should solve the overall issue of the template having the wrong auth_url.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/charm-nova-compute/+/895790

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (stable/zed)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/charm-nova-compute/+/895792

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-cloud-controller (master)
Changed in charm-nova-cloud-controller:
status: Triaged → In Progress
Jadon Naas (jadonn) wrote :

The change I submitted for charm-nova-compute to change the template was accepted and merged to trunk. I have created backports to stable/2023.1, stable/zed, and stable/yoga.

I also have submitted a patch for review to fix this in charm-nova-cloud-controller. The problem is that this particular line:

https://opendev.org/openstack/charm-nova-cloud-controller/src/branch/master/hooks/nova_cc_hooks.py#L536

will return "2.0" if there is no value of "api_version" in the "/etc/nova/api-paste.ini" file.

The review is at https://review.opendev.org/c/openstack/charm-nova-cloud-controller/+/895862. The patch I have submitted keeps the behavior but updates the "2.0" to "3.0" in recognition of all supported OpenStack releases using Keystone API version "3.0" instead of "2.0".

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-cloud-controller (master)

Reviewed: https://review.opendev.org/c/openstack/charm-nova-cloud-controller/+/895862
Committed: https://opendev.org/openstack/charm-nova-cloud-controller/commit/98581a04d5ec74a9257dbdad0cb41f594257700b
Submitter: "Zuul (22348)"
Branch: master

commit 98581a04d5ec74a9257dbdad0cb41f594257700b
Author: Jadon Naas <email address hidden>
Date: Tue Sep 19 17:28:16 2023 -0400

    Update default Keystone api_version

    This change moves the default return value for the Keystone api_version
    to 3.0 instead of 2.0. By this point in time, all supported OpenStack
    releases use Keystone API version 3.0 instead of 2.0.
    This was previously causing Nova templates to render with 2.0 in the
    Keystone auth URL instead of 3.0, which caused auth failures.

    Closes-Bug: 1995778
    Change-Id: I6463a24fe4aaa654a58cff56720a55f0950db717

Changed in charm-nova-cloud-controller:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/charm-nova-compute/+/895790
Committed: https://opendev.org/openstack/charm-nova-compute/commit/eb2c391fa167175d25d7d8e1ea2faccf2cb7b5ce
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit eb2c391fa167175d25d7d8e1ea2faccf2cb7b5ce
Author: Jadon Naas <email address hidden>
Date: Thu Jun 22 12:40:04 2023 -0400

    Drop the path from the auth_url.

    The template previously could use v2.0 depending on the value of
    api_version. This was causing issues in newer releases of OpenStack
    where the value of api_version was reporting as something other than
    "3", and the generated Ironic config tried to use the v2.0 Keystone API.

    This patch removes the optional logic in the template for v2.0 and rely
    on the global default just like templates/parts/section-placement does.

    Closes-Bug: #1995778
    Change-Id: I8e0270b933f9c8fb5d6a65f9ebb930a0b21fead8
    (cherry picked from commit 8d560b3ff55257370be0b9bc9b5dea73ee82d0ca)

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/charm-nova-compute/+/895791
Committed: https://opendev.org/openstack/charm-nova-compute/commit/4fca7b8938238614d2c0db741b2229a3a240e510
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 4fca7b8938238614d2c0db741b2229a3a240e510
Author: Jadon Naas <email address hidden>
Date: Thu Jun 22 12:40:04 2023 -0400

    Drop the path from the auth_url.

    The template previously could use v2.0 depending on the value of
    api_version. This was causing issues in newer releases of OpenStack
    where the value of api_version was reporting as something other than
    "3", and the generated Ironic config tried to use the v2.0 Keystone API.

    This patch removes the optional logic in the template for v2.0 and rely
    on the global default just like templates/parts/section-placement does.

    Closes-Bug: #1995778
    Change-Id: I8e0270b933f9c8fb5d6a65f9ebb930a0b21fead8
    (cherry picked from commit 8d560b3ff55257370be0b9bc9b5dea73ee82d0ca)

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/charm-nova-compute/+/895792
Committed: https://opendev.org/openstack/charm-nova-compute/commit/81ee072e71de61eb17329856d7262811ec60fd66
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 81ee072e71de61eb17329856d7262811ec60fd66
Author: Jadon Naas <email address hidden>
Date: Thu Jun 22 12:40:04 2023 -0400

    Drop the path from the auth_url.

    The template previously could use v2.0 depending on the value of
    api_version. This was causing issues in newer releases of OpenStack
    where the value of api_version was reporting as something other than
    "3", and the generated Ironic config tried to use the v2.0 Keystone API.

    This patch removes the optional logic in the template for v2.0 and rely
    on the global default just like templates/parts/section-placement does.

    Closes-Bug: #1995778
    Change-Id: I8e0270b933f9c8fb5d6a65f9ebb930a0b21fead8
    (cherry picked from commit 8d560b3ff55257370be0b9bc9b5dea73ee82d0ca)
