Nova Scheduler caches placement URL

Bug #1848745 reported by Frode Nordahl
This bug affects 3 people
Affects: OpenStack Nova Cloud Controller Charm
Status: Triaged
Importance: High
Assigned to: Unassigned

Bug Description

I have a Train deployment with the new placement charm and TLS enabled across the board, with the help of the Vault charm and the certificates relation.

$ openstack catalog list
[ snip ]
| placement | placement | RegionOne                               |
|           |           |   internal: https://10.246.114.6:8778   |
|           |           | RegionOne                               |
|           |           |   admin: https://10.246.114.6:8778      |
|           |           | RegionOne                               |
|           |           |   public: https://10.246.114.6:8778     |

However, when creating an instance, Nova attempts to contact the Placement API through an http URL:

2019-10-18 14:41:50.422 146762 WARNING keystoneauth.discover [req-13df06e7-e6fa-411f-895f-07c4c7a7ef33 3c646ddc1e23482495394fb5fec05180 0fdeb76e888240d79fb4e19da2216226 - c08cb1e9bc8842d29246016b7965694b c08cb1e9bc8842d29246016b7965694b] Failed to contact the endpoint at http://10.246.114.6:8778 for discovery. Fallback to using that endpoint as the base url.
2019-10-18 14:41:50.427 146762 ERROR nova.scheduler.client.report [req-13df06e7-e6fa-411f-895f-07c4c7a7ef33 3c646ddc1e23482495394fb5fec05180 0fdeb76e888240d79fb4e19da2216226 - c08cb1e9bc8842d29246016b7965694b c08cb1e9bc8842d29246016b7965694b] Failed to retrieve allocation candidates from placement API for filters: RequestGroup(aggregates=[],forbidden_aggregates=set([]),forbidden_traits=set(['COMPUTE_STATUS_DISABLED']),in_tree=None,provider_uuids=[],requester_id=None,required_traits=set([]),resources={DISK_GB=8,MEMORY_MB=512,VCPU=1},use_same_provider=False)
Got 400: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
Reason: You're speaking plain HTTP to an SSL-enabled server port.<br />
 Instead use the HTTPS scheme to access this URL, please.<br />
</p>
<hr>
<address>Apache/2.4.29 (Ubuntu) Server at 10.246.114.6 Port 443</address>
</body></html>
.
2019-10-18 14:41:50.427 146762 INFO nova.scheduler.manager [req-13df06e7-e6fa-411f-895f-07c4c7a7ef33 3c646ddc1e23482495394fb5fec05180 0fdeb76e888240d79fb4e19da2216226 - c08cb1e9bc8842d29246016b7965694b c08cb1e9bc8842d29246016b7965694b] Got no allocation candidates from the Placement API. This could be due to insufficient resources or a temporary occurrence as compute nodes start up.
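
To check whether a given log shows this same symptom, a small script can pull the endpoint URL out of the keystoneauth discovery warning and compare its scheme with the one in the catalog. This is only a rough sketch: the function name `stale_endpoints` and the regular expression are my own illustration, matched against the warning text above, and are not part of Nova or keystoneauth.

```python
import re
from urllib.parse import urlsplit

# Matches the keystoneauth.discover warning text seen in the log above.
DISCOVERY_FAILURE = re.compile(
    r"Failed to contact the endpoint at (\S+) for discovery"
)


def stale_endpoints(log_lines, catalog_url):
    """Return endpoint URLs from discovery warnings whose scheme differs
    from the scheme of the URL currently published in the catalog.

    Hypothetical helper for diagnosis only.
    """
    expected_scheme = urlsplit(catalog_url).scheme
    found = []
    for line in log_lines:
        match = DISCOVERY_FAILURE.search(line)
        if match and urlsplit(match.group(1)).scheme != expected_scheme:
            found.append(match.group(1))
    return found
```

Feeding it the scheduler log lines and the `https://10.246.114.6:8778` catalog URL should flag the `http://` endpoint that the scheduler is still using.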

It appears we need to coordinate a restart of the Nova Scheduler service after placement has configured TLS and has its URL set in the keystone catalog.
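
As a rough illustration of what "coordinate a restart" could mean, the sketch below compares the placement URL a service was started with against the one now in the catalog, and lists the services to restart when they differ. Everything here is an assumption for discussion: the helper names, the idea of recording the URL seen at service start, and the default service list are hypothetical and do not describe how the charm actually works.

```python
from urllib.parse import urlsplit


def endpoint_changed(cached_url, catalog_url):
    """True when scheme, host, or port differ between the two URLs."""
    old, new = urlsplit(cached_url), urlsplit(catalog_url)
    return (old.scheme, old.hostname, old.port) != (
        new.scheme, new.hostname, new.port
    )


def services_to_restart(cached_url, catalog_url,
                        services=("nova-scheduler",)):
    """Return the services that should be restarted, or an empty list.

    `services` is a hypothetical default; a later comment in this bug
    suggests nova-conductor can be affected as well.
    """
    if endpoint_changed(cached_url, catalog_url):
        return list(services)
    return []
```

With the URLs from this report, `services_to_restart("http://10.246.114.6:8778", "https://10.246.114.6:8778")` would return the scheduler, while identical URLs would return nothing to restart.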

Frode Nordahl (fnordahl) wrote :

I can also confirm that restarting the nova scheduler service resolves the situation and I can now create instances again.

Changed in charm-nova-cloud-controller:
status: New → Triaged
importance: Undecided → High
Alexander Balderson (asbalderson) wrote :

Marking as Field High as we've been hitting this pretty consistently in SQA runs; all occurrences can be found here:
https://solutions.qa.canonical.com/bugs/bugs/bug/1848745

Marian Gasparovic (marosg) wrote :

Reappeared in Xena!

We started hitting this once we put Xena into regular testing. Restarting nova-conductor resolves it, but that is not a solution for automated deployments.

Moises Emilio Benzan Mora (moisesbenzan) wrote :

Saw this once again in Xena, this time with a slightly different error:

```
2022-02-24 23:13:48.136 196590 CRITICAL nova [-] Unhandled error: openstack.exceptions.NotSupported: The placement service for keystone-internal.silo1.solutionsqa:RegionOne exists but does not have any supported versions.
2022-02-24 23:13:48.136 196590 ERROR nova Traceback (most recent call last):
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/bin/nova-conductor", line 10, in <module>
2022-02-24 23:13:48.136 196590 ERROR nova sys.exit(main())
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/cmd/conductor.py", line 45, in main
2022-02-24 23:13:48.136 196590 ERROR nova server = service.Service.create(binary='nova-conductor',
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/service.py", line 252, in create
2022-02-24 23:13:48.136 196590 ERROR nova service_obj = cls(host, binary, topic, manager,
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/service.py", line 116, in __init__
2022-02-24 23:13:48.136 196590 ERROR nova self.manager = manager_class(host=self.host, *args, **kwargs)
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 119, in __init__
2022-02-24 23:13:48.136 196590 ERROR nova self.compute_task_mgr = ComputeTaskManager()
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 244, in __init__
2022-02-24 23:13:48.136 196590 ERROR nova self.report_client = report.SchedulerReportClient()
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 187, in __init__
2022-02-24 23:13:48.136 196590 ERROR nova self._client = self._create_client()
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 230, in _create_client
2022-02-24 23:13:48.136 196590 ERROR nova client = self._adapter or utils.get_sdk_adapter('placement')
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/nova/utils.py", line 983, in get_sdk_adapter
2022-02-24 23:13:48.136 196590 ERROR nova return getattr(conn, service_type)
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 87, in __get__
2022-02-24 23:13:48.136 196590 ERROR nova proxy = self._make_proxy(instance)
2022-02-24 23:13:48.136 196590 ERROR nova File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 266, in _make_proxy
2022-02-24 23:13:48.136 196590 ERROR nova raise exceptions.NotSupported(
2022-02-24 23:13:48.136 196590 ERROR nova openstack.exceptions.NotSupported: The placement service for keystone-internal.silo1.solutionsqa:RegionOne exists but does not have any supported versions.
```

Test run: https://solutions.qa.canonical.com/testruns/testRun/142ee5ef-49d6-43c2-9700-dc8ac4808538
Artifacts: https://oil-jenkins.canonical.com/artifacts/142ee5ef-49d6-43c2-9700-dc8ac4808538/index.html

Mario Chirinos (mario-chirinos) wrote :

I have the same problem, how did you solve it?

Alex Kavanagh (ajkavanagh) wrote :

It looks like restarting the nova scheduler process is a workaround (from comment #1, https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1848745/comments/1):

> I can also confirm that restarting the nova scheduler service resolves the situation and I can now create instances again.
