[NOVA] NoAuthURLProvided: auth_url was not provided to the Neutron client

Bug #1450192 reported by Leontiy Istomin
This bug affects 4 people
Affects              Status     Importance  Assigned to        Milestone
Mirantis OpenStack   Invalid    High        Boris Bobrov
6.0.x                Invalid    High        Denis Meltsaykin
6.1.x                Won't Fix  High        Boris Bobrov
7.0.x                Invalid    High        Boris Bobrov

Bug Description

api: '1.0'
astute_sha: 3f1ece0318e5e93eaf48802fefabf512ca1dce40
auth_required: true
build_id: 2015-03-26_21-32-43
build_number: '233'
feature_groups:
- mirantis
fuellib_sha: 9c7716bc2ce6075065d7d9dcf96f4c94662c0b56
fuelmain_sha: 320b5f46fc1b2798f9e86ed7df51d3bda1686c10
nailgun_sha: b163f6fc77d6639aaffd9dd992e1ad96951c3bbf
ostf_sha: a4cf5f218c6aea98105b10c97a4aed8115c15867
production: docker
python-fuelclient_sha: e5e8389d8d481561a4d7107a99daae07c6ec5177
release: '6.1'

Successfully deployed the following configuration:
Baremetal,Ubuntu,IBP,HA, Neutron-vlan,Ceph-all,Nova-debug,Nova-quotas,6.1-233
Controllers:3 Computes:47

But during the "MoxScenarios.boot_server_with_network_in_single_tenant" rally scenario we got the following error:
http://paste.openstack.org/show/212666/

root@node-1:~# mysql -e "use nova; select uuid from instances where display_name='rally_mox_server_sqxmaeaene';"
+--------------------------------------+
| uuid |
+--------------------------------------+
| 5e9f2730-20b6-4bd0-b5e9-d6848a764cba |
+--------------------------------------+

from nova-all.log at the time:
http://paste.openstack.org/show/212665/

related bug from nova project: https://bugs.launchpad.net/nova/+bug/1418529

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-04-29_20-32-07.tar.xz

description: updated
description: updated
Changed in mos:
status: New → Triaged
Revision history for this message
Davanum Srinivas (DIMS) (dims-v) wrote :

Hi,

So this bug should have been fixed already by the following review in trunk/kilo:
https://review.openstack.org/#/c/136931/

-- dims

Dina Belova (dbelova)
Changed in mos:
importance: Undecided → High
milestone: none → 6.1
Changed in mos:
assignee: MOS Nova (mos-nova) → Pavel Kholkin (pkholkin)
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/nova (openstack-ci/fuel-6.1/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.1/2014.2
Change author: Jamie Lennox <email address hidden>
Review: https://review.fuel-infra.org/6447

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/nova (openstack-ci/fuel-6.1/2014.2)

Change abandoned by Kholkin Pavel <email address hidden> on branch: openstack-ci/fuel-6.1/2014.2
Review: https://review.fuel-infra.org/6447

Revision history for this message
Pavel Kholkin (pkholkin) wrote :

Hello guys. The problem was with an expired keystone token. A non-admin user unsuccessfully tried to authorize with the expired token and then tried to re-authenticate with no credentials. In this case we will change the response code in nova from 500 to 401 (Unauthorized).
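
For illustration only, a minimal sketch of the kind of change described above (mapping the client auth failure to 401 instead of letting it surface as 500); the call site, the `neutron` handle and `instance_uuid` are placeholders, not the actual nova patch:

from neutronclient.common import exceptions as neutron_client_exc
import webob.exc

try:
    # nova-api talks to neutron on behalf of the user, reusing the user's token
    ports = neutron.list_ports(device_id=instance_uuid)
except neutron_client_exc.Unauthorized:
    # The token is expired/invalid and nova has no user credentials to
    # re-authenticate with, so answer 401 and let the caller get a new token.
    raise webob.exc.HTTPUnauthorized()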

Revision history for this message
Andrey Kurilin (andreykurilin) wrote :

The KeystonePlugin.assign_and_remove_user_role scenario was launched in parallel with the MOX scenario [1] and finished earlier. The admin cleanup of KeystonePlugin.assign_and_remove_user_role cleaned up all temporary tenants of the MOX scenario.
This bug is not related to the nova team. It looks like the MOX scenario should not be launched in parallel with scenarios related to keystone.

[1] - http://mos-scale.vm.mirantis.net/test_results/build_6.1-233/jenkins-11_env_run_rally_light-53-ubuntu-MSK-2015-04-29-19:07:19-2015-04-29-20:22:40/logs/run_rally-debug.log.gz

[2] - http://mos-scale.vm.mirantis.net/test_results/build_6.1-233/jenkins-11_env_run_rally_light-53-ubuntu-MSK-2015-04-29-19:07:19-2015-04-29-20:22:40/logs/assign_and_remove_user_role-rally-stdout.log.gz

Revision history for this message
Leontiy Istomin (listomin) wrote :

Reproduced this issue with the create_and_delete_snapshot rally scenario.

from rally.log:
http://paste.openstack.org/show/230638/
from messages (haproxy):
http://paste.openstack.org/show/230639/
from nova-all:
http://paste.openstack.org/show/230625/

At the time, the create-tenant-with-users-rally-stdout rally scenario was completing.
Rally team, please change the keystone scenarios.

Rally logs are coming.
Diagnostic Snapshot is coming.

Revision history for this message
Leontiy Istomin (listomin) wrote :

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-05-21_14-36-09.tar.xz
rally logs and reports are attached.

Revision history for this message
Leontiy Istomin (listomin) wrote :

We excluded keystone scenarios from the test, but faced the following issue:
http://paste.openstack.org/show/231679/
I'll create a new bug for the rally keystone plugin. Here we will work on the NOVA->KEYSTONE->NEUTRON issue.
I'll generate a diagnostic snapshot for this case ASAP.

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

I checked the env provided by Leontiy and see the following:

When talking to neutron-server on behalf of a user (thus, using a user token), nova-api sometimes receives 401 caused by neutron-server not being able to validate the passed token (for various reasons: sometimes the token is not found and keystone returns 404, sometimes keystone fails with 503 - http://paste.openstack.org/show/231809/).

A 401 is usually handled by the python-*clients by getting a new token from Keystone, but in order to do that, one would need to pass user credentials. And when operations are performed on behalf of a user (like when you do something like `nova show server_name` and nova-api goes to neutron-server to get networking information), that's just not possible, as nova etc. simply don't have the user's credentials.

In the case of nova/python-neutronclient a very misleading exception is raised - http://paste.openstack.org/show/212665/ . It effectively means: "the token you passed is invalid (read: cannot be found in Keystone), please get a new one and retry the request". Obviously, we can't get a new token without user credentials, so Nova must return 401 to the user so that they can do that. Instead it fails with 500, producing an obscure error message in the log.
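
For clarity, a rough sketch of the retry behaviour described above; this is illustrative pseudo-Python, not the actual python-neutronclient code:

class NoAuthURLProvided(Exception):
    """Re-authentication is required but no auth_url/credentials are known."""

def do_request(client, method, url):
    resp = client.request(method, url)        # sent with the user's token
    if resp.status_code == 401:               # keystone rejected the token
        if not client.auth_url:               # nova passed only a token, so
            raise NoAuthURLProvided()         # re-authentication is impossible
        client.authenticate()                 # would need real credentials
        resp = client.request(method, url)
    return resp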

This is a *valid* Nova issue in the sense that nova-api should return a proper HTTP response code. Still, this would not prevent requests from failing, unless users manually retry the original request with a new token.

That being said, I'm not sure that the 401 vs 500 response code returned by nova-api should be treated as High and fixed in 6.1 (this is already fixed in Kilo).

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Still, it would be interesting to understand why tokens sometimes mysteriously disappear and why Keystone occasionally returns 503.

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Hmm, on the other hand, some of those tokens were used in the "admin" neutronclient instance, which is preserved between different requests to Neutron and might actually expire.

This must be handled by Nova by obtaining a new token periodically, but maybe it does not work as expected - https://github.com/openstack/nova/blob/stable/juno/nova/network/neutronv2/__init__.py#L65-L94
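
Roughly, the pattern at those lines looks like this (hypothetical helper names, just to illustrate the cached admin token that has to be refreshed):

_ADMIN_TOKEN = None

def get_admin_neutron_client():
    global _ADMIN_TOKEN
    # Reuse the cached admin token between requests to Neutron and only go
    # back to Keystone when it is missing or expired; if this refresh logic
    # misfires, the admin client keeps sending a stale token and gets 401.
    if _ADMIN_TOKEN is None or token_expired(_ADMIN_TOKEN):    # hypothetical helper
        _ADMIN_TOKEN = authenticate_as_admin()                 # hypothetical helper
    return make_neutron_client(token=_ADMIN_TOKEN)             # hypothetical helper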

Uploading logs to ELK to investigate this further.

Revision history for this message
Leontiy Istomin (listomin) wrote :

The token-create-and-use-for-auth rally test failed with 503 on the 600th iteration:
http://paste.openstack.org/show/231859/

The rally test sends 94 requests per second.
The rally log is attached.

Revision history for this message
Leontiy Istomin (listomin) wrote :

The following is from haproxy at the time when the first exception for the token-create-and-use-for-auth test occurred:
http://paste.openstack.org/show/231891/

It seems haproxy can't connect to the keystone backend.

Revision history for this message
Leontiy Istomin (listomin) wrote :

At this time (2015/05/22 09:35:02) one of the apache threads consumed 113-114% CPU:
23416 113% /usr/sbin/apache2 -k start
Keystone is the owner of this process.

It seems that apache uses only one thread for each port (5000 and 35357).

Revision history for this message
Leontiy Istomin (listomin) wrote :

Screenshots of atop are attached.

Revision history for this message
Leontiy Istomin (listomin) wrote :

Diagnostic Snapshot for today's findings (from comment #8): http://mos-scale-share.mirantis.com/fuel-snapshot-2015-05-22_12-40-19.tar.xz

Revision history for this message
Leontiy Istomin (listomin) wrote :

I've changed the parameters
processes=12 threads=1
in the following files on the controller nodes:
/etc/apache2/sites-enabled/05-keystone_wsgi_admin.conf
/etc/apache2/sites-enabled/05-keystone_wsgi_main.conf
It solved the issue. It seems it's a duplicate of https://bugs.launchpad.net/fuel/+bug/1457037 for the token-create-and-use-for-auth test. I'll perform a full rally test to be sure.
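
For reference, the changed directive in those files ends up looking roughly like this (the daemon process name and the other options are illustrative; the relevant part is processes=12 threads=1):

WSGIDaemonProcess keystone_main user=keystone group=keystone processes=12 threads=1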

Revision history for this message
Leontiy Istomin (listomin) wrote :

Tests showed that processes=12 threads=1 has solved the issue.
Definitely a duplicate of https://bugs.launchpad.net/fuel/+bug/1457037.

Revision history for this message
Andrey Grebennikov (agrebennikov) wrote :

Guys, how can this be a duplicate of the last mentioned bug if there is no "keystone under apache" in 6.1?
This is what happens in generic 6.1 in my installation:

<179>Jul 27 23:44:40 GGUTTHLDK014 nova-compute Build of instance c75e1e46-9368-45c0-b8a5-1592a64a4e76 aborted: Could not clean up failed build, not rescheduling
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] Traceback (most recent call last):
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2085, in _do_build_and_run_instance
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] filter_properties)
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2184, in _build_and_run_instance
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] 'create.error', fault=e)
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] File "/usr/lib/python2.7/dist-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] six.reraise(self.type_, self.value, self.tb)
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2168, in _build_and_run_instance
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] block_device_info=block_device_info)
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] self.gen.throw(type, value, traceback)
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2322, in _build_resources
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] instance_uuid=instance.uuid, reason=msg)
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] BuildAbortException: Build of instance c75e1e46-9368-45c0-b8a5-1592a64a4e76 aborted: Could not clean up failed build, not rescheduling
2015-07-27 23:44:40.977 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76]
<179>Jul 27 23:44:41 GGUTTHLDK014 nova-compute Failed to deallocate networks
2015-07-27 23:44:41.002 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0-b8a5-1592a64a4e76] Traceback (most recent call last):
2015-07-27 23:44:41.002 23919 TRACE nova.compute.manager [instance: c75e1e46-9368-45c0...

tags: added: customer-found
Revision history for this message
Alexander Makarov (amakarov) wrote :

@agrebennikov
6.1 has several ISOs with keystone under Apache.
Can you please provide a link to the ISO you used?
If that's your case, please follow the instructions above: set threads to 1 in the Apache virtual host config.

Revision history for this message
Boris Bobrov (bbobrov) wrote :

Asked to reproduce the issue in MOSS-251 in Jira

Revision history for this message
Leontiy Istomin (listomin) wrote :

6.1 GA doesn't include keystone under apache.
Hasn't been reproduced with 7.0.

Roman Rufanov (rrufanov)
tags: added: support