When updating large Heat stacks with many nested stacks, we create auth plugins and request a number of trust tokens concurrently for every nested stack, and we see ETIMEDOUT errors [1] despite having a large number of keystone workers. Adding retries in session.request() helps fix those errors. At present there does not seem to be a way to pass connect_retries when doing endpoint discovery or when requesting tokens through the plugin's get_auth_ref().
It would be a good idea to initialize the session with connect_retries and use that value unless one is specified in the request.
This seems to have been mentioned as future work in https://github.com/openstack/keystoneauth/commit/34c94a6a207bfa3cf665852b0c84bb47f37e4e0a.
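To make the proposal concrete, here is a minimal, self-contained sketch of a session-level connect_retries default that a per-request value can override. It is not keystoneauth code: RetryingSession, its send callable, and retry_interval are hypothetical names used only for illustration, and the built-in ConnectionError stands in for keystoneauth's ConnectFailure.

```python
import time


class RetryingSession:
    """Hypothetical stand-in for a session object (not the keystoneauth API).

    It only illustrates the idea: a retry count set once at construction
    time is used for every request unless the caller overrides it.
    """

    def __init__(self, send, connect_retries=0, retry_interval=0.0):
        # `send` is an assumed callable that performs the actual request and
        # raises ConnectionError when the connection cannot be established.
        self._send = send
        self.connect_retries = connect_retries
        self.retry_interval = retry_interval

    def request(self, url, connect_retries=None):
        # Fall back to the session-wide default when the caller gives none.
        retries = self.connect_retries if connect_retries is None else connect_retries
        attempt = 0
        while True:
            try:
                return self._send(url)
            except ConnectionError:
                if attempt >= retries:
                    # Retries exhausted: surface the original failure.
                    raise
                attempt += 1
                time.sleep(self.retry_interval)
```

With such a default, internal calls that take no retry argument of their own (discovery, token requests made by a plugin) would inherit the session's value, while a caller could still pass connect_retries=0 on a single request to disable retries there.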
[1]
2019-08-14 02:27:08.142 164299 WARNING keystoneauth.identity.generic.base [req-8ab27215-d75f-43c0-8b5f-f85d7c2cba3e - admin - - -] Failed to discover available identity versions when contacting http://192.168.0.1:35357. Attempting to parse version from URL.: ConnectFailure: Unable to establish connection to http://192.168.0.1:35357: HTTPConnectionPool(host='192.168.0.1', port=35357): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f435def5f10>: Failed to establish a new connection: [Errno 110] ETIMEDOUT',))
2019-08-14 02:27:08.142 164298 DEBUG heat.engine.scheduler [req-8ab27215-d75f-43c0-8b5f-f85d7c2cba3e - admin - default default] Task update from TemplateResource "14" [f751b50e-1dea-4f32-b4fb-8f26863e8f69] Stack "overcloud-5039msCompute-bhf4fit4v7l3" [75b36683-3847-4894-b935-b21909a104e5] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-08-14 02:27:08.143 164307 DEBUG heat.engine.scheduler [req-8ab27215-d75f-43c0-8b5f-f85d7c2cba3e - admin - default default] Task update from TemplateResource "NodeUserData" [cbd22f69-1773-44f3-aadf-48d7d63c4e76] Stack "overcloud-5039msCompute-bhf4fit4v7l3-83-iuqbefazivqr" [a5e591d6-5fcc-4eec-a549-d86634daada4] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-08-14 02:27:08.143 164297 DEBUG heat.engine.scheduler [req-8ab27215-d75f-43c0-8b5f-f85d7c2cba3e - admin - default default] Task update from TemplateResource "NodeTimesyncUserData" [16b08cd3-0c61-4922-8e4e-7b8bc629123c] Stack "overcloud-1029uCompute-xbw6u6ajet4g-27-zomtrymyl4we" [cd9c6798-7eb0-4288-96a2-03bbf11d197a] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server [req-8ab27215-d75f-43c0-8b5f-f85d7c2cba3e - admin - - -] Exception during message handling: DiscoveryFailure: Could not determine a suitable URL for the plugin
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 158, in wrapper
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server result = f(*args, **kwargs)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 409, in wrapped
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server return func(self, ctx, *args, **kwargs)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 998, in update_stack
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server is_registered_policy=True)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/policy.py", line 182, in enforce_stack
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server is_registered_policy=is_registered_policy)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/policy.py", line 163, in enforce
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server is_registered_policy=is_registered_policy)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/policy.py", line 148, in _enforce
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server target=target, is_registered_policy=is_registered_policy)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/policy.py", line 106, in enforce
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server is_registered_policy=is_registered_policy)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/policy.py", line 77, in _check
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server credentials = context.to_policy_values()
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 208, in to_policy_values
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server policy = super(RequestContext, self).to_policy_values()
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_context/context.py", line 314, in to_policy_values
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server 'roles': self.roles,
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 301, in roles
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server self._load_keystone_data()
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 292, in _load_keystone_data
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server auth_ref = self.auth_plugin.get_access(self.keystone_session)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 134, in get_access
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server self.auth_ref = self.get_auth_ref(session)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 199, in get_auth_ref
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server self._plugin = self._do_create_plugin(session)
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 194, in _do_create_plugin
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server raise exceptions.DiscoveryFailure('Could not determine a suitable URL '
2019-08-14 02:27:08.143 164299 ERROR oslo_messaging.rpc.server DiscoveryFailure: Could not determine a suitable URL for the plugin
Fix proposed to branch: master
Review: https://review.opendev.org/676648