Several hours after restarting a NovaLink partition, I noticed from the neutron API logs (on the separate management node) that neutron-server could not communicate with the neutron agent on one of the compute nodes:
Jul 17 11:09:11 ip9-114-251-95 neutron-server[32460]: WARNING neutron.plugins.ml2.drivers.mech_agent [None req-de80026a-1666-4334-98af-a2ae86a04750 service neutron] Refusing to bind port d8d6a33f-3783-49e6-bb0d-3bba704fb141 to dead agent: {'binary': u'networking-powervm-sharedethernet-agent', 'description': None, 'availability_zone': None, 'heartbeat_timestamp': datetime.datetime(2018, 7, 13, 20, 46, 18), 'admin_state_up': True, 'alive': False, 'topic': u'N/A', 'host': u'neo21', 'agent_type': u'PowerVM Shared Ethernet agent', 'resource_versions': {}, 'created_at': datetime.datetime(2018, 7, 11, 14, 2, 32), 'started_at': datetime.datetime(2018, 7, 11, 14, 2, 32), 'id': '9e6f3f28-e36d-43a5-b1a5-d2548be4411f', 'configurations': {u'bridge_mappings': {u'default': u'c37d98e8-ccc3-3316-bddc-102189356e6a'}, u'devices': 0}}
After logging into that NovaLink compute node, I saw that systemctl showed <email address hidden> as enabled, but journalctl had no logs for it. I found the following in /var/log/messages, with a timestamp just a few minutes after the NovaLink partition was restarted:
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron Traceback (most recent call last):
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/bin/networking-powervm-sea-agent", line 10, in <module>
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     sys.exit(main())
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/opt/stack/networking-powervm/networking_powervm/plugins/ibm/agent/powervm/sea_agent.py", line 286, in main
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     agent = SharedEthernetNeutronAgent()
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/opt/stack/networking-powervm/networking_powervm/plugins/ibm/agent/powervm/agent_base.py", line 265, in __init__
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     self.br_map = self.parse_bridge_mappings()
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/opt/stack/networking-powervm/networking_powervm/plugins/ibm/agent/powervm/sea_agent.py", line 97, in parse_bridge_mappings
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     ACONF.bridge_mappings)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/opt/stack/networking-powervm/networking_powervm/plugins/ibm/agent/powervm/utils.py", line 63, in parse_sea_mappings
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     nb_wraps = list_bridges(adapter, host_uuid)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/utils/retry.py", line 251, in __retry
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     _raise_exc()
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/utils/retry.py", line 237, in __retry
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     resp = func(*args, **kwds)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/opt/stack/networking-powervm/networking_powervm/plugins/ibm/agent/powervm/utils.py", line 303, in list_bridges
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     parent_uuid=host_uuid)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/wrappers/entry_wrapper.py", line 857, in get
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     read_kwargs)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/wrappers/entry_wrapper.py", line 893, in _read_child
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     child_type=cls.schema_type, **read_kwargs)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/adapter.py", line 762, in read
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     sensitive=sensitive, helpers=helpers)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/adapter.py", line 799, in read_by_path
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     sensitive, helpers=helpers)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/adapter.py", line 823, in _read_by_path
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     sensitive=sensitive)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/adapter.py", line 645, in _request
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     resp = func(method, path, **kwds)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/helpers/log_helper.py", line 150, in log_req_resp
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     response = func(*args, **kwds)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/helpers/vios_busy.py", line 60, in wrapper
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     resp = func(*args, **kwds)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron   File "/usr/local/lib/python2.7/dist-packages/pypowervm/adapter.py", line 432, in request
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron     raise self._get_httperror(resp)
Jul 17 06:30:08 neo21 networking-powervm-sea-agent[1701]: ERROR neutron HttpError: HTTP error 500 for method GET on path /rest/api/uom/ManagedSystem/daf0fca1-d881-3d39-a5f3-fc221870a008/NetworkBridge?group=None: Internal Server Error -- VIOS0005 VIOS_LICENSE: Generic exception occurred in fetching license status for vios vios1 with ID 2 in CEC 8247-21L*212A58A
Restarting the pvm-q-sea-agt service seems to have fixed the issue.
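It looks to me like there is a race condition on startup: the agent queried the NovaLink REST API before the VIOS was ready, got the HTTP 500 above, and the resulting unhandled exception in the agent's constructor killed the service, which then stayed down.
Found in a devstack environment using stable/queens.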
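One possible direction for a fix, as a minimal sketch only: keep retrying the failing start-up call with a growing delay instead of letting the first failures abort the agent. The traceback shows that list_bridges already passes through pypowervm's retry wrapper, so this is not the agent's actual code. The names TransientStartupError and call_with_retry are hypothetical, and the exception class merely stands in for the HTTP 500 seen above.

```python
import time


class TransientStartupError(Exception):
    """Hypothetical stand-in for a transient REST failure (e.g. the HTTP 500 above)."""


def call_with_retry(func, attempts=5, delay=2.0, backoff=2.0, sleep=time.sleep):
    """Call func(), retrying on TransientStartupError with exponential backoff.

    The wait starts at `delay` seconds and is multiplied by `backoff` after
    each failure. The last error is re-raised once `attempts` calls have failed.
    """
    wait = delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientStartupError:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(wait)
            wait *= backoff
```

With such a wrapper around the bridge lookup, an agent started a few minutes too early would wait for the REST API to recover rather than exit. Whether that is preferable to letting systemd restart the service is a design choice for the maintainers.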