2019-02-20 09:54:52 |
Colleen Murphy |
bug |
|
|
added bug |
2019-02-20 14:33:04 |
Jeremy Stanley |
description |
Description
===========
We have nova-novncproxy configured to use SSL:
```
[DEFAULT]
ssl_only=true
cert = /etc/nova/ssl/certs/signing_cert.pem
key = /etc/nova/ssl/private/signing_key.pem
...
[vnc]
enabled = True
server_listen = "0.0.0.0"
server_proxyclient_address = 192.168.237.81
novncproxy_host = 192.168.237.81
novncproxy_port = 5554
novncproxy_base_url = https://<public hostname>:6080/vnc_auto.html
xvpvncproxy_host = 192.168.237.81
```
We also have haproxy acting as a load balancer, but not terminating SSL. We have an haproxy health check configured like this for nova-novncproxy:
```
listen nova-novncproxy
# irrelevant config...
server <internal hostname> 192.168.237.84:5554 check check-ssl verify none inter 2000 rise 5 fall 2
```
where 192.168.237.81 is a virtual IP address and 192.168.237.84 is the node's individual IP address.
With that health check enabled, we found the nova-novncproxy process CPU spiking and eventually causing the node to hang. With debug logging enabled, we noticed this in the nova-novncproxy logs:
2019-02-19 15:02:44.148 2880 INFO nova.console.websocketproxy [-] WebSocket server settings:
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Listen on 192.168.237.81:5554
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Flash security policy server
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Web server (no directory listings). Web root: /usr/share/novnc
2019-02-19 15:02:44.150 2880 INFO nova.console.websocketproxy [-] - SSL/TLS support
2019-02-19 15:02:44.151 2880 INFO nova.console.websocketproxy [-] - proxying from 192.168.237.81:5554 to None:None
2019-02-19 15:02:45.015 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.184 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.377 2889 DEBUG nova.context [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Found 2 cells: 00000000-0000-0000-0000-000000000000(cell0),9f9825dd-868f-41cc-9c8e-e544f1528d6a(cell1) load_cells /usr/lib/python2.7/site-packages/nova/context.py:479
2019-02-19 15:02:45.380 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.382 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.002s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.393 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.395 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.003s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.437 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.443 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.451 2889 INFO nova.compute.rpcapi [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Automatically selected compute RPC version 5.0 from minimum service version 35
2019-02-19 15:02:45.452 2889 INFO nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] handler exception: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 DEBUG nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] exception vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy Traceback (most recent call last):
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 928, in top_new_client
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy client = self.do_handshake(startsock, address)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 858, in do_handshake
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.RequestHandlerClass(retsock, address, self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/nova/console/websocketproxy.py", line 311, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy websockify.ProxyRequestHandler.__init__(self, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 113, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.__init__(self, req, addr, server)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/SocketServer.py", line 652, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 579, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.handle(self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle_one_request()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 310, in handle_one_request
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.raw_requestline = self.rfile.readline(65537)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/socket.py", line 480, in readline
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy data = self._sock.recv(self._rbufsize)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 190, in recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return self._base_recv(buflen, flags, into=False)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 217, in _base_recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy read = self.read(nbytes)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 135, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy super(GreenSSLSocket, self).read, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 109, in _call_trampolining
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return func(*a, **kw)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/ssl.py", line 673, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy v = self._sslobj.read(len)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy error: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy
2019-02-19 15:02:47.037 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
(paste: http://paste.openstack.org/show/745451/)
This sequence starting with the "new handler Process" repeats continuously. It seems that the haproxy health checks initiate an SSL connection but then immediately send a TCP RST:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=fd29cc537b8511db6e256529ded625c8e7f856d0
For most services this does not seem to be an issue, but for nova-novncproxy it repeatedly initializes NovaProxyRequestHandler which creates a full nova.compute.rpcapi.ComputeAPI instance which very quickly starts to consume significant CPU and overtake the host.
Note that we tried upgrading to HEAD of websockify and eventlet which did not improve the issue.
Our workaround was to turn off check-ssl in haproxy and use a basic tcp check, but we're concerned that nova-novncproxy remains vulnerable to a DOS attack given how easy it is for haproxy to overload the service. For that reason I'm opening this initially as a security bug, though you could perhaps argue that it's no secret that making un-ratelimited requests at any service will cause high load.
Steps to reproduce
==================
1. Configure nova-novncproxy to use SSL by setting the cert= and key= parameters in [DEFAULT] and turn on debug logging.
2. You can simulate the haproxy SSL health check with this python script:
import socket, ssl, struct, time
host = '192.168.237.81'
port = 5554
while True:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(sock)
ssl_sock.connect((host, port))
ssl_sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))
sock.close()
time.sleep(2)
Expected result
===============
nova-novncproxy should gracefully handle the RST and not start overutilizing CPU. It should probably hold off on initializing database connections and such until a meaningful request other than an SSL HELLO is received.
Actual result
=============
The nova-novncproxy process quickly jumps to the top of the CPU% metrics of process analyzers like top and htop and if left unattended on a server with few cores will cause the server's overall performance to be degraded.
Environment
===========
We found this on the latest of the stable/rocky branch on SLES:
# cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
# uname -a
Linux d52-54-77-77-01-01 4.12.14-95.6-default #1 SMP Thu Jan 17 06:04:39 UTC 2019 (6af4ef8) x86_64 x86_64 x86_64 GNU/Linux
# zypper info openstack-nova
Information for package openstack-nova:
---------------------------------------
Repository : Cloud
Name : openstack-nova
Version : 18.1.1~dev47-749.1
Arch : noarch
Vendor : obs://build.suse.de/Devel:Cloud
Support Level : Level 3
Installed Size : 444.7 KiB
Installed : Yes
Status : up-to-date
Source package : openstack-nova-18.1.1~dev47-749.1.src
Summary : OpenStack Compute (Nova)
# zypper info haproxy
Information for package haproxy:
--------------------------------
Repository : Cloud
Name : haproxy
Version : 1.6.11-10.2
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Support Level : Level 3
Installed Size : 3.1 MiB
Installed : Yes
Status : up-to-date
Source package : haproxy-1.6.11-10.2.src |
This issue is being treated as a potential security risk under embargo. Please do not make any public mention of embargoed (private) security vulnerabilities before their coordinated publication by the OpenStack Vulnerability Management Team in the form of an official OpenStack Security Advisory. This includes discussion of the bug or associated fixes in public forums such as mailing lists, code review systems and bug trackers. Please also avoid private disclosure to other individuals not already approved for access to this information, and provide this same reminder to those who are made aware of the issue prior to publication. All discussion should remain confined to this private bug report, and any proposed fixes should be added to the bug as attachments.
Description
===========
We have nova-novncproxy configured to use SSL:
```
[DEFAULT]
ssl_only=true
cert = /etc/nova/ssl/certs/signing_cert.pem
key = /etc/nova/ssl/private/signing_key.pem
...
[vnc]
enabled = True
server_listen = "0.0.0.0"
server_proxyclient_address = 192.168.237.81
novncproxy_host = 192.168.237.81
novncproxy_port = 5554
novncproxy_base_url = https://<public hostname>:6080/vnc_auto.html
xvpvncproxy_host = 192.168.237.81
```
We also have haproxy acting as a load balancer, but not terminating SSL. We have an haproxy health check configured like this for nova-novncproxy:
```
listen nova-novncproxy
# irrelevant config...
server <internal hostname> 192.168.237.84:5554 check check-ssl verify none inter 2000 rise 5 fall 2
```
where 192.168.237.81 is a virtual IP address and 192.168.237.84 is the node's individual IP address.
With that health check enabled, we found the nova-novncproxy process CPU spiking and eventually causing the node to hang. With debug logging enabled, we noticed this in the nova-novncproxy logs:
2019-02-19 15:02:44.148 2880 INFO nova.console.websocketproxy [-] WebSocket server settings:
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Listen on 192.168.237.81:5554
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Flash security policy server
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Web server (no directory listings). Web root: /usr/share/novnc
2019-02-19 15:02:44.150 2880 INFO nova.console.websocketproxy [-] - SSL/TLS support
2019-02-19 15:02:44.151 2880 INFO nova.console.websocketproxy [-] - proxying from 192.168.237.81:5554 to None:None
2019-02-19 15:02:45.015 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.184 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.377 2889 DEBUG nova.context [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Found 2 cells: 00000000-0000-0000-0000-000000000000(cell0),9f9825dd-868f-41cc-9c8e-e544f1528d6a(cell1) load_cells /usr/lib/python2.7/site-packages/nova/context.py:479
2019-02-19 15:02:45.380 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.382 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.002s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.393 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.395 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.003s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.437 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.443 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.451 2889 INFO nova.compute.rpcapi [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Automatically selected compute RPC version 5.0 from minimum service version 35
2019-02-19 15:02:45.452 2889 INFO nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] handler exception: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 DEBUG nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] exception vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy Traceback (most recent call last):
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 928, in top_new_client
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy client = self.do_handshake(startsock, address)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 858, in do_handshake
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.RequestHandlerClass(retsock, address, self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/nova/console/websocketproxy.py", line 311, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy websockify.ProxyRequestHandler.__init__(self, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 113, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.__init__(self, req, addr, server)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/SocketServer.py", line 652, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 579, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.handle(self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle_one_request()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 310, in handle_one_request
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.raw_requestline = self.rfile.readline(65537)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/socket.py", line 480, in readline
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy data = self._sock.recv(self._rbufsize)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 190, in recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return self._base_recv(buflen, flags, into=False)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 217, in _base_recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy read = self.read(nbytes)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 135, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy super(GreenSSLSocket, self).read, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 109, in _call_trampolining
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return func(*a, **kw)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/ssl.py", line 673, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy v = self._sslobj.read(len)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy error: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy
2019-02-19 15:02:47.037 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
(paste: http://paste.openstack.org/show/745451/)
This sequence starting with the "new handler Process" repeats continuously. It seems that the haproxy health checks initiate an SSL connection but then immediately send a TCP RST:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=fd29cc537b8511db6e256529ded625c8e7f856d0
For most services this does not seem to be an issue, but for nova-novncproxy it repeatedly initializes NovaProxyRequestHandler which creates a full nova.compute.rpcapi.ComputeAPI instance which very quickly starts to consume significant CPU and overtake the host.
Note that we tried upgrading to HEAD of websockify and eventlet which did not improve the issue.
Our workaround was to turn off check-ssl in haproxy and use a basic tcp check, but we're concerned that nova-novncproxy remains vulnerable to a DOS attack given how easy it is for haproxy to overload the service. For that reason I'm opening this initially as a security bug, though you could perhaps argue that it's no secret that making un-ratelimited requests at any service will cause high load.
Steps to reproduce
==================
1. Configure nova-novncproxy to use SSL by setting the cert= and key= parameters in [DEFAULT] and turn on debug logging.
2. You can simulate the haproxy SSL health check with this python script:
import socket, ssl, struct, time
host = '192.168.237.81'
port = 5554
while True:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(sock)
ssl_sock.connect((host, port))
ssl_sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))
sock.close()
time.sleep(2)
Expected result
===============
nova-novncproxy should gracefully handle the RST and not start overutilizing CPU. It should probably hold off on initializing database connections and such until a meaningful request other than an SSL HELLO is received.
Actual result
=============
The nova-novncproxy process quickly jumps to the top of the CPU% metrics of process analyzers like top and htop and if left unattended on a server with few cores will cause the server's overall performance to be degraded.
Environment
===========
We found this on the latest of the stable/rocky branch on SLES:
# cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
# uname -a
Linux d52-54-77-77-01-01 4.12.14-95.6-default #1 SMP Thu Jan 17 06:04:39 UTC 2019 (6af4ef8) x86_64 x86_64 x86_64 GNU/Linux
# zypper info openstack-nova
Information for package openstack-nova:
---------------------------------------
Repository : Cloud
Name : openstack-nova
Version : 18.1.1~dev47-749.1
Arch : noarch
Vendor : obs://build.suse.de/Devel:Cloud
Support Level : Level 3
Installed Size : 444.7 KiB
Installed : Yes
Status : up-to-date
Source package : openstack-nova-18.1.1~dev47-749.1.src
Summary : OpenStack Compute (Nova)
# zypper info haproxy
Information for package haproxy:
--------------------------------
Repository : Cloud
Name : haproxy
Version : 1.6.11-10.2
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Support Level : Level 3
Installed Size : 3.1 MiB
Installed : Yes
Status : up-to-date
Source package : haproxy-1.6.11-10.2.src |
|
2019-02-20 14:33:20 |
Jeremy Stanley |
bug task added |
|
ossa |
|
2019-02-20 14:33:32 |
Jeremy Stanley |
ossa: status |
New |
Incomplete |
|
2019-02-20 14:33:48 |
Jeremy Stanley |
bug |
|
|
added subscriber Nova Core security contacts |
2019-03-15 21:28:20 |
melanie witt |
attachment added |
|
websocketproxy.patch https://bugs.launchpad.net/nova/+bug/1816727/+attachment/5246610/+files/websocketproxy.patch |
|
2019-03-18 23:14:54 |
melanie witt |
bug |
|
|
added subscriber melanie witt |
2019-03-18 23:32:49 |
melanie witt |
attachment removed |
websocketproxy.patch https://bugs.launchpad.net/nova/+bug/1816727/+attachment/5246610/+files/websocketproxy.patch |
|
|
2019-03-18 23:37:08 |
melanie witt |
attachment added |
|
websocketproxy.patch https://bugs.launchpad.net/nova/+bug/1816727/+attachment/5247431/+files/websocketproxy.patch |
|
2019-03-20 18:28:02 |
Jeremy Stanley |
description |
This issue is being treated as a potential security risk under embargo. Please do not make any public mention of embargoed (private) security vulnerabilities before their coordinated publication by the OpenStack Vulnerability Management Team in the form of an official OpenStack Security Advisory. This includes discussion of the bug or associated fixes in public forums such as mailing lists, code review systems and bug trackers. Please also avoid private disclosure to other individuals not already approved for access to this information, and provide this same reminder to those who are made aware of the issue prior to publication. All discussion should remain confined to this private bug report, and any proposed fixes should be added to the bug as attachments.
Description
===========
We have nova-novncproxy configured to use SSL:
```
[DEFAULT]
ssl_only=true
cert = /etc/nova/ssl/certs/signing_cert.pem
key = /etc/nova/ssl/private/signing_key.pem
...
[vnc]
enabled = True
server_listen = "0.0.0.0"
server_proxyclient_address = 192.168.237.81
novncproxy_host = 192.168.237.81
novncproxy_port = 5554
novncproxy_base_url = https://<public hostname>:6080/vnc_auto.html
xvpvncproxy_host = 192.168.237.81
```
We also have haproxy acting as a load balancer, but not terminating SSL. We have an haproxy health check configured like this for nova-novncproxy:
```
listen nova-novncproxy
# irrelevant config...
server <internal hostname> 192.168.237.84:5554 check check-ssl verify none inter 2000 rise 5 fall 2
```
where 192.168.237.81 is a virtual IP address and 192.168.237.84 is the node's individual IP address.
With that health check enabled, we found the nova-novncproxy process CPU spiking and eventually causing the node to hang. With debug logging enabled, we noticed this in the nova-novncproxy logs:
2019-02-19 15:02:44.148 2880 INFO nova.console.websocketproxy [-] WebSocket server settings:
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Listen on 192.168.237.81:5554
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Flash security policy server
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Web server (no directory listings). Web root: /usr/share/novnc
2019-02-19 15:02:44.150 2880 INFO nova.console.websocketproxy [-] - SSL/TLS support
2019-02-19 15:02:44.151 2880 INFO nova.console.websocketproxy [-] - proxying from 192.168.237.81:5554 to None:None
2019-02-19 15:02:45.015 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.184 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.377 2889 DEBUG nova.context [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Found 2 cells: 00000000-0000-0000-0000-000000000000(cell0),9f9825dd-868f-41cc-9c8e-e544f1528d6a(cell1) load_cells /usr/lib/python2.7/site-packages/nova/context.py:479
2019-02-19 15:02:45.380 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.382 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.002s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.393 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.395 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.003s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.437 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.443 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.451 2889 INFO nova.compute.rpcapi [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Automatically selected compute RPC version 5.0 from minimum service version 35
2019-02-19 15:02:45.452 2889 INFO nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] handler exception: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 DEBUG nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] exception vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy Traceback (most recent call last):
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 928, in top_new_client
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy client = self.do_handshake(startsock, address)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 858, in do_handshake
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.RequestHandlerClass(retsock, address, self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/nova/console/websocketproxy.py", line 311, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy websockify.ProxyRequestHandler.__init__(self, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 113, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.__init__(self, req, addr, server)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/SocketServer.py", line 652, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 579, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.handle(self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle_one_request()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 310, in handle_one_request
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.raw_requestline = self.rfile.readline(65537)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/socket.py", line 480, in readline
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy data = self._sock.recv(self._rbufsize)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 190, in recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return self._base_recv(buflen, flags, into=False)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 217, in _base_recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy read = self.read(nbytes)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 135, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy super(GreenSSLSocket, self).read, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 109, in _call_trampolining
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return func(*a, **kw)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/ssl.py", line 673, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy v = self._sslobj.read(len)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy error: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy
2019-02-19 15:02:47.037 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
(paste: http://paste.openstack.org/show/745451/)
This sequence starting with the "new handler Process" repeats continuously. It seems that the haproxy health checks initiate an SSL connection but then immediately send a TCP RST:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=fd29cc537b8511db6e256529ded625c8e7f856d0
For most services this does not seem to be an issue, but for nova-novncproxy it repeatedly initializes NovaProxyRequestHandler which creates a full nova.compute.rpcapi.ComputeAPI instance which very quickly starts to consume significant CPU and overtake the host.
Note that we tried upgrading to HEAD of websockify and eventlet which did not improve the issue.
Our workaround was to turn off check-ssl in haproxy and use a basic tcp check, but we're concerned that nova-novncproxy remains vulnerable to a DOS attack given how easy it is for haproxy to overload the service. For that reason I'm opening this initially as a security bug, though you could perhaps argue that it's no secret that making un-ratelimited requests at any service will cause high load.
Steps to reproduce
==================
1. Configure nova-novncproxy to use SSL by setting the cert= and key= parameters in [DEFAULT] and turn on debug logging.
2. You can simulate the haproxy SSL health check with this python script:
import socket, ssl, struct, time
host = '192.168.237.81'
port = 5554
while True:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(sock)
ssl_sock.connect((host, port))
ssl_sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))
sock.close()
time.sleep(2)
Expected result
===============
nova-novncproxy should gracefully handle the RST and not start overutilizing CPU. It should probably hold off on initializing database connections and such until a meaningful request other than an SSL HELLO is received.
Actual result
=============
The nova-novncproxy process quickly jumps to the top of the CPU% metrics of process analyzers like top and htop and if left unattended on a server with few cores will cause the server's overall performance to be degraded.
Environment
===========
We found this on the latest of the stable/rocky branch on SLES:
# cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
# uname -a
Linux d52-54-77-77-01-01 4.12.14-95.6-default #1 SMP Thu Jan 17 06:04:39 UTC 2019 (6af4ef8) x86_64 x86_64 x86_64 GNU/Linux
# zypper info openstack-nova
Information for package openstack-nova:
---------------------------------------
Repository : Cloud
Name : openstack-nova
Version : 18.1.1~dev47-749.1
Arch : noarch
Vendor : obs://build.suse.de/Devel:Cloud
Support Level : Level 3
Installed Size : 444.7 KiB
Installed : Yes
Status : up-to-date
Source package : openstack-nova-18.1.1~dev47-749.1.src
Summary : OpenStack Compute (Nova)
# zypper info haproxy
Information for package haproxy:
--------------------------------
Repository : Cloud
Name : haproxy
Version : 1.6.11-10.2
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Support Level : Level 3
Installed Size : 3.1 MiB
Installed : Yes
Status : up-to-date
Source package : haproxy-1.6.11-10.2.src |
Description
===========
We have nova-novncproxy configured to use SSL:
```
[DEFAULT]
ssl_only=true
cert = /etc/nova/ssl/certs/signing_cert.pem
key = /etc/nova/ssl/private/signing_key.pem
...
[vnc]
enabled = True
server_listen = "0.0.0.0"
server_proxyclient_address = 192.168.237.81
novncproxy_host = 192.168.237.81
novncproxy_port = 5554
novncproxy_base_url = https://<public hostname>:6080/vnc_auto.html
xvpvncproxy_host = 192.168.237.81
```
We also have haproxy acting as a load balancer, but not terminating SSL. We have an haproxy health check configured like this for nova-novncproxy:
```
listen nova-novncproxy
# irrelevant config...
server <internal hostname> 192.168.237.84:5554 check check-ssl verify none inter 2000 rise 5 fall 2
```
where 192.168.237.81 is a virtual IP address and 192.168.237.84 is the node's individual IP address.
With that health check enabled, we found the nova-novncproxy process CPU spiking and eventually causing the node to hang. With debug logging enabled, we noticed this in the nova-novncproxy logs:
2019-02-19 15:02:44.148 2880 INFO nova.console.websocketproxy [-] WebSocket server settings:
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Listen on 192.168.237.81:5554
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Flash security policy server
2019-02-19 15:02:44.149 2880 INFO nova.console.websocketproxy [-] - Web server (no directory listings). Web root: /usr/share/novnc
2019-02-19 15:02:44.150 2880 INFO nova.console.websocketproxy [-] - SSL/TLS support
2019-02-19 15:02:44.151 2880 INFO nova.console.websocketproxy [-] - proxying from 192.168.237.81:5554 to None:None
2019-02-19 15:02:45.015 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.184 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.377 2889 DEBUG nova.context [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Found 2 cells: 00000000-0000-0000-0000-000000000000(cell0),9f9825dd-868f-41cc-9c8e-e544f1528d6a(cell1) load_cells /usr/lib/python2.7/site-packages/nova/context.py:479
2019-02-19 15:02:45.380 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.382 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "00000000-0000-0000-0000-000000000000" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.002s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.393 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packag
es/oslo_concurrency/lockutils.py:273
2019-02-19 15:02:45.395 2889 DEBUG oslo_concurrency.lockutils [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Lock "9f9825dd-868f-41cc-9c8e-e544f1528d6a" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.003s inner /usr/lib/python2.7/site-packages
/oslo_concurrency/lockutils.py:285
2019-02-19 15:02:45.437 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.443 2889 DEBUG oslo_db.sqlalchemy.engines [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTI
TUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2019-02-19 15:02:45.451 2889 INFO nova.compute.rpcapi [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] Automatically selected compute RPC version 5.0 from minimum service version 35
2019-02-19 15:02:45.452 2889 INFO nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] handler exception: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 DEBUG nova.console.websocketproxy [req-8552f48d-8c1c-4330-aaec-64d544c6cc1e - - - - -] exception vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy Traceback (most recent call last):
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 928, in top_new_client
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy client = self.do_handshake(startsock, address)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 858, in do_handshake
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.RequestHandlerClass(retsock, address, self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/nova/console/websocketproxy.py", line 311, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy websockify.ProxyRequestHandler.__init__(self, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 113, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.__init__(self, req, addr, server)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/SocketServer.py", line 652, in __init__
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/websockify/websocket.py", line 579, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.handle(self)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.handle_one_request()
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/BaseHTTPServer.py", line 310, in handle_one_request
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy self.raw_requestline = self.rfile.readline(65537)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/socket.py", line 480, in readline
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy data = self._sock.recv(self._rbufsize)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 190, in recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return self._base_recv(buflen, flags, into=False)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 217, in _base_recv
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy read = self.read(nbytes)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 135, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy super(GreenSSLSocket, self).read, *args, **kwargs)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib/python2.7/site-packages/eventlet/green/ssl.py", line 109, in _call_trampolining
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy return func(*a, **kw)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/ssl.py", line 673, in read
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy v = self._sslobj.read(len)
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy error: [Errno 104] Connection reset by peer
2019-02-19 15:02:45.452 2889 ERROR nova.console.websocketproxy
2019-02-19 15:02:47.037 2880 DEBUG nova.console.websocketproxy [-] 192.168.237.85: new handler Process vmsg /usr/lib/python2.7/site-packages/websockify/websocket.py:873
(paste: http://paste.openstack.org/show/745451/)
This sequence starting with the "new handler Process" repeats continuously. It seems that the haproxy health checks initiate an SSL connection but then immediately send a TCP RST:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=fd29cc537b8511db6e256529ded625c8e7f856d0
For most services this does not seem to be an issue, but for nova-novncproxy it repeatedly initializes NovaProxyRequestHandler which creates a full nova.compute.rpcapi.ComputeAPI instance which very quickly starts to consume significant CPU and overtake the host.
Note that we tried upgrading to HEAD of websockify and eventlet which did not improve the issue.
Our workaround was to turn off check-ssl in haproxy and use a basic tcp check, but we're concerned that nova-novncproxy remains vulnerable to a DOS attack given how easy it is for haproxy to overload the service. For that reason I'm opening this initially as a security bug, though you could perhaps argue that it's no secret that making un-ratelimited requests at any service will cause high load.
Steps to reproduce
==================
1. Configure nova-novncproxy to use SSL by setting the cert= and key= parameters in [DEFAULT] and turn on debug logging.
2. You can simulate the haproxy SSL health check with this python script:
import socket, ssl, struct, time
host = '192.168.237.81'
port = 5554
while True:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(sock)
ssl_sock.connect((host, port))
ssl_sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))
sock.close()
time.sleep(2)
Expected result
===============
nova-novncproxy should gracefully handle the RST and not start overutilizing CPU. It should probably hold off on initializing database connections and such until a meaningful request other than an SSL HELLO is received.
Actual result
=============
The nova-novncproxy process quickly jumps to the top of the CPU% metrics of process analyzers like top and htop and if left unattended on a server with few cores will cause the server's overall performance to be degraded.
Environment
===========
We found this on the latest of the stable/rocky branch on SLES:
# cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
# uname -a
Linux d52-54-77-77-01-01 4.12.14-95.6-default #1 SMP Thu Jan 17 06:04:39 UTC 2019 (6af4ef8) x86_64 x86_64 x86_64 GNU/Linux
# zypper info openstack-nova
Information for package openstack-nova:
---------------------------------------
Repository : Cloud
Name : openstack-nova
Version : 18.1.1~dev47-749.1
Arch : noarch
Vendor : obs://build.suse.de/Devel:Cloud
Support Level : Level 3
Installed Size : 444.7 KiB
Installed : Yes
Status : up-to-date
Source package : openstack-nova-18.1.1~dev47-749.1.src
Summary : OpenStack Compute (Nova)
# zypper info haproxy
Information for package haproxy:
--------------------------------
Repository : Cloud
Name : haproxy
Version : 1.6.11-10.2
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Support Level : Level 3
Installed Size : 3.1 MiB
Installed : Yes
Status : up-to-date
Source package : haproxy-1.6.11-10.2.src |
|
2019-03-20 18:28:31 |
Jeremy Stanley |
ossa: status |
Incomplete |
Won't Fix |
|
2019-03-20 18:28:47 |
Jeremy Stanley |
tags |
|
security |
|
2019-03-20 18:29:08 |
Jeremy Stanley |
information type |
Private Security |
Public |
|
2019-03-20 20:39:25 |
OpenStack Infra |
nova: status |
New |
In Progress |
|
2019-03-20 20:39:25 |
OpenStack Infra |
nova: assignee |
|
melanie witt (melwitt) |
|
2019-03-20 20:41:38 |
melanie witt |
nova: importance |
Undecided |
Medium |
|
2019-03-20 20:41:49 |
melanie witt |
tags |
security |
console security |
|
2019-03-29 21:06:58 |
OpenStack Infra |
nova: status |
In Progress |
Fix Released |
|
2019-04-02 20:35:41 |
melanie witt |
nominated for series |
|
nova/rocky |
|
2019-04-02 20:35:41 |
melanie witt |
bug task added |
|
nova/rocky |
|
2019-04-02 20:35:41 |
melanie witt |
nominated for series |
|
nova/stein |
|
2019-04-02 20:35:41 |
melanie witt |
bug task added |
|
nova/stein |
|
2019-04-02 20:36:54 |
melanie witt |
nova/stein: importance |
Undecided |
Medium |
|
2019-04-02 20:36:54 |
melanie witt |
nova/stein: status |
New |
In Progress |
|
2019-04-02 20:36:54 |
melanie witt |
nova/stein: assignee |
|
Colleen Murphy (krinkle) |
|
2019-04-02 20:37:10 |
melanie witt |
nova/rocky: importance |
Undecided |
Medium |
|
2019-04-02 20:37:10 |
melanie witt |
nova/rocky: status |
New |
In Progress |
|
2019-04-02 20:37:10 |
melanie witt |
nova/rocky: assignee |
|
Colleen Murphy (krinkle) |
|
2019-04-10 22:02:16 |
OpenStack Infra |
nova/stein: status |
In Progress |
Fix Committed |
|
2019-04-12 02:51:25 |
OpenStack Infra |
nova/rocky: status |
In Progress |
Fix Committed |
|
2019-04-26 22:17:34 |
melanie witt |
nominated for series |
|
nova/queens |
|
2019-04-26 22:17:34 |
melanie witt |
bug task added |
|
nova/queens |
|
2019-04-26 22:21:01 |
melanie witt |
bug task deleted |
nova/queens |
|
|