Eventlet green threads not released back to the pool leading to choking of new requests
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Cinder | Fix Released | High | Abhishek Kekane |
Icehouse | Fix Released | High | Abhishek Kekane |
Juno | Fix Released | Undecided | Unassigned |
Glance | Fix Released | Medium | Abhishek Kekane |
Icehouse | Fix Committed | Undecided | Abhishek Kekane |
Juno | Fix Released | Undecided | Unassigned |
OpenStack Compute (nova) | Fix Released | High | Abhishek Kekane |
Icehouse | Fix Released | High | Abhishek Kekane |
OpenStack Heat | Fix Released | Medium | Thomas Herve |
Kilo | Fix Released | Medium | Angus Salkeld |
OpenStack Identity (keystone) | Fix Released | Medium | Abhishek Kekane |
Juno | Fix Released | Medium | Abhishek Kekane |
Kilo | Fix Released | Undecided | Unassigned |
OpenStack Security Advisory | Won't Fix | Undecided | Unassigned |
OpenStack Security Notes | Won't Fix | Undecided | Travis McPeak |
OpenStack Shared File Systems Service (Manila) | Fix Released | Medium | Valeriy Ponomaryov |
Sahara | Fix Released | Medium | Sergey Reshetnyak |
neutron | Fix Released | Undecided | Abhishek Kekane |
Icehouse | Fix Released | High | Abhishek Kekane |
Juno | Fix Committed | Undecided | Unassigned |
Bug Description
Currently reproduced on Juno milestone 2, but this issue should be reproducible in all releases since its inception.
It is possible to choke OpenStack API controller services that use the wsgi+eventlet library simply by not closing the client socket connection. Whenever an OpenStack API service (for example, the nova API service) receives a request, the eventlet library takes a green thread from the pool and starts processing the request. Even after the response is sent to the caller, the green thread is not returned to the pool until the client socket connection is closed. A malicious user can therefore send many API requests to the API controller node to determine the wsgi pool size configured for a given service, then send that many requests and, after receiving the responses, wait indefinitely without closing the connections, disrupting service for other tenants. Even when service providers have enabled rate limiting, the API services can still be choked by a group attack (many tenants).
The following program illustrates choking of the nova-api service (but the problem is present in all other OpenStack API services using wsgi+eventlet).
Note: I have explicitly set the wsi_default_
After you run the below program, you should try to invoke API
=======
import time
import requests
from multiprocessing import Process

def request(number):
    #Port is important here
    path = 'http://
    try:
        response = requests.get(path)
        print "RESPONSE %s-%d" % (response.
        #during this sleep time, check if the client socket connection is released or not on the API controller node.
        print "Thread %d complete" % number
    except requests.
        print "Exception occurred %d-%s" % (number, str(ex))

if __name__ == '__main__':
    processes = []
    for number in range(40):
        p = Process(
        p.start()
    for p in processes:
        p.join()
=======
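The starvation itself can be imitated without an OpenStack deployment. The following is only a toy sketch using the Python 3 standard library: ordinary threads stand in for eventlet green threads, and every name in it (`handle`, `start_server`, `request`, `probe`) is hypothetical, not taken from any OpenStack service. It holds `pool_size` idle connections open and then checks whether one more request gets an answer.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor

RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"


def handle(conn, keepalive):
    """Serve one connection; with keepalive the worker stays busy until the peer closes."""
    with conn:
        while conn.recv(1024):      # b"" means the client closed the socket
            conn.sendall(RESPONSE)
            if not keepalive:
                break               # close right after the response, freeing the worker


def start_server(pool_size, keepalive):
    """Start a toy server on an ephemeral port and return that port."""
    listener = socket.create_server(("127.0.0.1", 0))
    pool = ThreadPoolExecutor(max_workers=pool_size)  # stand-in for the wsgi pool

    def accept_loop():
        while True:
            conn, _ = listener.accept()
            pool.submit(handle, conn, keepalive)      # queued if all workers are busy

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener.getsockname()[1]


def request(port, timeout=1.0):
    """Send one request; return the still-open socket, or None if no reply arrives in time."""
    sock = socket.create_connection(("127.0.0.1", port), timeout=timeout)
    sock.sendall(b"GET / HTTP/1.1\r\nHost: x\r\n\r\n")
    try:
        sock.recv(1024)
        return sock
    except socket.timeout:
        sock.close()
        return None


def probe(pool_size, keepalive):
    """Hold pool_size idle connections open, then report whether one more request is served."""
    port = start_server(pool_size, keepalive)
    held = [request(port) for _ in range(pool_size)]  # read the reply, never close
    extra = request(port)
    served = extra is not None
    for sock in held + [extra]:
        if sock is not None:
            sock.close()
    return served
```

Under these assumptions, `probe(3, keepalive=True)` should report that the extra request is not served while the three idle connections are held, whereas `probe(3, keepalive=False)` should report that it is.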
Presently, the wsgi server allows persistent connections if keepalive is set to True, which is the default.
To close the client socket connection explicitly after the response is sent and read successfully by the client, you simply have to set keepalive to False when you create the wsgi server.
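A minimal sketch of that setting, assuming eventlet is installed. `hello_app`, `serve`, the port and the pool size are hypothetical placeholders; `keepalive` and `custom_pool` are keyword arguments of `eventlet.wsgi.server`. This is not the patch that was merged into any of the affected projects.

```python
def hello_app(environ, start_response):
    """Trivial WSGI application standing in for an API service."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def serve(app, host="127.0.0.1", port=8080, pool_size=100):
    """Run app under eventlet with persistent connections disabled."""
    # Imported here so the module can be loaded without eventlet installed.
    import eventlet
    from eventlet import wsgi

    sock = eventlet.listen((host, port))
    pool = eventlet.GreenPool(pool_size)  # bounded pool of green threads
    # keepalive=False makes the server close the client socket once the
    # response has been sent, so the green thread goes back to the pool.
    wsgi.server(sock, app, custom_pool=pool, keepalive=False)


if __name__ == "__main__":
    serve(hello_app)
```

With `keepalive=False` each response is followed by a connection close, which trades some per-request connection setup cost for a worker that cannot be pinned by an idle client.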
Additional information: By default, eventlet sends "Connection: keep-alive" with the response to the client if keepalive is set to True, but it has no capability to set the timeout and max parameters.
For example:
Keep-Alive: timeout=10, max=5
Note: Once keepalive is disabled in all OpenStack API services using the wsgi library, existing applications built on the assumption that OpenStack API services use persistent connections might be affected. They might need to be modified if reconnection logic is not in place, and they might also see slower performance, since the HTTP connection has to be re-established for every request.
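As an illustration of what such reconnection logic might look like on the client side, here is a small Python 3 standard-library sketch. The helper name `get_with_reconnect` and its `retries` parameter are hypothetical, and it only retries GET requests, since blindly repeating a non-idempotent request could be unsafe.

```python
import http.client


def get_with_reconnect(conn, path, retries=1):
    """GET path on conn, reopening the connection if the server has dropped it."""
    for attempt in range(retries + 1):
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            return resp.status, resp.read()
        except (ConnectionError, http.client.HTTPException):
            # The server closed the socket we were trying to reuse.
            # close() resets the connection state; the next request()
            # opens a fresh socket automatically.
            conn.close()
            if attempt == retries:
                raise
```

A caller would create one `http.client.HTTPConnection` and pass it to every call; the helper then works the same whether the server keeps connections alive or closes them after each response.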
Changed in nova: | |
status: | New → Confirmed |
Changed in ossa: | |
status: | Incomplete → Confirmed |
Changed in keystone: | |
status: | New → Confirmed |
Changed in ossa: | |
importance: | Undecided → High |
Changed in cinder: | |
assignee: | nobody → Abhishek Kekane (abhishek-kekane) |
Changed in glance: | |
assignee: | nobody → Abhishek Kekane (abhishek-kekane) |
Changed in keystone: | |
assignee: | nobody → Abhishek Kekane (abhishek-kekane) |
Changed in nova: | |
assignee: | nobody → Abhishek Kekane (abhishek-kekane) |
Changed in neutron: | |
assignee: | nobody → Abhishek Kekane (abhishek-kekane) |
information type: | Private Security → Public |
tags: | added: security |
Changed in ossa: | |
importance: | High → Undecided |
status: | Confirmed → Won't Fix |
Changed in cinder: | |
importance: | Undecided → High |
Changed in nova: | |
importance: | Undecided → High |
Changed in neutron: | |
assignee: | Abhishek Kekane (abhishek-kekane) → Assaf Muller (amuller) |
Changed in neutron: | |
assignee: | Assaf Muller (amuller) → Abhishek Kekane (abhishek-kekane) |
Changed in cinder: | |
milestone: | none → kilo-1 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | none → kilo-1 |
status: | Fix Committed → Fix Released |
Changed in heat: | |
assignee: | nobody → Xurong Yang (idopra) |
Changed in sahara: | |
assignee: | nobody → Xurong Yang (idopra) |
Changed in neutron: | |
milestone: | none → kilo-2 |
status: | Fix Committed → Fix Released |
information type: | Public → Public Security |
information type: | Public Security → Public |
Changed in glance: | |
importance: | Undecided → Medium |
milestone: | none → kilo-3 |
Changed in glance: | |
status: | Fix Committed → Fix Released |
tags: | added: kilo-rc-potential |
Changed in heat: | |
importance: | Undecided → Medium |
milestone: | none → kilo-rc1 |
Changed in heat: | |
assignee: | Xurong Yang (idopra) → Angus Salkeld (asalkeld) |
Changed in sahara: | |
milestone: | none → liberty-1 |
Changed in heat: | |
milestone: | kilo-rc1 → liberty-1 |
Changed in sahara: | |
importance: | Undecided → Medium |
milestone: | liberty-1 → kilo-rc1 |
status: | New → Confirmed |
Changed in sahara: | |
milestone: | kilo-rc1 → liberty-1 |
Changed in glance: | |
milestone: | kilo-3 → 2015.1.0 |
Changed in nova: | |
milestone: | kilo-1 → 2015.1.0 |
Changed in neutron: | |
milestone: | kilo-2 → 2015.1.0 |
Changed in cinder: | |
milestone: | kilo-1 → 2015.1.0 |
tags: | removed: kilo-rc-potential |
tags: | added: kilo-backport-potential |
Changed in heat: | |
assignee: | Angus Salkeld (asalkeld) → Thomas Herve (therve) |
Changed in sahara: | |
milestone: | liberty-1 → liberty-2 |
Changed in keystone: | |
milestone: | none → liberty-1 |
status: | Fix Committed → Fix Released |
Changed in heat: | |
status: | Fix Committed → Fix Released |
Changed in manila: | |
assignee: | nobody → Valeriy Ponomaryov (vponomaryov) |
status: | New → In Progress |
Changed in manila: | |
importance: | Undecided → Medium |
milestone: | none → liberty-2 |
Changed in sahara: | |
milestone: | liberty-2 → liberty-3 |
Changed in manila: | |
status: | Fix Committed → Fix Released |
Changed in sahara: | |
milestone: | liberty-3 → liberty-rc1 |
Changed in ossn: | |
assignee: | nobody → Travis McPeak (travis-mcpeak) |
tags: | removed: in-stable-kilo kilo-backport-potential |
Changed in sahara: | |
milestone: | liberty-rc1 → next |
no longer affects: | keystone/icehouse |
Changed in sahara: | |
assignee: | Xurong Yang (idopra) → Sergey Reshetnyak (sreshetniak) |
milestone: | next → mitaka-1 |
Changed in sahara: | |
status: | Confirmed → In Progress |
Changed in keystone: | |
milestone: | liberty-1 → 8.0.0 |
Changed in heat: | |
milestone: | liberty-1 → 5.0.0 |
Changed in manila: | |
milestone: | liberty-2 → 1.0.0 |
Changed in sahara: | |
status: | Fix Committed → Fix Released |
Thanks for the report! The OSSA task is set to incomplete pending additional details from nova-coresec.
It sounds like a Slowloris-like DoS scenario... What are the side effects of disabling keepalive?