You can reproduce this by issuing a GET request for a few hundred MB file and never consuming the response, but keep the client socket open. Swift will log a 499 but the socket does not always close.
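For anyone who wants to try the reproduction, here is a rough sketch of a client that sends a GET and then never reads. It uses only the standard library. The function names, the token header handling, and the hold time are my own placeholders, not anything from Swift; the host, port, and object path have to be filled in for your own cluster.

```python
import socket
import time


def build_get_request(host, path, token=None):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Connection: keep-alive",
    ]
    if token:
        # Assumption: the cluster accepts a token in X-Auth-Token.
        lines.append("X-Auth-Token: %s" % token)
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def stall_on_get(host, port, path, token=None, hold_seconds=3600):
    """Send a GET, then hold the socket open without ever reading from it."""
    sock = socket.create_connection((host, port))
    try:
        sock.sendall(build_get_request(host, path, token))
        # Deliberately never call recv(): the client's receive buffer fills
        # up and the server side can no longer make progress on its writes.
        time.sleep(hold_seconds)
    finally:
        sock.close()
```

Calling something like stall_on_get("proxy.example", 8080, "/v1/AUTH_test/container/bigfile") against a proxy directly (no load balancer in front) and then watching the proxy's open sockets should show whether the connection is released after the 499 is logged.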
ChunkWriteTimeout is meant to protect the proxy from a slow reading client:
https://github.com/openstack/swift/blob/master/swift/proxy/controllers/base.py#L889-L905
Sometimes when this exception is thrown there is still data in the process socket buffer, so when eventlet tries to close the socket it first flushes it:
https://github.com/eventlet/eventlet/blob/master/eventlet/wsgi.py#L631
https://hg.python.org/cpython/file/v2.7.11/Lib/SocketServer.py#l711
The problem is that if the client is not consuming the socket buffer then that flush will wait forever; it's trying to write on a socket that just threw a timeout trying to write! The flush write is not protected by any timeout.
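To illustrate the general behaviour with plain standard-library sockets (this is not eventlet's code, and green sockets may differ in detail), the sketch below fills the send buffer of a socket whose peer never reads, then attempts one more write. With a timeout set on the socket the write gives up; without one it would block indefinitely. All function names here are hypothetical.

```python
import socket


def fill_send_buffer(sock, chunk=b"x" * 65536):
    """Write until the kernel refuses more data; return the bytes accepted."""
    sock.setblocking(False)
    sent = 0
    try:
        while True:
            sent += sock.send(chunk)
    except BlockingIOError:
        # The buffers are full because the peer is not reading.
        pass
    return sent


def write_with_timeout(sock, data, timeout):
    """Attempt a write bounded by `timeout`; return True if it completed."""
    sock.settimeout(timeout)
    try:
        sock.sendall(data)
        return True
    except socket.timeout:
        return False


def demo():
    """Simulate a stalled reader and report whether a late write completes."""
    writer, reader = socket.socketpair()
    try:
        fill_send_buffer(writer)
        # The reader never calls recv(), so this write cannot finish and the
        # timeout is the only thing that stops it from waiting forever.
        return write_with_timeout(writer, b"y" * 65536, timeout=0.2)
    finally:
        writer.close()
        reader.close()
```

The point of the sketch is only that the final flush needs its own bound: the earlier timeout fired on a write, and the flush is another write to the same stalled socket.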
As far as I can tell, this WRITE_TIMEOUT does nothing:
https://github.com/openstack/swift/blob/master/swift/common/wsgi.py#L407
wsgi.server() takes a socket_timeout that might be what we're after?
https://github.com/openstack/swift/blob/master/swift/common/wsgi.py#L437-L440
Even with socket_timeout, eventlet needs to be patched. This should be in a finally block:
https://github.com/eventlet/eventlet/blob/master/eventlet/wsgi.py#L636-L637
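A minimal sketch of the shape I have in mind for the finally block, written as a standalone function rather than as a patch to eventlet. The names finish_connection, StalledFile, and RecordingSocket are hypothetical and exist only to show the pattern: the close must run even when the flush raises.

```python
import socket


def finish_connection(wfile, connection):
    """Flush pending output, then close the socket even if the flush fails."""
    try:
        wfile.flush()
    finally:
        # Runs whether flush() returned normally or raised (for example a
        # socket timeout), so the descriptor is always released.
        connection.close()


class StalledFile(object):
    """Hypothetical stand-in for a write buffer whose peer never reads."""

    def flush(self):
        raise socket.timeout("timed out")


class RecordingSocket(object):
    """Hypothetical stand-in that records whether close() was called."""

    closed = False

    def close(self):
        self.closed = True
```

With a socket timeout in place the flush would raise instead of hanging, and the finally clause then guarantees the close still happens; the exception itself still propagates to the caller.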
All of this is probably mitigated by most operators setting an idle timeout in their load balancers, but I wanted to report it. Going directly to a proxy I was able to hold sockets open for long periods of time.
I did the initial research on version 2.2.2 but was able to reproduce on 2.7.0. I have tried to translate the links to the master branch on GitHub; I apologize in advance if they are not quite right.