appserver deployment must not interrupt live requests
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Triaged | High | Unassigned |
Bug Description
See bug 380504 for historical context.
In short:
- when we start to upgrade an appserver, it may still have live requests
- haproxy will keep sending requests until the appserver falls over and dies
- RT 41503 will work around this in a kludgy fashion.
We need to change the appservers to have a graceful shutdown:
- close the socket
- report that it's closed
- let requests complete
- after a configured interval (200 seconds probably) forcibly take down remaining threads
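The four steps above can be sketched roughly as follows. This is a toy illustration, not the appserver's actual code: the class `GracefulServer`, its `track` method and the `log` list are hypothetical names. Python cannot forcibly kill a thread, so the last step here only returns the stragglers; a real server would presumably exit the process at that point.

```python
import socket
import threading
import time


class GracefulServer:
    """Toy server illustrating the shutdown steps (hypothetical, not Launchpad code)."""

    def __init__(self, host="127.0.0.1", port=0, grace_seconds=200.0):
        self.grace_seconds = grace_seconds
        self.listener = socket.create_server((host, port))
        self.port = self.listener.getsockname()[1]
        self.workers = []  # threads handling in-flight requests
        self.log = []      # stand-in for the real log output

    def track(self, worker):
        """Register a request-handling thread so shutdown can wait for it."""
        self.workers.append(worker)

    def shutdown(self):
        # 1. Close the listening socket so no new connections are accepted.
        self.listener.close()
        # 2. Report that it is closed; the deployment process waits for this.
        self.log.append("socket %d closed" % self.port)
        # 3. Let in-flight requests complete, up to the configured interval.
        deadline = time.monotonic() + self.grace_seconds
        for worker in self.workers:
            worker.join(max(0.0, deadline - time.monotonic()))
        # 4. Whatever is still alive has exceeded the interval. Threads cannot
        #    be killed from Python, so the caller must take the process down.
        return [w for w in self.workers if w.is_alive()]
```

A caller would register each request thread with `track()` and treat a non-empty return value from `shutdown()` as the signal to exit the process hard.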
This will let the overall process become:
- take the server out of rotation (sysadmin change, RT 41503)
- start the upgrade by telling the appserver to gracefully stop
- wait for the 'socket xxxx closed' message
- start the new instance (sysadmin change, future RT)
- if the old appserver hasn't shut down after (long time), kill -9 it (belt and braces)
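As a rough sketch, the sequence above could be driven by something like the function below. Every callable passed in is a hypothetical hook standing for a sysadmin or appserver action; none of them exists in Launchpad under these names. `kill_after` stands in for the unspecified "long time" and is deliberately left without a default.

```python
import time


def upgrade_appserver(remove_from_rotation, request_graceful_stop,
                      socket_closed_reported, start_new_instance,
                      old_is_running, force_kill,
                      kill_after, poll=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Run the deployment steps in order; all hooks are assumed, not real APIs."""
    # Take the server out of rotation so haproxy stops routing to it.
    remove_from_rotation()
    # Ask the old appserver to stop gracefully.
    request_graceful_stop()
    # Wait until the old appserver reports that its socket is closed.
    while not socket_closed_reported():
        sleep(poll)
    # The port is free now, so the new instance can start.
    start_new_instance()
    # Belt and braces: kill the old process if it outlives the deadline.
    deadline = clock() + kill_after
    while old_is_running():
        if clock() >= deadline:
            force_kill()
            return "killed"
        sleep(poll)
    return "clean"
```

Passing `clock` and `sleep` as parameters is only a convenience so the sequence can be exercised without real delays.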
haproxy may stop forwarding requests cleanly once the listening socket is gone, but we want to be completely robust against bugs there.
description: | updated |
tags: | added: rfwtad |
This is why I say 200 seconds:
140.59s OOPS-1719D2284 https://api.launchpad.net/beta /api.launchpad.net/devel /api.launchpad.net/1.0
102.74s OOPS-1719E2173 Person:+rdf
94.33s OOPS-1719F2300 https:/
77.05s OOPS-1719H2208 https:/