Client connection handling improvements
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Gearman |
Fix Released
|
Medium
|
Brian Aker |
Bug Description
There's 2 scenarios that this bug relates to:
1) a client adds 2 or more servers (gearmand-1 and gearmand-2 in that order). It sends a job over to gearmand-1 and all is well. gearmand-1 then dies. The client tries to insert a new job insertions and it fails with LOST_CONNECTION. The client tries again and gets a COULD_NOT_CONNECT, even though gearmand-2 is up. In fact, the job is sent to gearmand-2 but the client never reads the server's response. The server though will merrily process that job.
2) This is similar. A client adds 2 or more servers (gearmand-1 and gearmand-2 in that order). It sends a job over to gearmand-1 and all is well. gearmand-1 then dies. The client tries to insert 2 new jobs. If #1 is not fixed, both will fail. Then gearmand-1 comes back. Now when a new job is tried to be inserted to gearmand-1, the client will go into a poll() and timeout. This is even though the server actually responds correctly and has inserted the job into its queue.
The test case for both is attached.
Patch 1 fixes scenario #1. We do not return an error code from gearman_
Patch 2 fixes scenario #2. The created_id in the connection and task get out of sync when there are errors. Instead of figuring out all the places to update to keep them in sync, I thought it simpler to reset the connection id's when it is closed.
Changed in gearmand: | |
assignee: | nobody → Brian Aker (brianaker) |
Changed in gearmand: | |
status: | New → In Progress |
importance: | Undecided → Medium |
Changed in gearmand: | |
status: | In Progress → Fix Committed |
milestone: | none → 1.0.3 |
Changed in gearmand: | |
status: | Fix Committed → Fix Released |
Patch 1