lock problem on transient disconnection during acquisition

Bug #998268 reported by Kapil Thangavelu
Affects: txzookeeper
Status: New
Importance: Medium
Assigned to: Unassigned

Bug Description

<benbangert> hazmat: were you going to fix txzookeeper's lock?
<hazmat> benbangert, unsurprising, and that sounds great re zc.zk changes
<hazmat> benbangert, what was the problem?
<benbangert> the create node edge case
<benbangert> you create the node, server dies and you get connection loss, but the node was created
<benbangert> so then txzookeeper reconnects... and makes another node :)
<benbangert> and now there's two nodes it created, except it doesn't know the other one actually worked
<benbangert> thats why the recipe has the GUID bit in it now
* hazmat files a bug
<benbangert> if you look at my async lock, on connection loss during create candidate, it waits till reconnect and then calls get_children to see if the create actually did work
<benbangert> same thing could happen with create node using non-ephemeral of course, which would cause a program to throw a NodeAlreadyExists bug and might leave someone scratching their head if they weren't aware of that edge case
<hazmat> yeah.. it would have to check the lock children and match on session id owner for an error to know deterministically for the ephemeral seq
<hazmat> or use explicit client ids/guids for the node names
<benbangert> yea, the lock recipe uses the guid node name, prolly cause its cheaper/faster than calling get on every child
<hazmat> definitely
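
The GUID approach discussed above could look roughly like the sketch below. This is not the txzookeeper or zc.zk API: FakeZK, ConnectionLoss, create_sequence and create_candidate are hypothetical names, and the in-memory client only simulates a create whose reply is lost after the server has already made the node. The point is the recovery step: on connection loss, list the children and look for a node carrying our GUID before creating another one.

```python
import uuid


class ConnectionLoss(Exception):
    """Stand-in for a ZooKeeper connection-loss error (hypothetical)."""


class FakeZK:
    """In-memory stand-in for a ZooKeeper client; not the txzookeeper API."""

    def __init__(self, fail_next_create=False):
        self.children = []   # names of sequence nodes under the lock dir
        self.counter = 0     # server-side sequence counter
        self.fail_next_create = fail_next_create

    def create_sequence(self, prefix):
        # The server creates the node first ...
        name = "%s%010d" % (prefix, self.counter)
        self.counter += 1
        self.children.append(name)
        # ... but the reply can be lost on a transient disconnect.
        if self.fail_next_create:
            self.fail_next_create = False
            raise ConnectionLoss()
        return name

    def get_children(self):
        return list(self.children)


def create_candidate(zk, guid):
    """Create one lock candidate node, tolerating a lost create reply."""
    prefix = "%s-lock-" % guid
    try:
        return zk.create_sequence(prefix)
    except ConnectionLoss:
        # Assumes the client has reconnected by now. Check whether the
        # first create actually worked before making a second node.
        mine = [c for c in zk.get_children() if c.startswith(prefix)]
        if mine:
            return mine[0]
        # No node with our GUID exists, so it is safe to create one.
        return zk.create_sequence(prefix)


zk = FakeZK(fail_next_create=True)
node = create_candidate(zk, uuid.uuid4().hex)
# Only one candidate exists even though the first create "failed".
assert zk.get_children() == [node]
```

Matching on the GUID prefix needs a single get_children call, whereas matching on session id owner would need a get on every child, which is the cost difference mentioned in the log. A real implementation would also have to handle a second connection loss during the retry, which this sketch does not.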

Changed in txzookeeper:
importance: Undecided → Medium