Error condition on relation hooks locks events processing
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
pyjuju | Fix Released | Critical | Kapil Thangavelu | | |
Bug Description
Let's say I have a bug in my relation-change hook. When the hook gets called, it fails and the error is reported to juju (normal so far), but then juju seems to choke on it:
2012-01-06 15:37:55,093 unit:roundcube/4: hook.output ERROR: /usr/share/
2012-01-06 15:37:55,093 unit:roundcube/4: hook.output DEBUG: hook peer-relation-
Failure: juju.errors.
.
2012-01-06 15:37:55,094 unit:roundcube/4: hook.executor DEBUG: Hook error: /var/lib/
2012-01-06 15:37:55,094 unit:roundcube/4: unit.relation.
2012-01-06 15:37:55,094 unit:roundcube/4: unit.relation.
2012-01-06 15:37:55,102 unit:roundcube/4: twisted ERROR: Unhandled error in Deferred:
2012-01-06 15:37:55,103 unit:roundcube/4: twisted ERROR: Unhandled Error
Traceback (most recent call last):
File "/usr/lib/
return _inlineCallback
File "/usr/lib/
result = g.send(result)
File "/usr/lib/
error_
File "/usr/lib/
return _inlineCallback
--- <exception caught here> ---
File "/usr/lib/
result = g.send(result)
File "/usr/lib/
transition_id, current_state))
juju.lib.
Once this has occurred, no relation events seem to fire on any unit until I destroy and restart my lxc environment.
Related branches
- Jim Baker (community): Approve
- Diff: 330 lines (+159/-10), 9 files modified
juju/errors.py (+8/-3)
juju/hooks/invoker.py (+5/-2)
juju/hooks/scheduler.py (+3/-1)
juju/hooks/tests/test_invoker.py (+13/-0)
juju/hooks/tests/test_scheduler.py (+41/-3)
juju/tests/test_errors.py (+6/-0)
juju/unit/lifecycle.py (+4/-1)
juju/unit/tests/test_lifecycle.py (+41/-0)
juju/unit/tests/test_workflow.py (+38/-0)
Changed in juju: | |
assignee: | nobody → Jim Baker (jimbaker) |
status: | Confirmed → In Progress |
Changed in juju: | |
status: | In Progress → Fix Released |
It looks like another hook is queued while the current one is executing. The first one completes with an error, transitioning the unit relation state to an error state, but the queued hook is then executed against that error state. When a relation enters an error state, pending hook executions against that relation should be purged.
In effect, this is a symptom of the current design, where workflow state instruments watches and the scheduling of new hooks but does not affect hooks already queued for execution.
We'll need to associate a context identifier with queued hooks, so that we can purge everything queued under a context if its conditions change.
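To make the idea concrete, here is a minimal sketch of a queue that tags each pending hook with a context identifier and purges by that identifier when a hook fails. It is not the code from the linked branch: `HookQueue`, `enqueue`, `purge`, `run_all`, and the context strings are hypothetical names chosen for illustration, and the sketch is synchronous plain Python rather than the Twisted-based scheduler juju uses.

```python
from collections import deque


class HookQueue:
    """Hypothetical FIFO of pending hooks, each tagged with a context id."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, context_id, hook_name):
        self._pending.append((context_id, hook_name))

    def purge(self, context_id):
        """Drop every pending hook queued under context_id; return the count."""
        before = len(self._pending)
        self._pending = deque(
            item for item in self._pending if item[0] != context_id)
        return before - len(self._pending)

    def run_all(self, execute):
        """Run hooks in order; when one fails, purge the rest of its context."""
        executed = []
        while self._pending:
            context_id, hook_name = self._pending.popleft()
            try:
                execute(context_id, hook_name)
            except Exception:
                # The relation is now in an error state, so hooks still
                # queued for it must not run against that state.
                self.purge(context_id)
            else:
                executed.append((context_id, hook_name))
        return executed


def execute(context_id, hook_name):
    # Stand-in for real hook execution (an assumption for this sketch):
    # the peer relation's changed hook is the buggy one.
    if context_id == "peer-relation" and hook_name == "relation-changed":
        raise RuntimeError("hook failed")


queue = HookQueue()
queue.enqueue("peer-relation", "relation-changed")
queue.enqueue("peer-relation", "relation-departed")
queue.enqueue("db-relation", "relation-changed")
result = queue.run_all(execute)
print(result)
```

In this sketch the second peer-relation hook never runs, because it is purged when the first one fails, while the hook queued for the other relation still executes. Keying the purge on a context identifier rather than clearing the whole queue is what keeps one broken relation from blocking events on the others.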