Error condition on relation hooks locks events processing

Bug #912812 reported by Nick Barcet
This bug affects 2 people
Affects   Status         Importance   Assigned to        Milestone
pyjuju    Fix Released   Critical     Kapil Thangavelu

Bug Description

Let's say I have a bug in my relation-changed hook. When the hook gets called, it fails with an error that is reported to juju (normal up to this point), but then juju seems to choke on it:

2012-01-06 15:37:55,093 unit:roundcube/4: hook.output ERROR: /usr/share/charm-helper/sh/peer.sh: line 330: on: unbound variable
2012-01-06 15:37:55,093 unit:roundcube/4: hook.output DEBUG: hook peer-relation-changed exited, exit code Traceback (most recent call last):
Failure: juju.errors.CharmInvocationError: Error processing '/var/lib/juju/units/roundcube-4/charm/hooks/peer-relation-changed': exit code 1.
.
2012-01-06 15:37:55,094 unit:roundcube/4: hook.executor DEBUG: Hook error: /var/lib/juju/units/roundcube-4/charm/hooks/peer-relation-changed Error processing '/var/lib/juju/units/roundcube-4/charm/hooks/peer-relation-changed': exit code 1.
2012-01-06 15:37:55,094 unit:roundcube/4: unit.relation.lifecycle WARNING: Error in peer-relation-changed hook: Error processing '/var/lib/juju/units/roundcube-4/charm/hooks/peer-relation-changed': exit code 1.
2012-01-06 15:37:55,094 unit:roundcube/4: unit.relation.lifecycle INFO: Invoked error handler for peer-relation-changed hook
2012-01-06 15:37:55,102 unit:roundcube/4: twisted ERROR: Unhandled error in Deferred:
2012-01-06 15:37:55,103 unit:roundcube/4: twisted ERROR: Unhandled Error
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1141, in unwindGenerator
    return _inlineCallbacks(None, f(*args, **kwargs), Deferred())
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1020, in _inlineCallbacks
    result = g.send(result)
  File "/usr/lib/python2.7/dist-packages/juju/unit/workflow.py", line 444, in on_hook_error
    error_message=str(error))
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1141, in unwindGenerator
    return _inlineCallbacks(None, f(*args, **kwargs), Deferred())
--- <exception caught here> ---
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1020, in _inlineCallbacks
    result = g.send(result)
  File "/usr/lib/python2.7/dist-packages/juju/lib/statemachine.py", line 151, in fire_transition
    transition_id, current_state))
juju.lib.statemachine.InvalidTransitionError: 'error' not a valid transition for state error

Once this has occurred, it seems that I never get any further relation events on any unit until I destroy and restart my LXC environment.
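The last line of the traceback can be illustrated with a toy state machine. This is only a sketch of the failure mode, not juju's actual `statemachine` module: the `ToyWorkflow` class and its transition table are assumptions made up for illustration, and only the exception name and message format are taken from the log above.

```python
class InvalidTransitionError(Exception):
    """Raised when a transition is not defined for the current state."""


class ToyWorkflow(object):
    # Hypothetical transition table: {state: {transition_id: next_state}}.
    # Note that the "error" state defines no "error" transition.
    TRANSITIONS = {
        "up": {"error": "error"},
        "error": {"retry": "up"},
    }

    def __init__(self):
        self.state = "up"

    def fire_transition(self, transition_id):
        allowed = self.TRANSITIONS.get(self.state, {})
        if transition_id not in allowed:
            raise InvalidTransitionError(
                "%r not a valid transition for state %s"
                % (transition_id, self.state))
        self.state = allowed[transition_id]


workflow = ToyWorkflow()
workflow.fire_transition("error")      # first hook failure: up -> error
try:
    workflow.fire_transition("error")  # second failure while already in error
except InvalidTransitionError as exc:
    message = str(exc)
```

In this sketch the second `fire_transition("error")` fails because the workflow is already in the error state, which matches the shape of the message in the log.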

Kapil Thangavelu (hazmat) wrote :

It looks like another hook is queued while the current one is executing. The first one completes with an error, transitioning the unit relation state to an error state, but the subsequently queued hook is then executed against that error state. When a relation enters an error state, pending hook executions against that relation should be purged.

Effectively this is a symptom of the current design, where the workflow state governs watches and the scheduling of new hooks but does not affect hooks already queued for execution.

We'll need to associate a context identifier with queued hooks, so that we can purge them by context if their conditions change.
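A rough sketch of that idea, assuming a simple in-memory queue: each queued hook carries a context identifier, and an error handler can drop everything still pending for that context. The `HookQueue` class, its methods, and the `"peer:0"` / `"db:1"` identifiers are hypothetical and are not taken from juju's hook executor.

```python
from collections import deque


class HookQueue(object):
    """Toy queue of (context_id, hook_name) entries awaiting execution."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, context_id, hook_name):
        self._pending.append((context_id, hook_name))

    def pop(self):
        """Remove and return the next entry to execute."""
        return self._pending.popleft()

    def purge(self, context_id):
        """Drop all pending entries for context_id; return how many."""
        kept = deque(e for e in self._pending if e[0] != context_id)
        dropped = len(self._pending) - len(kept)
        self._pending = kept
        return dropped

    def pending(self):
        return list(self._pending)


queue = HookQueue()
queue.enqueue("peer:0", "peer-relation-joined")
queue.enqueue("peer:0", "peer-relation-changed")
queue.enqueue("db:1", "db-relation-changed")

running = queue.pop()            # the joined hook starts executing
# Assume it exits non-zero: the relation enters the error state, so
# everything still queued for the same context is purged.
dropped = queue.purge(running[0])
remaining = queue.pending()
```

Here the queued `peer-relation-changed` entry is discarded before it can run against the error state, while the hook for the unrelated context stays queued.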

Changed in juju:
milestone: none → florence
importance: Undecided → Critical
status: New → Confirmed
Jim Baker (jimbaker)
Changed in juju:
assignee: nobody → Jim Baker (jimbaker)
status: Confirmed → In Progress
Kapil Thangavelu (hazmat) wrote :

It looks like this error is specific to failing -relation-joined hooks, because a joined event is expanded into both joined and changed hook executions. The aliased/expanded -relation-changed hook is executed without regard to the failure of the preceding joined hook, causing the traceback above.

Kapil Thangavelu (hazmat) wrote :

Switching the assignment to myself after discussion.

Changed in juju:
assignee: Jim Baker (jimbaker) → Kapil Thangavelu (hazmat)
Kapil Thangavelu (hazmat) wrote :

Reproducing this bug requires both the -joined and -changed hooks to exit with an error.
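The reproduction condition can be sketched as follows. This is an illustrative model only: `expand`, `run_hooks`, and the `stop_on_error` flag are hypothetical names, and the released fix may well work differently. It shows why two failing hooks are needed: without a guard, the expanded changed hook reports a second error after joined has already put the relation into the error state.

```python
def expand(event):
    """Expand a 'joined' event into joined + changed; pass others through."""
    if event == "joined":
        return ["joined", "changed"]
    return [event]


def run_hooks(event, hooks, stop_on_error):
    """Run the expanded hooks; return the names of hooks reported as failed."""
    failed = []
    for name in expand(event):
        exit_code = hooks[name]()
        if exit_code != 0:
            failed.append(name)
            if stop_on_error:
                # Skip the remaining expanded hooks once one has failed.
                break
    return failed


# Both hooks exit with code 1, as in the reproduction recipe.
failing_hooks = {"joined": lambda: 1, "changed": lambda: 1}

unguarded = run_hooks("joined", failing_hooks, stop_on_error=False)
guarded = run_hooks("joined", failing_hooks, stop_on_error=True)
```

In the unguarded run both hooks report an error, so an error handler would fire twice for the same relation; in the guarded run only the joined failure is reported.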

Changed in juju:
status: In Progress → Fix Released