juju stop responding after juju-upgrade

Bug #1438489 reported by lithium
This bug affects 2 people
Affects     Status         Importance   Assigned to   Milestone
juju-core   Fix Released   High         John Weldon
1.24        Fix Released   High         John Weldon

Bug Description

After running:

juju upgrade-juju --version 1.23-beta2 --upload-tools

All units start reporting the following in debug-log:

....
unit-pgsql-medium-0[9015]: 2015-03-31 00:24:01 ERROR juju.worker runner.go:219 exited "uniter": failed to initialize uniter for "unit-pgsql-medium-0": cannot read "/var/lib/juju/agents/unit-pgsql-medium-0/state/uniter": invalid operation state: unexpected hook info with Kind Continue
....
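
To see how many units are hitting this error, a saved log can be scanned for the message. The Python below is a minimal sketch, not part of juju: the line layout (tag[pid]: date time LEVEL rest) is inferred from the single log line quoted above, and `affected_units` is a hypothetical helper name.

```python
import re

# Layout inferred from the log line quoted in this report; it may not
# match every line that juju writes.
LOG_LINE = re.compile(
    r'^(?P<tag>unit-[\w-]+)\[\d+\]: '
    r'(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'(?P<level>[A-Z]+) (?P<rest>.*)$'
)
MARKER = "unexpected hook info with Kind Continue"


def affected_units(lines):
    """Return the sorted unit tags that logged the invalid-state error."""
    units = set()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR" and MARKER in match.group("rest"):
            units.add(match.group("tag"))
    return sorted(units)
```

It accepts any iterable of lines, so an open log file can be passed in directly.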

After a while juju stops responding, e.g. to juju status.

I already tried restarting the juju-machine0 service and juju-mongodb.

If I keep running "juju status" in a console and restart the juju-machine0 service on machine0, it shows statuses and then hangs again, as if there were something in the queue.

Attached is the very large /var/log/juju/machine-0.log

Tags: upgrade-juju
lithium (rudicba) wrote :

I forgot to upload all-machine.log

John Weldon (johnweldon4) wrote :

It seems the upgrade step for adding the Stopped field to the uniter state isn't working. The logs indicate that the upgrade steps ran, but many (most or all?) of the units report the error that the uniter state has a hook value in ModeContinue, which the upgrade step should have taken care of.

Changed in juju-core:
assignee: nobody → John Weldon (johnweldon4)
John George (jog)
tags: added: upgrade-juju
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.23-beta3
John George (jog)
Changed in juju-core:
milestone: 1.23-beta3 → 1.23
John Weldon (johnweldon4) wrote :

Once the uniter operation state file is manually fixed (by removing the hook key and subkeys), another error that seems upgrade-related appears too:

unit-jw-charm-test-0[10017]: 2015-03-31 17:09:18 ERROR juju.worker.uniter.filter filter.go:137 cannot retrieve meter status for unit jw-charm-test/0: not found
unit-jw-charm-test-0[10017]: 2015-03-31 17:09:19 ERROR juju.worker runner.go:219 exited "uniter": cannot retrieve meter status for unit jw-charm-test/0: not found
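
The manual fix mentioned above (removing the hook key and its subkeys) can be sketched in Python as a plain text filter. This is only an illustration under assumptions: it treats the state file as YAML-like text with top-level keys and indented subkeys, and `drop_top_level_key` is a hypothetical helper, not anything shipped with juju. It is best tried on a copy of the file first.

```python
def drop_top_level_key(text, key):
    """Return text without the top-level `key:` entry and its indented subkeys."""
    kept = []
    skipping = False
    for line in text.splitlines():
        if line.startswith(key + ":"):
            skipping = True   # start of the block to drop
            continue
        if skipping and (line.startswith((" ", "\t")) or not line.strip()):
            continue          # indented subkey or blank line inside the block
        skipping = False
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Calling `drop_top_level_key(contents, "hook")` on the contents of a state file returns the text with the hook block removed and every other line untouched.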

John Weldon (johnweldon4) wrote :

So the issue is that the upgrade steps run only against machines, not against units. The upgrade steps need to explicitly iterate over the units and apply the upgrade to each of them.

I'm working on implementing this, but until it lands, this upgrade won't work.
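
The idea of iterating over the units explicitly can be illustrated with a small Python sketch. It is not the Go code that went into juju; it only assumes the directory layout visible in the error message above (an agents directory holding one `unit-...` directory per unit, each with a `state/uniter` file). `unit_state_files` and `apply_to_units` are hypothetical names.

```python
import os


def unit_state_files(agents_dir):
    """Return sorted paths of the uniter state file for each unit agent directory."""
    paths = []
    for name in sorted(os.listdir(agents_dir)):
        candidate = os.path.join(agents_dir, name, "state", "uniter")
        # Machine agents and other entries are skipped; only unit-* counts.
        if name.startswith("unit-") and os.path.isfile(candidate):
            paths.append(candidate)
    return paths


def apply_to_units(agents_dir, step):
    """Run step(path) for every unit state file and return the paths visited."""
    visited = []
    for path in unit_state_files(agents_dir):
        step(path)
        visited.append(path)
    return visited
```

Here `step` is any callable that takes the path of one state file, so the per-unit change stays separate from the iteration.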

lithium (rudicba) wrote :

So, when the new update is available, will a juju environment with the wrong uniter state be fixed, or do I have to start from scratch?

Thanks for the quick response

Changed in juju-core:
status: Triaged → Fix Committed
Liam Young (gnuoy) wrote :

I've just upgraded from 1.22.1 to 1.23.0 (from proposed ppa) and after the upgrade I seem to have hit this bug.

Last entry in debug-log is:

unit-swift-proxy-0[8375]: 2015-04-17 10:20:02 ERROR juju.cmd supercommand.go:430 must restart: an agent upgrade is available
unit-swift-proxy-0[4189]: 2015-04-17 10:20:03 ERROR juju.worker.uniter.operation state.go:137 unexpected hook info with Kind Continue

juju set no longer fires hooks on the units.

I did not use upload-tools for the upgrade. I did:

juju set-env agent-metadata-url=https://streams.canonical.com/juju/tools
juju set-env agent-stream=proposed
juju upgrade-juju --version 1.23.0

John Weldon (johnweldon4) wrote :

Even after the committed fix, hook operations don't fire any more after the upgrade. Investigating further.

Changed in juju-core:
status: Fix Committed → In Progress
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.23 → 1.24-alpha1
Martin Packman (gz) wrote :

I can reproduce this with our standard upgrade job by adding a step at the end that sets a new value on our testing charm and expects it to propagate across a relation. It seems any juju deployment upgraded from 1.22 to 1.23 will basically be in an unusable state afterwards.

Changed in juju-core:
importance: High → Critical
Curtis Hovey (sinzui) wrote :

As there is both a fix and a release with this issue, I opened bug 1447846 to resolve the hooks issue.

no longer affects: juju-core/1.23
Changed in juju-core:
status: In Progress → Fix Committed
Menno Finlay-Smits (menno.smits) wrote :

John, can you confirm that the fix has made it into the 1.24 branch as well as master? (and update this bug accordingly)

Changed in juju-core:
milestone: 1.24-alpha1 → none
milestone: none → 1.25.0
status: Fix Committed → In Progress
John Weldon (johnweldon4) wrote :

Yes, this fix was in revision [2fb4df33](https://github.com/johnweldon/juju/commit/2fb4df3367c061d408a6ec8325f13f5e2a74f8aa)

Running `git branch -r --contains 2fb4df33` will confirm that the revision is in both master and 1.24.
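
If that check needs to be scripted, the command output can be compared against the branches of interest. The Python below is a rough sketch: `missing_branches` is a hypothetical helper, and the remote name `origin` used in the usage note is an assumption about the local clone.

```python
def missing_branches(branch_output, required):
    """Return the required branch names absent from `git branch -r --contains` output."""
    listed = set()
    for line in branch_output.splitlines():
        name = line.strip()
        if not name or " -> " in name:
            continue          # skip blanks and symbolic refs such as origin/HEAD
        listed.add(name)
    return [branch for branch in required if branch not in listed]
```

With the command output captured as text, `missing_branches(output, ["origin/master", "origin/1.24"])` returns an empty list when the revision is on both branches.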

Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
importance: Critical → High
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released