inconsistent juju status from cli vs api

Bug #1467690 reported by JuanJo Ciarlante
This bug affects 3 people
Affects     Status         Importance   Assigned to   Milestone
juju-core   Fix Released   High         Ian Booth
1.24        Fix Released   Critical     Ian Booth

Bug Description

juju version: 1.24.0-trusty-amd64 (agent-version: 1.24.0.1)

We have an environment showing a clean 'juju status': no
'error' string appears in its output; excerpt for the failing
'landscape-client/1' subordinate unit:
http://paste.ubuntu.com/11759456/
However, fetching status via the API with a python script [0]
shows the unit in error with:
  agent-state-info: 'hook failed: "leader-elected"'
http://paste.ubuntu.com/11759469/

[0] http://paste.ubuntu.com/11759446/

Revision history for this message
JuanJo Ciarlante (jjo) wrote :

By the way, we found this issue because juju-deployer failed as per the above.
Note that trying to resolve the unit also fails:
$ juju resolved -r landscape-client/1
ERROR unit "landscape-client/1" is not in an error state

i.e. this issue leaves the environment inoperable via juju-deployer.

Ian Booth (wallyworld)
Changed in juju-core:
importance: Undecided → High
status: New → Triaged
milestone: none → 1.25.0
Revision history for this message
Ian Booth (wallyworld) wrote :

I have a theory as to what's happening. The CLI is reporting the correct status, but the status via the API is wrong. The API-reported status uses an all-watcher backing model. That model appears to be incorrectly updated in response to some status changes, and thus reports stale data to callers like the deployer. Restarting the state server(s) seems to have worked around the issue, lending credence to this theory.

The issue seemed to happen when a leader election hook failed, then ran again and came good the second time. No user ran resolved --retry to reset things.

The code below is called when a status value changes. If the change is for the unit's charm ("#charm") or the status is an error, the workload status is updated, which is how it gets put into error. Once the error goes away, the agent status is updated back to "idle", but the first if{} block never runs again, so the unit's workload status remains in the error state. I think we just need some logic saying: if the workload status is error and the new incoming agent status is not an error, reset the workload status to what's currently in state. We may need to record the previous non-error workload status on the backing doc to make this work.

func (s *backingStatus) updatedUnitStatus(st *State, store *multiwatcherStore, id string, newInfo *multiwatcher.UnitInfo) error {
    // Unit or workload status - display the agent status or any error.
    if strings.HasSuffix(id, "#charm") || s.Status == StatusError {
        newInfo.WorkloadStatus.Current = multiwatcher.Status(s.Status)
        newInfo.WorkloadStatus.Message = s.StatusInfo
        newInfo.WorkloadStatus.Data = s.StatusData
        newInfo.WorkloadStatus.Since = s.Updated
    } else {
        newInfo.AgentStatus.Current = multiwatcher.Status(s.Status)
        newInfo.AgentStatus.Message = s.StatusInfo
        newInfo.AgentStatus.Data = s.StatusData
        newInfo.AgentStatus.Since = s.Updated
    }
    // ... (rest of function omitted)
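
A minimal sketch of the reset described above, assuming a hypothetical helper
unitWorkloadStatusFromState(st, id) that re-reads the unit's persisted workload
status (with Status, Message and Data fields); the fix as actually committed may
differ:

    // To be added to the else branch above, which is taken when the incoming
    // change is an agent status that is not an error. If the workload is still
    // showing a stale hook error, re-read the workload status from state and
    // use that instead of the cached error.
    if newInfo.WorkloadStatus.Current == multiwatcher.Status(StatusError) {
        current, err := unitWorkloadStatusFromState(st, id) // hypothetical lookup
        if err != nil {
            return err
        }
        newInfo.WorkloadStatus.Current = multiwatcher.Status(current.Status)
        newInfo.WorkloadStatus.Message = current.Message
        newInfo.WorkloadStatus.Data = current.Data
        newInfo.WorkloadStatus.Since = s.Updated
    }

This follows the suggestion of falling back to what's currently recorded in state
once the agent comes out of error; recording the previous non-error workload status
on the backing doc would achieve the same without an extra read from state.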

Revision history for this message
Brad Marshall (brad-marshall) wrote :

As per request from Ian, attaching a mongo dump of the statuses collection from the state server.

Revision history for this message
Brad Marshall (brad-marshall) wrote :

And now here's the statuseshistory collection.

Ian Booth (wallyworld)
Changed in juju-core:
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released