ceph-osd is showing as fail
Bug #1931567 reported by Eric Desrochers
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Fix Released | High | Ian Booth |
2.9 | Fix Released | High | Heather Lanigan |
Bug Description
# juju status
Model Controller Cloud/Region Version SLA Timestamp
openstack <OBFUSCATED> <OBFUSCATED> 2.8.10 unsupported 04:27:28Z
# juju show-action-status 3783
actions:
- action: juju-run
completed at: n/a
id: "3783"
status: aborting
unit: ceph-osd/36
# controller logs:
https:/
# ceph-osd/36 logs
https:/
description: | updated |
description: | updated |
tags: | added: seg sts |
Changed in juju: | |
assignee: | nobody → Heather Lanigan (hmlanigan) |
status: | Triaged → In Progress |
Changed in juju: | |
status: | In Progress → Won't Fix |
Changed in juju: | |
status: | In Progress → Fix Committed |
Changed in juju: | |
status: | Fix Committed → Fix Released |
Looking at a database dump, the issue is that the parent operation consisted of 39 actions across 39 units: 38 of them are marked as completed, and 1 is marked as aborting (action 3783, the one we are looking at here).
An action only enters the aborting state if someone has run juju cancel-action. Juju would then kill the action and mark it as aborted, but that hasn't happened, so it is still in the aborting state. Sometimes the process hangs and Juju forcibly kills it, but that hasn't happened either; perhaps the unit agent was shut down before this could happen.
The logs show the unit agent is making an API call to fail the action (set its status to failed). However, the parent operation itself is marked as completed, even though it is not really complete because 1 action is still aborting, so Juju gets confused.
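The inconsistency described above can be sketched as a small check over plain objects. This is only an illustration: the function name `findStuckActions`, the document shape (`id`, `status`), and the list of statuses treated as terminal are assumptions made for this example, not Juju's actual schema or code.

```javascript
// Statuses treated as "finished" in this sketch (an assumption for illustration).
const TERMINAL = new Set(["completed", "failed", "cancelled", "aborted", "error"]);

// Returns the child actions that are still unfinished even though the
// parent operation claims to be finished. An empty array means the
// operation and its actions agree with each other.
function findStuckActions(operation, actions) {
  if (!TERMINAL.has(operation.status)) {
    return []; // parent is still active, so nothing is contradictory
  }
  return actions.filter((action) => !TERMINAL.has(action.status));
}

// Example modelled on this bug: 38 completed actions plus action 3783,
// which is still aborting, under a parent marked as completed.
const exampleActions = Array.from({ length: 38 }, (_, i) => ({
  id: `done-${i}`, // hypothetical ids for the completed actions
  status: "completed",
}));
exampleActions.push({ id: "3783", status: "aborting" });

const exampleStuck = findStuckActions(
  { id: "3754", status: "completed" },
  exampleActions
);
console.log(exampleStuck.map((action) => action.id));
```

Run against the example data, the only action reported is 3783, which matches the state seen in the database dump.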
Given that the unit agent appears to be trying to set the action to failed, we can try setting the parent operation's state back to running; this should allow things to progress:
db.operations.update({"_id": "f7afc459-639a-44e6-8bf1-ad5928637772:3754"}, {$set: {"status": "running"}});
We will need to loosen how strictly Juju checks for the expected state, so that in cases like this Juju will mark the offending action as failed even if the parent operation thinks it is already complete.
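As a rough sketch of what loosening the check could mean, the hypothetical function below contrasts a strict mode, which refuses to touch an action whose parent operation is already finished, with a lenient mode, which marks the action as failed regardless. The name `failAction`, the `strict` flag, and the object shapes are all assumptions for illustration; this is not Juju's implementation.

```javascript
// Statuses treated as "finished" in this sketch (an assumption for illustration).
const FINISHED = ["completed", "failed", "cancelled", "aborted", "error"];

// Returns a copy of the action with its status set to "failed".
// In strict mode, a parent operation that is already finished is treated
// as an unexpected state and the update is rejected. In lenient mode the
// action is failed regardless of what the parent operation says.
function failAction(operation, action, strict) {
  const parentFinished = FINISHED.includes(operation.status);
  if (strict && parentFinished) {
    throw new Error(
      `operation ${operation.id} is already ${operation.status}`
    );
  }
  return { ...action, status: "failed" };
}
```

With a parent marked as completed and an action still aborting, the strict call throws, which corresponds to the action staying stuck, while the lenient call returns the action with status failed.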