One of the ceph-osd processes doesn't start during deployment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Released | Medium | Stanislav Makar |
5.1.x | Invalid | High | MOS Maintenance |
6.0.x | Invalid | High | MOS Maintenance |
Bug Description
System tests: 'ceph_ha_
When the Ceph service was started on the controller, one of the two ceph-osd processes did not start:
[root@node-2 ~]# service ceph status
=== mon.node-2 ===
mon.node-2: running {"version"
=== osd.1 ===
osd.1: not running.
=== osd.0 ===
osd.0: running {"version"
[root@node-2 ~]# echo $?
3
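A deployment check could catch this state by scanning the status output for daemons reported as "not running". The following is only a minimal sketch in Python: the function name `stopped_daemons` is hypothetical, and the sample text is copied from the (truncated) output above rather than from a live system.

```python
import re

def stopped_daemons(status_text):
    """Return the names of daemons reported as 'not running'.

    Assumes the sysvinit-style layout shown in this report:
    one '<name>: <state>' line per daemon.
    """
    stopped = []
    for line in status_text.splitlines():
        match = re.match(r"\s*(\S+):\s+not running\.?\s*$", line)
        if match:
            stopped.append(match.group(1))
    return stopped

# Sample copied from the report; the "version" fields are truncated there too.
sample = """=== mon.node-2 ===
mon.node-2: running {"version"
=== osd.1 ===
osd.1: not running.
=== osd.0 ===
osd.0: running {"version"
"""

print(stopped_daemons(sample))
```

In a real check, the exit code (3 in the transcript above) would probably be the first thing to test; parsing the text only adds the name of the daemon that failed.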
[root@node-1 ~]# ceph status
cluster 75098087-
health HEALTH_OK
monmap e1: 1 mons at {node-2=
osdmap e22: 6 osds: 5 up, 5 in
pgmap v46: 1728 pgs, 6 pools, 12859 kB data, 5 objects
10470 MB used, 236 GB / 246 GB avail
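Note that the cluster reports HEALTH_OK even though only 5 of 6 OSDs are up, so a test that looks only at the health string would miss this bug. A rough sketch of a stricter check, assuming the plain-text `osdmap` line format shown above (the helper name `osd_counts` is made up for illustration):

```python
import re

def osd_counts(status_text):
    """Extract total/up/in OSD counts from plain-text `ceph status` output.

    Assumes a line of the form 'osdmap e22: 6 osds: 5 up, 5 in',
    as seen in this report. Returns None if no such line is found.
    """
    match = re.search(r"osdmap e\d+: (\d+) osds: (\d+) up, (\d+) in", status_text)
    if match is None:
        return None
    total, up, in_count = (int(value) for value in match.groups())
    return {"total": total, "up": up, "in": in_count}

# Lines copied from the report.
sample = """health HEALTH_OK
osdmap e22: 6 osds: 5 up, 5 in
pgmap v46: 1728 pgs, 6 pools, 12859 kB data, 5 objects
"""

counts = osd_counts(sample)
missing = counts["total"] - counts["up"]
print(counts, "missing:", missing)
```

A system test could then fail the deployment whenever `missing` is non-zero, regardless of the reported health.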
The Ceph logs show that the 'osd crush create-or-move' command was never issued for osd.1:
==== osd.0 is starting:
Feb 5 21:53:26 node-2 ceph-mon: 2015-02-05 18:53:26.559788 7eff644ef700 0 mon.node-
osd"]} v 0) v1
Feb 5 21:53:27 node-2 ceph-mon: 2015-02-05 18:53:27.392325 7eff644ef700 0 mon.node-
weight": 0.0500000000000
Feb 5 21:53:27 node-2 ceph-mon: 2015-02-05 18:53:27.392494 7eff644ef700 0 mon.node-
==== osd.1 is starting:
Feb 5 21:53:32 node-2 ceph-mon: 2015-02-05 18:53:32.797480 7eff644ef700 0 mon.node-
osd"]} v 0) v1
Feb 5 21:53:32 node-2 puppet-user[18429]: (/Stage[
Feb 5 21:53:32 node-2 puppet-user[18429]: (/Stage[
Feb 5 21:53:32 node-2 puppet-user[18429]: (Class[Ceph]) Starting to evaluate the resource
Feb 5 21:53:33 node-2 puppet-user[18429]: (Class[Ceph]) Evaluated in 0.06 seconds
Feb 5 21:53:33 node-2 puppet-user[18429]: (Stage[main]) Starting to evaluate the resource
Feb 5 21:53:33 node-2 puppet-user[18429]: (Stage[main]) Evaluated in 0.05 seconds
Changed in fuel: | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → Fuel Library Team (fuel-library) |
Changed in fuel: | |
assignee: | Fuel Library Team (fuel-library) → Stanislav Makar (smakar) |
status: | Confirmed → In Progress |
Changed in fuel: | |
status: | Fix Committed → Fix Released |
There are lots of errors like "IOError: [Errno 28] No space left on device" in the console log (http://jenkins-product.srt.mirantis.net:8080/job/6.1.system_test.centos.thread_1/32/consoleFull). What's up with those?