Use of default tunables during Ceph upgrade can cause the process to stop

Bug #1704959 reported by Giulio Fidente
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Giulio Fidente
Milestone: pike-3

Bug Description

During the Ceph monitor upgrade, setting the CRUSH map tunables to the 'default' profile can cause the cluster to emit a warning that legacy tunables are in use.

As a consequence, once the majority of the monitors have been upgraded, the cluster goes into WARN state and the upgrade process stops because the cluster is no longer considered healthy.
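
To illustrate why a warning alone is enough to stop the upgrade, here is a minimal sketch of a health gate of the kind described above. It is not the code used by tripleo-heat-templates: the function names are hypothetical, and the sample warning text is only illustrative. The only thing taken from Ceph itself is that `ceph health` output starts with a status token such as `HEALTH_OK` or `HEALTH_WARN`.

```python
# Hypothetical helpers, for illustration only; not the actual tripleo check.

def health_status(health_output: str) -> str:
    """Return the leading status token of `ceph health` output, or '' if empty."""
    tokens = health_output.split()
    return tokens[0] if tokens else ""


def upgrade_may_proceed(health_output: str) -> bool:
    """A strict gate: continue only when the cluster reports HEALTH_OK.

    Under this assumption, any HEALTH_WARN (including a legacy-tunables
    warning) blocks the next upgrade step.
    """
    return health_status(health_output) == "HEALTH_OK"


if __name__ == "__main__":
    # Illustrative sample strings, not captured from a real cluster.
    print(upgrade_may_proceed("HEALTH_OK"))
    print(upgrade_may_proceed("HEALTH_WARN crush map has legacy tunables"))
```

With a gate this strict, the warning has to be avoided rather than ignored, which is the approach the fix takes.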

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/484681

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/484682

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/484684

tags: added: newton-backport-potential
tags: added: ocata-backport-potential
Changed in tripleo:
milestone: none → pike-3
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/484681
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=5e9f855f7c96950ca29a0f85086441c57ae7aed5
Submitter: Jenkins
Branch: master

commit 5e9f855f7c96950ca29a0f85086441c57ae7aed5
Author: Giulio Fidente <email address hidden>
Date: Tue Jul 18 11:03:35 2017 +0200

    Use optimal (instead of default) tunables for Ceph on upgrade

    With the default setting, after the majority of the monitors have
    been upgraded the cluster will go in WARN state because of legacy
    tunables. This changes the tunables we set after each monitor is
    upgraded from 'default' to 'optimal' [1].

    1. http://docs.ceph.com/docs/master/rados/operations/crush-map/#warning-when-tunables-are-non-optimal

    Change-Id: I0f16c29cc200d762f0c4acfd87ba7d1adb5c1eeb
    Closes-Bug: #1704959

Changed in tripleo:
status: In Progress → Fix Released
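
The commit message above describes switching the tunables profile from 'default' to 'optimal' after each monitor is upgraded. A rough sketch of that step as a standalone script follows. The real change lives in tripleo-heat-templates and is not reproduced here; the function names are hypothetical, and the only external piece relied on is the `ceph osd crush tunables <profile>` command from the Ceph CLI.

```python
import subprocess
from typing import List


def tunables_command(profile: str = "optimal") -> List[str]:
    """Build the Ceph CLI invocation that selects a CRUSH tunables profile."""
    return ["ceph", "osd", "crush", "tunables", profile]


def set_tunables(profile: str = "optimal") -> None:
    """Apply the profile; needs a reachable cluster and admin credentials."""
    subprocess.run(tunables_command(profile), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe without a cluster.
    print(" ".join(tunables_command()))
```

Changing the tunables profile can trigger data movement in the cluster, so a step like this is something to plan for rather than run casually.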
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/ocata)

Reviewed: https://review.openstack.org/484682
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=40de798ee69b4672172dcc766269e10c75f22a38
Submitter: Jenkins
Branch: stable/ocata

commit 40de798ee69b4672172dcc766269e10c75f22a38
Author: Giulio Fidente <email address hidden>
Date: Tue Jul 18 11:03:35 2017 +0200

    Use optimal (instead of default) tunables for Ceph on upgrade

    With the default setting, after the majority of the monitors have
    been upgraded the cluster will go in WARN state because of legacy
    tunables. This changes the tunables we set after each monitor is
    upgraded from 'default' to 'optimal' [1].

    1. http://docs.ceph.com/docs/master/rados/operations/crush-map/#warning-when-tunables-are-non-optimal

    Change-Id: I0f16c29cc200d762f0c4acfd87ba7d1adb5c1eeb
    Closes-Bug: #1704959
    (cherry picked from commit 5e9f855f7c96950ca29a0f85086441c57ae7aed5)

tags: added: in-stable-ocata
tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/newton)

Reviewed: https://review.openstack.org/484684
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=a18621e1cbdd7944ecb482816f6a7d6f922570bf
Submitter: Jenkins
Branch: stable/newton

commit a18621e1cbdd7944ecb482816f6a7d6f922570bf
Author: Giulio Fidente <email address hidden>
Date: Tue Jul 18 11:14:53 2017 +0200

    Use optimal (instead of default) tunables for Ceph on upgrade

    With the default setting, after the majority of the monitors have
    been upgraded the cluster will go in WARN state because of legacy
    tunables. This changes the tunables we set after each monitor is
    upgraded from 'default' to 'optimal' [1].

    1. http://docs.ceph.com/docs/master/rados/operations/crush-map/#warning-when-tunables-are-non-optimal

    While different from the ocata (or pike) versions, this change implements
    the same fix submitted for ocata: https://review.openstack.org/#/c/484682/

    Change-Id: I3635f095ae7d0cdf7f4384dcdc0ae2b39980721e
    Closes-Bug: #1704959

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.0.0b3

This issue was fixed in the openstack/tripleo-heat-templates 7.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 5.3.1

This issue was fixed in the openstack/tripleo-heat-templates 5.3.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 6.2.1

This issue was fixed in the openstack/tripleo-heat-templates 6.2.1 release.
