[plugin][ceph] collect ceph balancer and pg-autoscale status

Bug #1893109 reported by Chris Johnston
Affects             Status        Importance  Assigned to  Milestone
sosreport           Fix Released  Unknown
sosreport (Ubuntu)  Fix Released  Medium      Dan Hill
  Xenial            New           Undecided   Unassigned
  Bionic            In Progress   Medium      Dan Hill
  Focal             Fix Released  Medium      Dan Hill
  Groovy            Fix Released  Medium      Dan Hill

Bug Description

[Impact]

It would be nice to collect:

ceph osd pool autoscale-status
ceph balancer status

https://docs.ceph.com/docs/master/rados/operations/placement-groups/

VIEWING PG SCALING RECOMMENDATIONS
You can view each pool, its relative utilization, and any suggested changes to the PG count with this command:

ceph osd pool autoscale-status

https://docs.ceph.com/docs/mimic/mgr/balancer/

STATUS
The current status of the balancer can be checked at any time with:

ceph balancer status

[Test Case]

* Install latest sosreport found in -updates
* Run sosreport -o ceph (version 3.X and/or 4.X) or sos report -o ceph (4.X only)
* Look at the content inside /path_to_sosreport/sos_commands/ceph/
* Make sure the 2 new commands are found there.
* There will be 3 additional files, as the autoscale-status is also captured in JSON format.
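
The manual check in the steps above could be scripted roughly as follows. This is a minimal sketch, not part of sos: the helper names are hypothetical, and the file-name prefixes are taken from the verification output later in this report. It matches on prefixes because the exact name of the JSON-format file is not shown here.

```python
from pathlib import Path

# File-name prefixes as they appear in the focal verification comment.
EXPECTED_PREFIXES = ("ceph_osd_pool_autoscale-status", "ceph_balancer_status")


def find_new_ceph_outputs(report_dir):
    """Map each expected prefix to the sorted file names that start with it."""
    ceph_dir = Path(report_dir) / "sos_commands" / "ceph"
    names = sorted(p.name for p in ceph_dir.iterdir()) if ceph_dir.is_dir() else []
    return {
        prefix: [name for name in names if name.startswith(prefix)]
        for prefix in EXPECTED_PREFIXES
    }


def missing_outputs(report_dir):
    """Return the prefixes for which no captured file was found."""
    found = find_new_ceph_outputs(report_dir)
    return [prefix for prefix, names in found.items() if not names]
```

Calling `missing_outputs()` on the extracted sosreport directory should return an empty list when both commands were captured.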

[Regression Potential]
This patch adds two commands to the collected command output. Potential regressions include a command typo, a command hang, or a code typo.
- A command typo would result in a failed command which should capture the command error output.
- A command hang will result in the ceph plug-in taking a long time to complete (hitting the default sos timeout).
- A code typo will raise an exception in the ceph plug-in halting further ceph data capture.
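
The first two failure modes can be illustrated with a small stand-alone sketch. It is not how sos runs commands internally; the `collect` helper and its status strings are hypothetical, and the timeout is a caller-supplied value, not the sos default. It only shows the general idea that a failing command still yields its error output and that a hanging command is bounded by a timeout.

```python
import subprocess


def collect(cmd, timeout):
    """Run cmd and return (status, output), with stderr merged into output."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error text with the output
            text=True,
            timeout=timeout,  # a hung command is killed after this many seconds
        )
    except FileNotFoundError as exc:
        # The executable itself does not exist (for example, a misspelled binary).
        return "missing", str(exc)
    except subprocess.TimeoutExpired:
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout
```

For example, `collect(["ceph", "balancer", "status"], timeout=30)` would return the command's output on success, its error text if the subcommand were misspelled, or a `"timeout"` status if the cluster did not respond in time.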

[Other Info]
Both commands query Ceph's internal state without grabbing any locks or performing any modifications. The commands are expected to return very quickly.

[Original Description]
It would be nice to collect:

ceph osd pool autoscale-status
ceph balancer status

Upstream report: https://github.com/sosreport/sos/issues/2211
Upstream commit: https://github.com/sosreport/sos/commit/52f4661e2b594134b98e2967b02cc860d7963fef

Changed in sosreport:
status: Unknown → Fix Released
Eric Desrochers (slashd)
tags: added: seg sts
Changed in sosreport (Ubuntu Groovy):
status: New → In Progress
assignee: nobody → Chris Johnston (cjohnston)
Eric Desrochers (slashd)
Changed in sosreport (Ubuntu Groovy):
assignee: Chris Johnston (cjohnston) → Eric Desrochers (slashd)
Eric Desrochers (slashd)
Changed in sosreport (Ubuntu Groovy):
assignee: Eric Desrochers (slashd) → nobody
assignee: nobody → Dan Hill (hillpd)
Eric Desrochers (slashd)
summary: - Backport - collect ceph balancer and pr-autoscale status
+ [plugin][ceph] Backport - collect ceph balancer and pr-autoscale status
summary: - [plugin][ceph] Backport - collect ceph balancer and pr-autoscale status
+ [plugin][ceph] collect ceph balancer and pr-autoscale status
Changed in sosreport (Ubuntu Focal):
assignee: nobody → Dan Hill (hillpd)
Changed in sosreport (Ubuntu Bionic):
assignee: nobody → Dan Hill (hillpd)
Changed in sosreport (Ubuntu Groovy):
importance: Undecided → Medium
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package sosreport - 4.0-1ubuntu1

---------------
sosreport (4.0-1ubuntu1) groovy; urgency=medium

  [Eric Desrochers]
  * d/p/0003-sosclean-fix-handling-of-filepath-with-archive-name.patch:
    - Fixes the splitting of filepaths within the archive,
      when the archive name is included in the filename
      inside the archive. (LP: #1896222)

  * d/p/0004-sosclean-fix-tarball-skipping-regex.patch:
    - Fix tarball skipping regex

  [Dan Hill]
  * d/p/0005-ceph-collect-balancer-and-pg-autoscale-status.patch:
    - Collect balancer and pg-autoscale status (LP: #1893109)

  [Nicolas Bock]
  * d/p/0006-rabbitmq-add-info-on-maybe-stuck-processes.patch:
    - Add information on maybe_stuck() processes for RMQ. (LP: #1890846)

 -- Eric Desrochers <email address hidden> Fri, 18 Sep 2020 09:23:04 -0400

Changed in sosreport (Ubuntu Groovy):
status: In Progress → Fix Released
Eric Desrochers (slashd) wrote :

@dan, @CJ

Could you please fill in the SRU template so that we can proceed with the SRU?

Eric Desrochers (slashd) wrote :

@dan, could you fill in the [Regression Potential] section?
I already completed the rest.

I want to make sure I don't miss anything in the regression potential. I'd prefer if you could take 5 minutes to do it.

- Eric

description: updated
description: updated
description: updated
Dan Hill (hillpd)
description: updated
Eric Desrochers (slashd)
Changed in sosreport (Ubuntu Focal):
status: New → In Progress
Dan Hill (hillpd)
description: updated
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Chris, or anyone else affected,

Accepted sosreport into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sosreport/4.0-1~ubuntu0.20.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in sosreport (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Eric Desrochers (slashd)
Changed in sosreport (Ubuntu Bionic):
status: New → In Progress
Eric Desrochers (slashd)
Changed in sosreport (Ubuntu Bionic):
importance: Undecided → Critical
Changed in sosreport (Ubuntu Focal):
importance: Undecided → Medium
Changed in sosreport (Ubuntu Bionic):
importance: Critical → Medium
Dan Hill (hillpd) wrote :

Verified sos collects the two new commands on focal.

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

# dpkg -l | grep sos
ii sosreport 4.0-1~ubuntu0.20.04.2 amd64 Set of tools to gather troubleshooting data from a system

# sos report -e ceph
...
Your sosreport has been generated and saved in:
        /tmp/sosreport-juju-ffa3a9-lp1893109-0-2020-10-09-nfxoyto.tar.xz
...

/tmp/sosreport-juju-ffa3a9-lp1893109-0-2020-10-09-nfxoyto/sos_commands/ceph# cat ceph_osd_pool_autoscale-status
POOL                   SIZE  TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
device_health_metrics  0                  3.0   30708M        0.0000                                 1.0   1                   on
glance                 0                  3.0   30708M        0.0000  0.0500        1.0000           1.0   128                 on

/tmp/sosreport-juju-ffa3a9-lp1893109-0-2020-10-09-nfxoyto/sos_commands/ceph# cat ceph_balancer_status
{
    "active": false,
    "last_optimize_duration": "",
    "last_optimize_started": "",
    "mode": "none",
    "optimize_result": "",
    "plans": []
}
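
Because the balancer status is captured as JSON, it can be summarized programmatically when triaging a report. The following is a minimal sketch: the `summarize_balancer` helper is hypothetical, and it assumes only the keys visible in the output above (`active`, `mode`, `plans`).

```python
import json


def summarize_balancer(text):
    """Return a one-line summary of a captured `ceph balancer status` file."""
    status = json.loads(text)
    state = "active" if status.get("active") else "inactive"
    mode = status.get("mode", "unknown")
    plans = status.get("plans") or []
    return f"balancer {state}, mode={mode}, plans={len(plans)}"
```

Passing the contents of the `ceph_balancer_status` file shown above would report an inactive balancer in mode `none` with no plans.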

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package sosreport - 4.0-1~ubuntu0.20.04.2

---------------
sosreport (4.0-1~ubuntu0.20.04.2) focal; urgency=medium

  [Eric Desrochers]
  * d/p/0003-sosclean-fix-handling-of-filepath-with-archive-name.patch:
    - Fixes the splitting of filepaths within the archive,
      when the archive name is included in the filename
      inside the archive. (LP: #1896222)

  * d/p/0004-sosclean-fix-tarball-skipping-regex.patch:
    - Fix tarball skipping regex

  [Dan Hill]
  * d/p/0005-ceph-collect-balancer-and-pg-autoscale-status.patch:
    - Collect balancer and pg-autoscale status (LP: #1893109)

  [Nicolas Bock]
  * d/p/0006-rabbitmq-add-info-on-maybe-stuck-processes.patch:
    - Add information on maybe_stuck() processes for RMQ. (LP: #1890846)

  * d/p/0007-rabbitmq-add-10sec-timeout-to-call-to-maybestuck.patch:
    - Add 10 second timeout to call to `maybe_stuck()`.

 -- Eric Desrochers <email address hidden> Wed, 30 Sep 2020 14:29:50 -0400

Changed in sosreport (Ubuntu Focal):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for sosreport has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
