Bionic w/ CMR exporter fails to start

Bug #1840513 reported by Chris Sanders
This bug affects 2 people
Affects: Prometheus Ceph Exporter Charm
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

After installing Rev 5 of the charm in a bionic LXD container and relating it to Ceph-Mon via a cross-model relation (CMR), the exporter service fails to start and register.

The error from systemd shows:

Aug 16 22:38:29 juju-5815ea-12 systemd[1]: Started Service for snap application prometheus-ceph-exporter.ceph-exporter.
-- Subject: Unit snap.prometheus-ceph-exporter.ceph-exporter.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit snap.prometheus-ceph-exporter.ceph-exporter.service has finished starting up.
--
-- The start-up result is RESULT.
Aug 16 22:38:29 juju-5815ea-12 prometheus-ceph-exporter.ceph-exporter[14485]: * Running /snap/prometheus-ceph-exporter/x1/bin/ceph_exporter with args: -ceph.user ceph-exporter
Aug 16 22:38:29 juju-5815ea-12 prometheus-ceph-exporter.ceph-exporter[14485]: 2019/08/16 22:38:29 cannot connect to ceph cluster: rados: Operation not permitted
Aug 16 22:38:29 juju-5815ea-12 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Main process exited, code=exited, status=1/FAILURE
Aug 16 22:38:29 juju-5815ea-12 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Failed with result 'exit-code'.
Aug 16 22:38:30 juju-5815ea-12 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Service hold-off time over, scheduling restart.
Aug 16 22:38:30 juju-5815ea-12 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Scheduled restart job, restart counter is at 5.

It appears the service is trying to authenticate as "ceph-exporter". Checking `ceph auth ls` shows that this user doesn't exist in Ceph.
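The check described above can be sketched as a small script. This is only an illustration, not part of the charm: the helper names `exporter_user`, `auth_entities` and `entity_exists` are hypothetical, the sample auth listing is made up, and the parsing assumes the plain-text `ceph auth ls` layout in which each entity name sits on an unindented line followed by indented `key:` and `caps:` lines.

```python
import re


def exporter_user(journal_line):
    """Return the value passed to -ceph.user in a journal line, or None."""
    match = re.search(r"-ceph\.user\s+(\S+)", journal_line)
    return match.group(1) if match else None


def auth_entities(auth_ls_output):
    """Collect entity names from plain `ceph auth ls` text (assumed layout)."""
    entities = []
    for line in auth_ls_output.splitlines():
        # Assumption: entity names start in column 0; key/caps lines are indented.
        if line and not line[0].isspace():
            entities.append(line.strip())
    return entities


def entity_exists(user, auth_ls_output):
    """True if client.<user> appears among the listed entities."""
    return f"client.{user}" in auth_entities(auth_ls_output)


if __name__ == "__main__":
    line = (
        "Aug 16 22:38:29 juju-5815ea-12 prometheus-ceph-exporter.ceph-exporter[14485]: "
        "* Running /snap/prometheus-ceph-exporter/x1/bin/ceph_exporter "
        "with args: -ceph.user ceph-exporter"
    )
    # Made-up sample listing, standing in for real `ceph auth ls` output.
    sample = "client.admin\n\tkey: <redacted>\n\tcaps: [mon] allow *\n"
    user = exporter_user(line)
    print(user, entity_exists(user, sample))
```

In practice the second argument would be the captured output of `ceph auth ls` on a ceph-mon unit rather than the sample string.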

Ceph-Mon is consumed over a CMR and is the recently released Rev 42.

Both controllers run Juju 2.6.6, which is currently the latest release.

Jose Guedez (jfguedez) wrote:

Confirmed.

The relation does create a user and the key does match between nodes, but it's actually created with a different username:

ceph-mon: creates a user matching the pattern "client.remote-<hash>"
prometheus-ceph-exporter: connects as "client.prometheus-ceph-exporter"

So it cannot connect, and the journal is full of entries like:

Mar 20 01:02:47 juju-e56e6c-0 prometheus-ceph-exporter.ceph-exporter[16251]: * Running /snap/prometheus-ceph-exporter/x1/bin/ceph_exporter with args: -ceph.user prometheus-ceph-exporter
Mar 20 01:02:47 juju-e56e6c-0 prometheus-ceph-exporter.ceph-exporter[16251]: 2020/03/20 01:02:47 cannot connect to ceph cluster: rados: Operation not permitted
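To tell this naming mismatch apart from a user that was never created at all, a rough diagnostic along the following lines may help. It is a sketch under assumptions: the function name `diagnose` is hypothetical, the entity list is expected to come from `ceph auth ls` on the ceph-mon side, and the regular expression only mirrors the "client.remote-<hash>" pattern quoted above without assuming anything about the shape of the hash.

```python
import re

# Pattern taken from the comment above; the exact form of <hash> is not known,
# so any non-whitespace suffix is accepted (an assumption).
REMOTE_USER = re.compile(r"^client\.remote-\S+$")


def diagnose(expected_user, entities):
    """Describe whether client.<expected_user> exists or only CMR-style users do."""
    expected = f"client.{expected_user}"
    if expected in entities:
        return f"{expected} exists"
    remote = [e for e in entities if REMOTE_USER.match(e)]
    if remote:
        return f"{expected} missing; CMR-style users present: {', '.join(remote)}"
    return f"{expected} missing; no client.remote-* users found"


if __name__ == "__main__":
    # "client.remote-PLACEHOLDER" is an invented entity name for illustration.
    print(diagnose("prometheus-ceph-exporter",
                   ["client.admin", "client.remote-PLACEHOLDER"]))
```

If the result reports CMR-style users but not the expected one, the symptom matches the case described in this comment: the key exists, but under a name the exporter does not use.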

Changed in charm-prometheus-ceph-exporter:
status: New → Confirmed
importance: Undecided → Medium