[RFE] Use tooz for coordination of ceilometer agents

Bug #1768527 reported by Dmitrii Shcherbakov
This bug affects 1 person
Affects: OpenStack Ceilometer Charm
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned
Milestone: none

Bug Description

Ceilometer can use tooz for coordination, but charm-ceilometer does not currently make use of it. Right now the assumption is that only one central agent will be running at a given time:

https://github.com/openstack/ceilometer/blob/stable/queens/ceilometer/opts.py#L82-L89
        ('coordination', [
            cfg.StrOpt(
                'backend_url',
                help='The backend URL to use for distributed coordination. If '
                'left empty, per-deployment central agent and per-host '
                'compute agent won\'t do workload '
                'partitioning and will only function correctly if a '
                'single instance of that service is running.'),
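
To illustrate what "workload partitioning" means for the agents discussed here, below is a minimal sketch, assuming each agent knows the current list of group members (which a coordination backend would normally provide). It uses rendezvous hashing as a stand-in; this is not the algorithm used by tooz or ceilometer, and the names `owner`, `my_share`, and the `agent-N` identifiers are hypothetical.

```python
import hashlib


def owner(resource_id, members):
    """Return the member responsible for resource_id.

    Rendezvous hashing: every member gets a score for the resource and
    the highest score wins, so all agents agree without talking to
    each other, as long as they see the same member list.
    """
    def score(member):
        digest = hashlib.md5(f"{member}:{resource_id}".encode()).hexdigest()
        return int(digest, 16)

    # Sort first so the result does not depend on input ordering.
    return max(sorted(members), key=score)


def my_share(resources, members, me):
    """Return the subset of resources that agent `me` should poll."""
    return [r for r in resources if owner(r, members) == me]


if __name__ == "__main__":
    members = ["agent-0", "agent-1", "agent-2"]  # hypothetical agent IDs
    resources = [f"instance-{i}" for i in range(10)]
    for member in members:
        print(member, my_share(resources, members, member))
```

Without a shared member list (that is, with `backend_url` left empty), each agent would behave as if it were the only member and poll everything, which is why only a single instance can run correctly in that mode.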

For notification agents, the ceilometer charm simply does not set workload_partitioning to True; however, there is no code in the charm to ensure that only a single notification agent runs. So we most likely have multiple notification agents running at the same time, with only one of them ever consuming notifications, as we do not see duplicate events:

https://github.com/openstack/ceilometer/blob/stable/queens/ceilometer/notification.py#L28
"""Notification service.
    When running multiple agents, additional queuing sequence is required for
    inter process communication. Each agent has two listeners: one to listen
    to the main OpenStack queue and another listener(and notifier) for IPC to
    divide pipeline sink endpoints. Coordination should be enabled to have
    proper active/active HA.
    """
https://github.com/openstack/ceilometer/blob/stable/queens/ceilometer/notification.py#L49-L52
    cfg.BoolOpt('workload_partitioning',
                default=False,
                help='Enable workload partitioning, allowing multiple '
                     'notification agents to be run simultaneously.'),

https://github.com/openstack/ceilometer/blob/stable/queens/ceilometer/polling/manager.py#L34
https://specs.openstack.org/openstack/ceilometer-specs/specs/juno/central-agent-partitioning.html

Given that coordination is needed, we cannot just run multiple active central agents, so hacluster-ceilometer is needed in every deployment:

https://github.com/openstack/charm-ceilometer/blob/stable/18.02/hooks/ceilometer_hooks.py#L295
        'res_ceilometer_agent_central': 'lsb:ceilometer-agent-central',

====

The feature request is to use a coordination service via the tooz library for the central, polling and notification agents.
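
As a rough sketch of what the charm would need to render into ceilometer.conf, the snippet below builds the two options quoted above (`[coordination] backend_url` and `[notification] workload_partitioning`). The helper name `render_coordination_config` and the example backend URL are hypothetical, and this does not reflect how the charm's templates are actually structured.

```python
import configparser
import io


def render_coordination_config(backend_url, partition_notifications=True):
    """Render the coordination-related sections of ceilometer.conf."""
    # Disable interpolation so a '%' in the URL is written verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser["coordination"] = {"backend_url": backend_url}
    parser["notification"] = {
        "workload_partitioning": str(partition_notifications),
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical backend address; any tooz-supported driver URL
    # would go here.
    print(render_coordination_config("memcached://127.0.0.1:11211"))
```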

Tags: cpe-onsite
James Page (james-page)
Changed in charm-ceilometer:
status: New → Triaged
importance: Undecided → Wishlist
milestone: none → 18.08
James Page (james-page)
Changed in charm-ceilometer:
milestone: 18.08 → 18.11
James Page (james-page)
Changed in charm-ceilometer:
milestone: 18.11 → 19.04
David Ames (thedac)
Changed in charm-ceilometer:
milestone: 19.04 → 19.07
David Ames (thedac)
Changed in charm-ceilometer:
milestone: 19.07 → 19.10
David Ames (thedac)
Changed in charm-ceilometer:
milestone: 19.10 → 20.01
James Page (james-page)
Changed in charm-ceilometer:
milestone: 20.01 → 20.05
David Ames (thedac)
Changed in charm-ceilometer:
milestone: 20.05 → 20.08
James Page (james-page)
Changed in charm-ceilometer:
milestone: 20.08 → none