Add logging for agent heartbeats

Bug #1453978 reported by Eugene Nikanorov
This bug affects 1 person
Affects             Status         Importance   Assigned to       Milestone
Mirantis OpenStack  Fix Released   High         Elena Ezhova
6.1.x               Won't Fix      High         MOS Maintenance
7.0.x               Fix Released   High         Elena Ezhova

Bug Description

Original bug: https://bugs.launchpad.net/neutron/+bug/1452582

When troubleshooting problems with a cluster, it would be very convenient to have agent heartbeats logged with a searchable identifier that creates a 1-to-1 mapping between events in the agent's logs and the server's logs.

Currently, agent heartbeats are not logged at all on the server side.
Since this could produce too much logging on a large cluster (even for troubleshooting cases), it makes sense to make it configurable on both the neutron-server side and the agent side.
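
The idea above can be sketched roughly as follows. This is only an illustration using the standard library, not the actual Neutron patch: the function names (build_agent_state, receive_heartbeat), the dictionary layout and the logger name are hypothetical. The message template and the log_agent_heartbeats option name are the ones quoted in the verification comment further down this report.

```python
import logging
import uuid
from datetime import datetime, timedelta

LOG = logging.getLogger("heartbeat_sketch")  # hypothetical logger name

# Message template as quoted later in this bug report.
HEARTBEAT_MSG = ("Heartbeat received from %(type)s agent on "
                 "host %(host)s, uuid %(uuid)s after %(delta)s")


def build_agent_state(agent_type, host, log_heartbeats):
    """Agent side: tag every report with a fresh, searchable uuid."""
    state = {"agent_type": agent_type, "host": host,
             "uuid": uuid.uuid4().hex}
    if log_heartbeats:
        # The same uuid shows up in the server log, giving the 1-to-1 mapping.
        LOG.info("Sending heartbeat, uuid %s", state["uuid"])
    return state


def receive_heartbeat(state, last_seen, now, log_heartbeats):
    """Server side: log the heartbeat only when the option is enabled."""
    if not log_heartbeats:
        return None
    message = HEARTBEAT_MSG % {"type": state["agent_type"],
                               "host": state["host"],
                               "uuid": state["uuid"],
                               "delta": now - last_seen}
    LOG.info(message)
    return message


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    previous = datetime(2015, 8, 18, 3, 5, 20)
    report = build_agent_state("DHCP", "node-1", log_heartbeats=True)
    receive_heartbeat(report, previous, previous + timedelta(seconds=30),
                      log_heartbeats=True)
```

Keeping the switch separate on each side means heartbeat logging can be turned on only where it is needed, which limits log volume on large clusters.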

Eugene Nikanorov (enikanorov) wrote :

Setting to high as this could make neutron failure analysis much simpler

tags: added: neutron
Changed in mos:
status: New → Confirmed
milestone: none → 6.1
Changed in mos:
status: Confirmed → In Progress
Changed in mos:
status: In Progress → Confirmed
Changed in mos:
status: Confirmed → Won't Fix
Alexander Ignatov (aignatov) wrote :

This will not go into 6.1 directly. It was proposed together with the Sustaining team to move this bug to 6.1-updates; the fix will be recommended as a post-deployment step for 6.1 deployments.

Changed in mos:
milestone: 6.1 → 6.1.1
status: Won't Fix → Triaged
no longer affects: mos/6.1.x
Changed in mos:
milestone: 6.1 → 6.1.1
tags: added: release-note
tags: added: release-notes
removed: release-note
Alexander Ignatov (aignatov) wrote :

Decided to cherry-pick the patch https://review.openstack.org/#/c/181132/ into the 7.0 branch.

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-7.0/2015.1.0)

Fix proposed to branch: openstack-ci/fuel-7.0/2015.1.0
Change author: Eugene Nikanorov <email address hidden>
Review: https://review.fuel-infra.org/9417

Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/neutron (openstack-ci/fuel-7.0/2015.1.0)

Change abandoned by Elena Ezhova <email address hidden> on branch: openstack-ci/fuel-7.0/2015.1.0
Review: https://review.fuel-infra.org/9417
Reason: Need to restore original Change-Id

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-7.0/2015.1.0)

Reviewed: https://review.fuel-infra.org/9420
Submitter: Eugene Nikanorov <email address hidden>
Branch: openstack-ci/fuel-7.0/2015.1.0

Commit: e8b38a00fea0b210b2a264a9a8524397f8e13d8b
Author: Eugene Nikanorov <email address hidden>
Date: Thu Jul 16 13:47:55 2015

Add logging of agent heartbeats

When troubleshooting problems with cluster it would be
very convenient to have information about agent heartbeats
logged with some searchable identifier which could create
1-to-1 mapping between events in agent's logs and server's logs.

Currently agent's heartbeats are not logged at all on server side.
Since on a large cluster that could create too much logging
(even for troubleshooting cases), it might make sense to make
this configurable both on neutron-server side and on agent-side.

DocImpact

Cherry-picked from https://review.openstack.org/#/c/181132/
Closes-Bug: #1453978

Conflicts:
 neutron/db/agents_db.py

Change-Id: I0a127ef274a84bba5de47395d47b62f48bd4be16

Anna Babich (ababich)
tags: added: on-verification
Anna Babich (ababich) wrote :

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "187"
  build_id: "2015-08-18_03-05-20"
  nailgun_sha: "4710801a2f4a6d61d652f8f1e64215d9dde37d2e"
  python-fuelclient_sha: "4c74a60aa60c06c136d9197c7d09fa4f8c8e2863"
  fuel-agent_sha: "57145b1d8804389304cd04322ba0fb3dc9d30327"
  fuel-nailgun-agent_sha: "e01693992d7a0304d926b922b43f3b747c35964c"
  astute_sha: "e24ca066bf6160bc1e419aaa5d486cad1aaa937d"
  fuel-library_sha: "0062e69db17f8a63f85996039bdefa87aea498e1"
  fuel-ostf_sha: "17786b86b78e5b66d2b1c15500186648df10c63d"
  fuelmain_sha: "c9dad194e82a60bf33060eae635fff867116a9ce"

Verified on cluster: neutron+vlan, 3 controllers, 2 computes

Verification scenario
1. On every controller, update the .ini files of the dhcp, l3, metadata and ovs agents, setting log_agent_heartbeats = True
2. Restart agents:
pcs resource disable p_neutron-dhcp-agent && pcs resource disable p_neutron-l3-agent && pcs resource disable p_neutron-metadata-agent && pcs resource disable p_neutron-plugin-openvswitch-agent
pcs resource enable p_neutron-dhcp-agent && pcs resource enable p_neutron-l3-agent && pcs resource enable p_neutron-metadata-agent && pcs resource enable p_neutron-plugin-openvswitch-agent
3. On every controller, check that records like the following (shown here as the logging format string and its arguments) start to appear in neutron-server.log for every agent:
    "Heartbeat received from %(type)s agent on "
    "host %(host)s, uuid %(uuid)s after %(delta)s"
    {'type': agent_db.agent_type,
     'host': agent_db.host,
     'uuid': state.get('uuid'),
     'delta': delta}

Verification result for every controller looks like: http://paste.openstack.org/show/441241/

4. Finally, disable heartbeat logging for all agents on all controllers with log_agent_heartbeats = False and check that heartbeat records stop appearing

Verification result for every controller looks like: http://paste.openstack.org/show/441251/
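
The log checks in steps 3 and 4 could also be scripted. The snippet below is a minimal sketch, not part of the verification that was actually performed: the regular expression is derived from the format string quoted in step 3, while the function name count_heartbeats, the command-line usage and the assumption that each record fits on one line are hypothetical.

```python
import re
import sys
from collections import Counter

# Pattern built from the format string quoted in step 3.
HEARTBEAT_RE = re.compile(
    r"Heartbeat received from (?P<type>.+) agent on "
    r"host (?P<host>\S+), uuid (?P<uuid>\S+) after (?P<delta>\S+)"
)


def count_heartbeats(lines):
    """Count heartbeat records per (agent type, host) pair."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            counts[(match.group("type"), match.group("host"))] += 1
    return counts


if __name__ == "__main__":
    # Usage (path is supplied by the operator): python check.py <log file>
    with open(sys.argv[1]) as log_file:
        for (agent_type, host), seen in sorted(count_heartbeats(log_file).items()):
            print("%s on %s: %d heartbeat record(s)" % (agent_type, host, seen))
```

With logging enabled, every agent on every controller should show a growing count; after step 4, re-running the script on newly written log lines should find no new records.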

tags: removed: on-verification
tags: added: release-notes-done-7.0
removed: release-notes
tags: added: release-notes-done rn7.0
removed: release-notes-done-7.0