The exporter service is constantly restarted every 5 mins
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Prometheus Openstack Exporter Charm | Fix Committed | Undecided | Hua Zhang |
Bug Description
A customer received a Nagios alert with the following content:
service: fcb-sbibits-
Status Information: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
We identified it as a result of the exporter service constantly restarting every 5 minutes:
2023-06-22 02:16:25 INFO unit.prometheus
2023-06-22 02:21:27 INFO unit.prometheus
2023-06-22 02:27:22 INFO unit.prometheus
I could not reproduce the problem with:
./generate-
I was using latest/stable 0.1.7 (revision 28), but the customer is using latest/candidate 1.1.10 (revision 23), so I switched to 1.1.10 with:
juju refresh prometheus-
juju config prometheus-
I still could not reproduce the problem after switching to 1.1.10.
However, the customer is definitely hitting this issue, and the following log confirms that the handler do_restart was called by the update-status hook every 5 minutes.
$ sudo grep -r -E 'do_restart|
2023-06-22 02:41:50 INFO unit.prometheus
2023-06-22 02:41:51 INFO unit.prometheus
2023-06-22 02:47:12 INFO unit.prometheus
2023-06-22 02:47:13 INFO unit.prometheus
2023-06-22 02:51:54 INFO unit.prometheus
2023-06-22 02:51:55 INFO unit.prometheus
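The roughly 5-minute cadence can be checked directly from the timestamps in the excerpt above. The following is only a minimal sketch: the function name `restart_intervals` and the 10-second `burst_gap` threshold are illustrative assumptions, not part of the charm, and only the timestamps are used because the rest of each log line is truncated in this report.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above; the remainder of each
# line is truncated in the report, so only the time is used here.
timestamps = [
    "2023-06-22 02:41:50",
    "2023-06-22 02:41:51",
    "2023-06-22 02:47:12",
    "2023-06-22 02:47:13",
    "2023-06-22 02:51:54",
    "2023-06-22 02:51:55",
]


def restart_intervals(stamps, burst_gap=10):
    """Return the seconds between successive bursts of log lines.

    Lines that are at most burst_gap seconds apart are treated as one
    hook run. burst_gap is a hypothetical threshold chosen for this
    sketch, not a value taken from the charm.
    """
    if not stamps:
        return []
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    burst_starts = [times[0]]
    previous = times[0]
    for current in times[1:]:
        # A gap larger than burst_gap marks the start of a new hook run.
        if (current - previous).total_seconds() > burst_gap:
            burst_starts.append(current)
        previous = current
    return [
        int((later - earlier).total_seconds())
        for earlier, later in zip(burst_starts, burst_starts[1:])
    ]


print(restart_intervals(timestamps))  # [322, 282]
```

Both intervals are close to 300 seconds, which is consistent with do_restart being triggered on each update-status run.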
Related branches
- Eric Chen: Approve
- 🤖 prod-jenkaas-bootstack: Approve (continuous-integration)
- Tianqi Xiao (community): Approve
- Edward Hope-Morley (community): Approve
- Robert Gildein: Approve
- BootStack Reviewers: Pending requested
Diff: 64 lines (+27/-0), 3 files modified
src/reactive/openstack_exporter.py (+1/-0)
src/tests/unit/requirements.txt (+3/-0)
src/tests/unit/test_reactive_openstack_exporter.py (+23/-0)
tags: added: sts
Changed in charm-prometheus-openstack-exporter:
assignee: nobody → Hua Zhang (zhhuabj)
status: New → Fix Committed
milestone: none → 23.10