swact is not triggered after killing dnsmasq process within 90 seconds
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Bin Qian |
Bug Description
Brief Description
-----------------
When a process fails twice within 90 seconds, SM does not execute the expected impact.
For instance, I tested it against the dnsmasq process, and after killing it twice swact was not triggered.
dnsmasq is only one example; I also tested against
* dcmanager-audit
* dcmanager-api
* dcmanager-manager
* dcdbsync-api
* dcorch-engine
* hw-mond
* hbsAgent
* rabbitmq
* sysinv-conductor
* mtcAgent
* fmManager
* (apparently every process monitored by SM)
and they all show the same behavior.
Severity
--------
Critical
Steps to Reproduce
------------------
I've created a script to test it. It does the following:
1. Kill the process.
2. Wait 60 seconds.
3. Kill it again.
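====== script ====
#!/bin/sh
# Usage: ./script_name.sh PID_FILE INTERVAL
pid_file=$1   # file holding the PID of the monitored process
t=$2          # seconds between the two kills (must be greater than 5)

# Kill the process named in the PID file, wait 5 seconds, then list active alarms.
kill_and_report() {
    date
    pid=$(cat "$pid_file" 2>/dev/null)
    echo "killing $pid"
    # ">/dev/null 2>&1" is portable; "&>" is a bash extension and misbehaves under plain sh.
    echo "ROOT_PASSWORD" | sudo -S kill -9 "$pid" >/dev/null 2>&1
    sleep 5
    date
    fm alarm-list
}

kill_and_report
sleep $((t - 5))
kill_and_report
======= end script ===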
Run it with: ./script_name.sh PID_FILE INTERVAL
For instance: sh ./script_name.sh /var/run/
Here are the output logs:
[sysadmin@
vie ago 28 03:17:49 UTC 2020
killing 3410146
vie ago 28 03:17:54 UTC 2020
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
+------
vie ago 28 03:18:50 UTC 2020
killing 3434943
vie ago 28 03:18:55 UTC 2020
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
+------
Swact was not triggered.
Expected Behavior
------------------
Swact should be triggered if a process is killed more than once within 90 seconds.
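As a quick arithmetic check of that rule against the kill times in the output logs above, here is a minimal sketch. The helper name failures_within_window is hypothetical and is not part of SM, and treating the 90-second boundary as inclusive is an assumption; the two timestamps are taken from the log and converted to seconds since midnight UTC.

```shell
#!/bin/sh
# Hypothetical helper (not an SM command): succeeds when the second failure
# happens no more than WINDOW seconds after the first one.
# Arguments: first failure time, second failure time, window (default 90), all in seconds.
failures_within_window() {
    first_ts=$1
    second_ts=$2
    window=${3:-90}
    delta=$((second_ts - first_ts))
    # Assumption: the window boundary is inclusive.
    [ "$delta" -ge 0 ] && [ "$delta" -le "$window" ]
}

# Kill times from the log above, as seconds since midnight UTC.
first=$((3 * 3600 + 17 * 60 + 49))    # 03:17:49
second=$((3 * 3600 + 18 * 60 + 50))   # 03:18:50

if failures_within_window "$first" "$second" 90; then
    echo "gap $((second - first))s: within window, swact expected"
else
    echo "gap $((second - first))s: outside window, no swact expected"
fi
```

The two kills are 61 seconds apart, so under the rule stated above this run should have resulted in a swact.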
Actual Behavior
----------------
Swact is not triggered.
Reproducibility
---------------
Reproducible
System Configuration
-------
IPv4 distributed cloud
Branch/Pull Time/Commit
-------
BUILD_ID=
Last Pass
---------
N/A
Timestamp/Logs
--------------
N/A: logs are above.
Test Activity
-------------
Feature Testing
description: updated
summary:
- dnsmasq process monitoring expected impact doesn't work.
+ swact is not triggered after killing dnsmasq process within 90 seconds
description: updated
tags: added: stx.ha
Changed in starlingx:
assignee: Don Penney (dpenney) → Bin Qian (bqian20)
tags: added: in-r-stx50
stx.5.0 / medium priority - issue related to process recovery