Removing chrony from Apparmor fails for Debian on redeploy

Bug #1915549 reported by Ana Peric
This bug affects 1 person
Affects: kolla-ansible
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description


**Bug Report**

The Debian-based kolla chronyd container enters a constant crash loop if the server is rebooted or the docker container is restarted. More details below.

**Environment**:

- Base OS: Debian GNU/Linux 10 (buster) | docker images used (Ubuntu) from 26th Jan 2021 (likely cf305aaaf commit in kolla repo)
- Kernel: Linux 4.19.0-13-cloud-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
- kolla-ansible branch: stable/victoria (commit-id: 05e6d4a4d (HEAD) Fix dpdk deploy failed)
- Used images pulled from docker-hub on 26th Jan 2021 (Debian / stable-victoria)
- chronyd enabled (enable_chrony = True)
- Ansible is configured to fail on errors
- Strategy used: ALWAYS_COPY

### Problem Description

Bug https://bugs.launchpad.net/kolla-ansible/+bug/1882513 addressed the issue that chronyd should be uninstalled from the host when chrony runs in a kolla docker container.
However, it seems that this was done and tested on Ubuntu and RHEL, but not on a Debian-based OS.

How to reproduce the issue:

1. Run kolla deploy for the first time (all OK).
2. Rerun kolla-bootstrap-servers (or kolla deploy). Execution fails when it tries to disable AppArmor for chronyd. This is 100% reproducible.

This fails on the second run most likely because the first run already removed the profile, so subsequent runs cannot remove it again: the profile is no longer loaded.
This is why one of the fix approaches below may be taken.

Error example (when deploying for the second time):

```
ok: [control02]

TASK [baremetal : Remove apparmor profile for chrony] ************************************************************************************************************************************************************************************************

fatal: [compute02]: FAILED! => {"changed": true, "cmd": ["apparmor_parser", "-R", "/etc/apparmor.d/usr.sbin.chronyd"], "delta": "0:00:00.058667", "end": "2021-02-03 14:11:33.514542", "msg": "non-zero return code", "rc": 254, "start": "2021-02-03 14:11:33.455875", "stderr": "apparmor_parser: Unable to remove \"/usr/sbin/chronyd\". Profile doesn't exist", "stderr_lines": ["apparmor_parser: Unable to remove \"/usr/sbin/chronyd\". Profile doesn't exist"], "stdout": "", "stdout_lines": []}

```

## Potential resolution:

One option _may_ be to combine the -C and -R commands, but this will only work on releases such as buster, which enable AppArmor for chronyd by default.

```
diff --git a/ansible/roles/baremetal/tasks/post-install.yml b/ansible/roles/baremetal/tasks/post-install.yml
index 5fdc471b0..66286b629 100644
--- a/ansible/roles/baremetal/tasks/post-install.yml
+++ b/ansible/roles/baremetal/tasks/post-install.yml
@@ -168,7 +168,9 @@
     - enable_chrony | bool

 - name: Remove apparmor profile for chrony
-  command: apparmor_parser -R /etc/apparmor.d/usr.sbin.chronyd
+  shell: |
+    apparmor_parser -C /etc/apparmor.d/usr.sbin.chronyd
+    apparmor_parser -R /etc/apparmor.d/usr.sbin.chronyd
   become: True
   when:
     - ansible_os_family == "Debian"
```

Of course, this may not work with older versions that do not have AppArmor enabled for chronyd to begin with.

Thus, to fix this in kolla for different versions, we need a slightly different approach:
1. Do not execute -R by default. Execute removal (-R) of the policy if and only if:
  - the apparmor status command (apparmor_status --json) succeeds
  - apparmor has the chronyd policy set (according to the apparmor_status --json output)

I will try to create this fix and we can take it from there.
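
The two conditions above can be sketched as a small decision function. This is only an illustration in Python, not the kolla-ansible implementation: the function name `chronyd_profile_loaded` is hypothetical, and the sample JSON strings are an assumed, trimmed shape of the `apparmor_status --json` output (a `profiles` mapping keyed by profile name).

```python
import json


def chronyd_profile_loaded(status_rc: int, status_stdout: str,
                           profile: str = "/usr/sbin/chronyd") -> bool:
    """Return True only if the status command succeeded and lists the profile."""
    if status_rc != 0:
        # The status command failed (for example AppArmor is not available),
        # so removal should be skipped.
        return False
    try:
        status = json.loads(status_stdout)
    except json.JSONDecodeError:
        # Output that is not valid JSON is treated as "profile not loaded".
        return False
    if not isinstance(status, dict):
        return False
    profiles = status.get("profiles") or {}
    return profile in profiles


# Assumed sample outputs, trimmed for illustration only.
before_removal = '{"version": "1", "profiles": {"/usr/sbin/chronyd": "enforce"}}'
after_removal = '{"version": "1", "profiles": {}}'

print(chronyd_profile_loaded(0, before_removal))
print(chronyd_profile_loaded(0, after_removal))
```

With a check like this, -R would only run while the profile is still listed, which is what the second deploy run needs.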

Ana Peric (anperic)
description: updated
Mark Goddard (mgoddard) wrote :

Hi, we had a recent fix for a similar issue for the libvirtd profile. Would something like it work?

https://opendev.org/openstack/kolla-ansible/commit/891ec51dd417af894f7dde0dfa68b2333f497dcf

Changed in kolla-ansible:
importance: Undecided → Medium
status: New → Triaged
Ana Peric (anperic) wrote :

Hi Mark,

Thank you for looking into it.

We solved this in a similar way in our automation, which is triggered before kolla-ansible. We actually tricked it by placing the profile in complain mode first, so that when kolla runs it is always able to remove the profile.

Anyway, I think this fix would work. I will test it quickly today (applied to chrony, of course) and let you know.

Thanks & cheers,
Ana

Ana Peric (anperic) wrote :

Hi Mark, Team,

The disabled-profile approach will definitely not work on Debian, because the profile is not moved to the disabled folder when it is removed with -R.

I took another approach that works properly in both cases:
1. On pass 1 (the first run of kolla deploy)
2. On the second or any subsequent pass

The change is as follows:

diff --git a/ansible/roles/baremetal/tasks/post-install.yml b/ansible/roles/baremetal/tasks/post-install.yml
index 5fdc471b0..0d9960285 100644
--- a/ansible/roles/baremetal/tasks/post-install.yml
+++ b/ansible/roles/baremetal/tasks/post-install.yml
@@ -167,6 +167,12 @@
     - ansible_os_family == "Debian"
     - enable_chrony | bool

+- name: Get status of chronyd apparmor profile
+  command: apparmor_status --json
+  become: True
+  register: apparmor_status
+  ignore_errors: True
+
 - name: Remove apparmor profile for chrony
   command: apparmor_parser -R /etc/apparmor.d/usr.sbin.chronyd
   become: True
@@ -174,6 +180,8 @@
     - ansible_os_family == "Debian"
     - enable_chrony | bool
     - apparmor_chronyd_profile.stat.exists
+    - apparmor_status is succeeded
+    - (apparmor_status.stdout | from_json).profiles["/usr/sbin/chronyd"] | default(false)

 - name: Create docker group
   group:

###
The idea behind it is that the only reliable way to check whether the profile has been removed is to look at the apparmor status and check whether chronyd is listed there.
The config file does not get deleted (so this stat does not say much), and the disabled profile folder is always empty (unless we create a symlink when we -R).
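
To illustrate why checking the status first makes repeated runs safe, here is a rough Python sketch of the same flow outside Ansible. It is not the proposed patch: `remove_chronyd_profile`, `run_command`, and the injectable `run` parameter are hypothetical names, and the real commands need root privileges on a host with AppArmor.

```python
import json
import subprocess
from typing import Callable, List

PROFILE_NAME = "/usr/sbin/chronyd"
PROFILE_FILE = "/etc/apparmor.d/usr.sbin.chronyd"

Runner = Callable[[List[str]], subprocess.CompletedProcess]


def run_command(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run a command, returning a failed result instead of raising."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # The binary is not installed; report it as a non-zero exit code.
        return subprocess.CompletedProcess(cmd, 127, "", "command not found")


def remove_chronyd_profile(run: Runner = run_command) -> bool:
    """Unload the chronyd profile only if AppArmor reports it as loaded.

    Returns True only if the profile was loaded and the removal succeeded.
    """
    status = run(["apparmor_status", "--json"])
    if status.returncode != 0:
        return False  # status unavailable: do not attempt -R
    try:
        profiles = json.loads(status.stdout).get("profiles") or {}
    except (json.JSONDecodeError, AttributeError):
        return False  # unexpected output: do not attempt -R
    if PROFILE_NAME not in profiles:
        return False  # already removed by an earlier run
    result = run(["apparmor_parser", "-R", PROFILE_FILE])
    return result.returncode == 0
```

Passing a fake `run` callable makes it possible to dry-run the logic: on a first pass the removal is attempted, and on any later pass the function returns early because the profile is no longer listed.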

If you are OK with this path, maybe we can make the code change and send it for review.

Thanks & cheers,
Ana

Mark Goddard (mgoddard) wrote :

Hi Ana, I'm no expert in Apparmor yet, but if you test the change and submit it for review, we can discuss it there.
