Avoid stopping services on upgrade

Bug #2027613 reported by Mitch Burton
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Landscape Client
Fix Released
Undecided
Unassigned
landscape-client (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Committed
Undecided
Unassigned
Jammy
Fix Committed
Undecided
Unassigned

Bug Description

[ Impact ]

debian/rules override_dh_installsystemd uses --no-start to prevent the landscape-client service from being started post-install until landscape-config is run. This has the side-effect of the services stopping on upgrade so that when landscape upgrades itself, it cannot report the result.

Patching this will allow users to upgrade Landscape Client to this version (and future versions) and transition to the new version without interruption of the service.

The patch fixes this by using the default dh_installsystemd behaviour combined with a service ExecCondition that checks the registration status of Landscape Client. If registered, the service must be running, and so should be started again after upgrade.

[ Test Plan ]

The following tests will be performed, in addition to the ones laid out in https://wiki.canonical.com/Landscape/ClientSRUTests/23.02

On jammy and focal:

test 1: install version 19.12, register with Landscape Server, service is running, upgrade to version 23
expected behaviour: landscape-client service restarts, running the new version (23).

test 2: install version 19.12, register with Landscape Server, systemctl stop landscape-client, upgrade to version 23
expected behaviour: landscape-client service restarts, running the new version (23).

test 3: install version 19.12, register with Landscape Server, systemctl stop landscape-client, systemctl disable landscape-client, upgrade to version 23
expected-behaviour: landscape-client service does not start (as it is disabled), but when enabled, the new version (23) is running.

test 4: install version 19.23, do not register with Landscape Server, service is not running, upgrade to version 23
expected-behaviour: landscape-client service never runs, registration has not occurred.

[ Where problems could occur ]

As this is a change to the systemd unit behaviour, problems would probably present as the service failing to start after upgrade, or failing to stop, continuing to run the previous version.

Revision history for this message
Mitch Burton (mitchburton) wrote :
affects: landscape-client → landscape-client (Ubuntu)
Changed in landscape-client (Ubuntu):
status: New → Fix Released
Changed in landscape-client:
status: New → Fix Released
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks for outlining the motivation behind this, I think this is the right start.
But the discussion on LP: #2006402 has shown that people are rather unsure about it.
I realized my text got long so I split it in sections for readability, hope that helps.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Download full text (9.4 KiB)

#1 current situation

## Jammy

override_dh_installsystemd:
        dh_installsystemd --no-enable --no-start

Leading to
root@j:~# grep -C 5 landscape-client.service /var/lib/dpkg/info/landscape-client.p*
/var/lib/dpkg/info/landscape-client.postinst- # Migrate old sysvinit config to systemd
/var/lib/dpkg/info/landscape-client.postinst- DEFAULTS=/etc/default/landscape-client
/var/lib/dpkg/info/landscape-client.postinst- if [ -f $DEFAULTS ]; then
/var/lib/dpkg/info/landscape-client.postinst- RUN=$(. $DEFAULTS; echo $RUN)
/var/lib/dpkg/info/landscape-client.postinst- if [ "$RUN" = "1" ]; then
/var/lib/dpkg/info/landscape-client.postinst: systemctl enable landscape-client.service
/var/lib/dpkg/info/landscape-client.postinst- fi
/var/lib/dpkg/info/landscape-client.postinst- rm -f $DEFAULTS
/var/lib/dpkg/info/landscape-client.postinst- fi
/var/lib/dpkg/info/landscape-client.postinst- ;;
/var/lib/dpkg/info/landscape-client.postinst-
--
/var/lib/dpkg/info/landscape-client.postinst-fi
/var/lib/dpkg/info/landscape-client.postinst-
/var/lib/dpkg/info/landscape-client.postinst-# End automatically added section
/var/lib/dpkg/info/landscape-client.postinst-# Automatically added by dh_installsystemd/13.6ubuntu1
/var/lib/dpkg/info/landscape-client.postinst-if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
/var/lib/dpkg/info/landscape-client.postinst: if deb-systemd-helper debian-installed 'landscape-client.service'; then
/var/lib/dpkg/info/landscape-client.postinst- # This will only remove masks created by d-s-h on package removal.
/var/lib/dpkg/info/landscape-client.postinst: deb-systemd-helper unmask 'landscape-client.service' >/dev/null || true
/var/lib/dpkg/info/landscape-client.postinst-
/var/lib/dpkg/info/landscape-client.postinst: if deb-systemd-helper --quiet was-enabled 'landscape-client.service'; then
/var/lib/dpkg/info/landscape-client.postinst- # Create new symlinks, if any.
/var/lib/dpkg/info/landscape-client.postinst: deb-systemd-helper enable 'landscape-client.service' >/dev/null || true
/var/lib/dpkg/info/landscape-client.postinst- fi
/var/lib/dpkg/info/landscape-client.postinst- fi
/var/lib/dpkg/info/landscape-client.postinst-
/var/lib/dpkg/info/landscape-client.postinst- # Update the statefile to add new symlinks (if any), which need to be cleaned
/var/lib/dpkg/info/landscape-client.postinst- # up on purge. Also remove old symlinks.
/var/lib/dpkg/info/landscape-client.postinst: deb-systemd-helper update-state 'landscape-client.service' >/dev/null || true
/var/lib/dpkg/info/landscape-client.postinst-fi
/var/lib/dpkg/info/landscape-client.postinst-# End automatically added section
/var/lib/dpkg/info/landscape-client.postinst-
/var/lib/dpkg/info/landscape-client.postinst-
/var/lib/dpkg/info/landscape-client.postinst-exit 0
--
/var/lib/dpkg/info/landscape-client.postrm-fi
/var/lib/dpkg/info/landscape-client.postrm-# End automatically added section
/var/lib/dpkg/info/landscape-client.postrm-# Automatically added by dh_installsystemd/13.6ubuntu1
/var/lib/dpkg/info/landscape-c...

Read more...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

#2 What do we want

Let me try to start this here in more detail.
I think we need a table here to list all desired behavior in the different possible states.
And then look for a solution providing all of that.
I know this is more intense now, but hopefully allows us to fix it just once :-)
(I hope this is monospace for all of you...)

This is what I think you described mitch, please correct me.

Action | Situation | Solution |
Pkg-install | no config, should not start | --no-enable --no-start |
------------|-----------------------------------|-----------------------------------|
Pkg-upgrade | if it runs, do not stop it | --no-enable --no-start |
            | because it would stay off | + --no-stop-on-upgrade |
------------|-----------------------------------|-----------------------------------|
Pkg-upgrade | if does not run, do not start | --no-enable --no-start |
------------|-----------------------------------|-----------------------------------|

#3 What do we really want?

Now the problem with that is that just using the predefined `--no-enable --no-start` or the same with `--no-stop-on-upgrade` won't do all you need.

The table should IMHO actually look like (please really correct me if that is wrong):

Action | Situation | Solution |
Pkg-install | no config, should not start | --no-enable --no-start |
------------|-----------------------------------|-----------------------------------|
Pkg-install | config present, e.g. by config | neither of --no-enable --no-start |
            | or from old install or any | would be allowed |
            | orchestration SW, should start | |
------------|-----------------------------------|-----------------------------------|
Pkg-upgrade | if it runs, restart it to pick up | --no-stop-on-upgrade not allowed |
            | the new code, imagine you have a | |
            | CVE - you need to restart | |
------------|-----------------------------------|-----------------------------------|
Pkg-upgrade | if not configure, do not touch it | need to check if configured |
------------|-----------------------------------|-----------------------------------|

There might be more, feel free to add them to help any sponsor to think with you about it.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

# Suggestion to think about

So IMHO you have a service that is neither never nor always (re-)started.
And IMHO the arguments to dh_installsystemd are never enough to express that.

But systemd has mechanisms for such, for example [1].

I haven't dealt with all your config.
There is /var/lib/landscape/client/ and /etc/landscape/client.conf and sometimes landscape-client.conf.
That is up to you to know better which might be good.

Sadly most of the checks are for existence and your files all exist on install, they are just not fully ready yet. I registered a client and compared the config pre/post attach.

I realize that having or not-having account_name in /etc/landscape/client.conf might be the perfect condition. But systemd can't check on file content AFAIK. What would you think of using a /etc/landscape/registered or anything in /var/lib/landscape/client that you think you can rely on to be present "after config" but "before service start".

Gladly systemd adds a special way for everything :-)
For "advanced" conditions there is [2] which you might use here.
And you have - in the new version - gladly `landscape-config --is-registered`

I'm not sure how this internally works, --is-registered needs to be able to return true after being configured, but before the service has been started. I think you check /var/lib/landscape/client/broker.bpickle which might only exist after the first start, but I'm sure you could work something out that solves this and then would really cover all use-cases better.

As an example something like the following might cover all of the above cases much better (please check and amend with your subject matter expertise):

$ git diff
diff --git a/debian/landscape-client.service b/debian/landscape-client.service
index 6dc721a3..aa5c0261 100644
--- a/debian/landscape-client.service
+++ b/debian/landscape-client.service
@@ -11,6 +11,7 @@ WantedBy=multi-user.target
 [Service]
 Type=simple
 Group=landscape
+ExecCondition=/usr/bin/landscape-config --is-registered
 ExecStart=/usr/bin/landscape-client
 # Don't kill cgroup as child dpkg may restart the service during an upgrade.
 KillMode=process
diff --git a/debian/rules b/debian/rules
index ad30f5c9..d6a156c6 100755
--- a/debian/rules
+++ b/debian/rules
@@ -23,4 +23,4 @@ override_dh_auto_install:
        install -D -o root -g root -m 755 apt-update/apt-update $(CURDIR)/$(root_dir)$(LIBDIR)/apt-update

 override_dh_installsystemd:
- dh_installsystemd --no-enable --no-start --no-stop-on-upgrade
+ dh_installsystemd

[1]: https://www.freedesktop.org/software/systemd/man/systemd.unit.html#ConditionFileNotEmpty=
[2]: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecCondition=

Revision history for this message
Mitch Burton (mitchburton) wrote (last edit ):

Thanks, Christian. I really like the idea of using ExecCondition and going back to default dh_installsystemd.

A little bit of info regarding landscape-config --is-registered and configuration/registration in general. When landscape-config is run to register landscape, after /etc/landscape/client.conf is updated with the provided values, landscape-config runs `systemctl restart landscape-client.service` is run to complete the registration process with the server. This is the "first start" of the service, and if it doesn't happen, then there's no possibility of registration being completed successfully.

Basically /var/lib/landscape/client/broker.bpickle existing is a prerequisite of being registered. There's no way for --is-registered to be able to return true before the service has been started because the service starting is required for registration in the first place.

I'm going to go ahead and test out the proposed patch.

Revision history for this message
Mitch Burton (mitchburton) wrote :

Testing summary, tests run on jammy:

test 1: installed version 19.12-0ubuntu13, registered, service is running, upgraded to version 23
result: version 23 is now the running service

test 2: installed version 19.12-0ubuntu13, registered, systemctl stop landscape-client, upgraded to version 23
result: version 23 is now the running service

test 3: installed version 19.12-0ubuntu13, registered, systemctl stop landscape-client, systemctl disable landscape-client, upgraded to version 23
result: service is not running

test 4: installed version 19.23-0ubuntu13, did not register, service is not running, upgraded to version 23
result: service is not running

Repeated tests on focal with same results (expect upgrading from version 19.12-0ubuntu4.3).

The version I used for these tests can be found in this PPA: https://launchpad.net/~mitchburton/+archive/ubuntu/landscape-client-sru-tests

I think this is almost an adequate solution, but it's still concerning that a manually stopped (but not disabled) service is restarted on upgrade.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I'm happy that it seems to solve all your cases

I do not think that remaining concern of yours is a real concern we need to have.
That is the standard behavior and usually "unless you have a reason, you want to be normal".
Here an example of a service people consider mundane:

root@m:~# apt install apache2
root@m:~# systemctl status apache2 | head -n 3
● apache2.service - The Apache HTTP Server
     Loaded: loaded (/lib/systemd/system/apache2.service; enabled; preset: enabled)
     Active: active (running) since Mon 2023-07-24 06:04:24 UTC; 1min 29s ago
root@m:~# systemctl stop apache2
root@m:~# systemctl status apache2 | head -n 3
○ apache2.service - The Apache HTTP Server
     Loaded: loaded (/lib/systemd/system/apache2.service; enabled; preset: enabled)
     Active: inactive (dead) since Mon 2023-07-24 06:06:15 UTC; 2s ago
root@m:~# apt install --reinstall apache2
...
root@m:~# systemctl status apache2 | head -n 3
● apache2.service - The Apache HTTP Server
     Loaded: loaded (/lib/systemd/system/apache2.service; enabled; preset: enabled)
     Active: active (running) since Mon 2023-07-24 06:06:34 UTC; 5s ago

So it behaves just like yours.
Remember, stopped services would also come back up on reboot.
People know (or need to learn and understand) that "stop" really just means "stop now".

And your own tests confirm that "disable" does well what it is intended to do.

It is up to you, but my gut feeling says that with this current proposed solution it would be good?
If you agree you want to get that uploaded to mantic and then augment the ongoing SRUs to include the same.

Revision history for this message
Mitch Burton (mitchburton) wrote :

Right, thanks for making the stop/disabled difference incredibly clear. I'm happy with the proposed solution and will get it merged upstream and update the SRUs to include it. Thanks again!

Revision history for this message
Lucas Kanashiro (lucaskanashiro) wrote :

Since this is going to be fixed via SRU I created tasks for Focal and Jammy. Could you please add the SRU template to the bug description?

https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template

description: updated
Revision history for this message
Steve Langasek (vorlon) wrote :

this is looking good, but the test plan in the description needs to include a description of the expected behavior for each test, not just stating what you will run.

Changed in landscape-client (Ubuntu Jammy):
status: New → Incomplete
Revision history for this message
Mitch Burton (mitchburton) wrote :

I've updated the description with the expected behaviour for each test.

description: updated
Revision history for this message
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Mitch, or anyone else affected,

Accepted landscape-client into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/landscape-client/23.02-0ubuntu1~22.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in landscape-client (Ubuntu Jammy):
status: Incomplete → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello Mitch, or anyone else affected,

Accepted landscape-client into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/landscape-client/23.02-0ubuntu1~20.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in landscape-client (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed-focal
Revision history for this message
Mitch Burton (mitchburton) wrote :

In the process of verification testing, I discovered a gap in my previous testing: I never tested simply installing the new version of the package and running `landscape-config`.

Unfortunately `landscape-config` relies on starting the service to register and cannot because of the new ExecCondition, which requires registration to already have happened! I can think of a few different ways to resolve this:

  1. restructure `landscape-config` so that it doesn't need to start the service to register (nontrivial amount of work)
  2. modify `landscape-config --is-registered` so that it is aware that we are currently in the `landscape-config` wizard
  3. change the ExecCondition to something like
     /usr/bin/landscape-config --is-registered || ! systemctl is-active landscape-client.service
     so that we can also start the service when it is manually enabled, such as the case where we are not registered but `landscape-config` is being run

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.