unattended upgrade ran one day after schedule

Bug #1824088 reported by Carl Winbäck
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
apt (Ubuntu)
Triaged
Low
Unassigned

Bug Description

I have noticed a strange behaviour in unattended upgrades. The host in question is scheduled to run upgrades on Mondays every second week. However, this week it ran upgrades on Tuesday instead (2019-04-09).

This is the first time I’ve noticed this behaviour. I checked the logs in /var/log/apt/history.log* and I saw that worked as intended up until this week. Upgrades ran as expected on 2019-03-11 and 2019-03-25, which where both Mondays.

The schedule was set with the line 'APT::Periodic::Unattended-Upgrade "14";' in the file /etc/apt/apt.conf.d/20auto-upgrades.

Could it be daylight savings time that has caused skewing of the schedule?

The server is located in Sweden and on 2019-03-31 we switched from CET to CEST. If the time diff is calculated with hours instead of calendar days passed, perhaps the missing hour on 2019-03-31 caused the scheduler to believe that on Monday 2019-03-08, two weeks (336 hours) had not yet passed.

ADDITIONAL INFO

Description: Ubuntu 16.04.6 LTS
Release: 16.04

apt:
  Installed: 1.2.29ubuntu0.1
  Candidate: 1.2.31
  Version table:
     1.2.31 500
        500 http://se.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
 *** 1.2.29ubuntu0.1 500
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
        100 /var/lib/dpkg/status
     1.2.10ubuntu1 500
        500 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

unattended-upgrades:
  Installed: 0.90ubuntu0.10
  Candidate: 0.90ubuntu0.10
  Version table:
 *** 0.90ubuntu0.10 500
        500 http://se.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://se.archive.ubuntu.com/ubuntu xenial-updates/main i386 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main i386 Packages
        100 /var/lib/dpkg/status
     0.90 500
        500 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
        500 http://se.archive.ubuntu.com/ubuntu xenial/main i386 Packages

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: apt 1.2.29ubuntu0.1
ProcVersionSignature: Ubuntu 4.4.0-142.168-generic 4.4.167
Uname: Linux 4.4.0-142-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.18
Architecture: amd64
Date: Wed Apr 10 09:06:49 2019
InstallationDate: Installed on 2017-12-28 (467 days ago)
InstallationMedia: Ubuntu-Server 16.04.3 LTS "Xenial Xerus" - Release amd64 (20170801)
SourcePackage: apt
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Carl Winbäck (giffengrabber) wrote :
Revision history for this message
Carl Winbäck (giffengrabber) wrote :
Revision history for this message
Magnus Jonsson (kmjonsson) wrote :

This IS related to daylight saving.

The script: /usr/lib/apt/apt.systemd.daily check that the script is not run twice at the same period.

The script checks the date of today and the date of the last timestamp to determine the the diff.
The interval is calculated using the number of seconds per day multiplied by number of days.

If a day in the interval has less then 24h this check fails.

This also affects the apt-daily.service that updates.

This is not only Xenial but all the way up to 20.04/Focal.

Revision history for this message
Julian Andres Klode (juliank) wrote :

Yes we know that the script does not necessarily run once a day, that's not a surprise.

Revision history for this message
Julian Andres Klode (juliank) wrote :

It's hardly and issue given that the script runs twice per day, which compensates for that by running it after 36 hours instead of 24.

Revision history for this message
Magnus Jonsson (kmjonsson) wrote :

It will not run after 36h either. It will not run until next day because "yesterday" did not have 24 hours until next day.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in apt (Ubuntu):
status: New → Confirmed
Revision history for this message
Carl Winbäck (giffengrabber) wrote :

If we look at lines 131 and 132 in /usr/lib/apt/apt.systemd.daily, we will se the following:

interval="${interval%d}"
interval=$((interval*60*60*24))

I’m not 100% certain but I think that this time logic is what cause the bug to occur, in accordance with what Magnus has pointed out in his comments.

(It’s possible that lines 122-130 are also relevant.)

Any ideas on how to modify the code in order to avoid this bug?

Revision history for this message
Carl Winbäck (giffengrabber) wrote :

How about this idea?

What if we add this line to the beginning of usr/lib/apt/apt.systemd.daily:

export TZ="UTC0"

(I got this idea from the GNU Coreutils manual. See https://www.gnu.org/software/coreutils/manual/html_node/Relative-items-in-date-strings.html)

Revision history for this message
Magnus Jonsson (kmjonsson) wrote :

Setting TZ for the complete script will not work as it will make strange behaviour if you are far away from UTC/GMT TZ. You can set it for just the date command that returns the epoch time.

I did a patch that does that for the bionic version. It also applys directly on Focal.

I attach the patch to the bug.

Revision history for this message
Magnus Jonsson (kmjonsson) wrote :
Revision history for this message
Carl Winbäck (giffengrabber) wrote :

Awesome work Magnus. Well done!

Since the bug is related to APT, is Debian then considered the upstream of this source?

Would it be appropriate to also file a bug in their bug tracker? If so, I can arrange that.

Revision history for this message
Julian Andres Klode (juliank) wrote :

I'm not convinced there is a bug here or that the patch actually fixes it. We convert our times to Unix time stamps, which are independent of time zones and DST.

The proposed patch should not be doing anything.

And no, we apt folks do not need or want a duplicate bug report in Debian.

Revision history for this message
Julian Andres Klode (juliank) wrote :

I know that we can sometimes miss a beat so to say, but fixing this will require fixing systemd to allow retrying of failed timer services, and then removing the options for intervals inside of apt.

Revision history for this message
Julian Andres Klode (juliank) wrote :

It seems reasonable that setting TZ=UTC0 for date --iso-8601 could help (as that's where a time zone is relevant, the +%s conversion it is not relevant for).

But then this causes unwanted behaviour for people far off UTC, I guess, so it's not a solution - It's entirely unclear to me whether a solution exists given that we allow arbitrary intervals in settings.

Always-on machines (or well, always connected machines) should be configured with the interval set to 0 (disabling timestamp checks), and the timer should be configured to run at the appropriate times.

To resolve issues with skipped runs elsewhere, we really need to resolve the issue that systemd can't reschedule failed timer services, and then remove the options inside apt.conf, as having dynamic timers in systemd and then a fixed check inside the script does not really work (hence why we annoyingly run it twice a day to give it a chance to fixup time skew).

Changed in apt (Ubuntu):
status: Confirmed → Won't Fix
status: Won't Fix → Triaged
Revision history for this message
Julian Andres Klode (juliank) wrote :

Arguably, a workaround might be to remove support for the interval options now, and then treat it as "one day", and just check that %Y-%m-%d for today != the one for the stamp.

This addresses the problem with DST, and breaks every non-default configuration. Which is something we will have to live with eventually anyway I guess.

So basically speaking, under systemd, intervals will always be "not the same day", other settings will be ignored and produce a warning, and you're supposed to configure the timer. This still allows us to play catch up by running the timer multiple times per day, until we get a fixed systemd.

Under non-systemd systems in Debian, stuff will continue to work as it does now.

Revision history for this message
Julian Andres Klode (juliank) wrote :

Oh but we have the other issue where we have more interval options than systemd services. Sigh.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "apt.systemd.daily.diff" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
tags: removed: patch
Revision history for this message
Julian Andres Klode (juliank) wrote :

https://salsa.debian.org/apt-team/apt/-/merge_requests/115 implements the proposed change for daily runs, and warns if intervals are configured to anything other than 1 or 0 (and soon, always), telling people to configure their systemd timer instead.

Revision history for this message
Magnus Jonsson (kmjonsson) wrote :

The interesting about the code is that it always only treat the timestamp of the file as "YYYY-MM-DD" and nothing else (this also includes "now").

The code has functions for allowing the interval to be set down to 1s but is does not mean anything to the outcome of the script as it always looks for complete day(s).

I can only guess that the code might have has some old reference where the script (or an other script) did something else.

As of the output of "date +%s" it is dependent of the timezone when used with the --date command.

It kind of looks like the "date" is inputed in the local timezone rather witch will make epoch time drift dependent of TZ.

# Bionic
$ TZ=UTC0 date --date="2020-01-01" +%s
1577836800
$ TZ=CET date --date="2020-01-01" +%s
1577833200

Using "date -u" will also fix this behaviour of date.

$ TZ=UTC date -u --date="2020-01-01" +%s
1577836800
$ TZ=CET date -u --date="2020-01-01" +%s
1577836800

An easy fix for the original problem would be add the "-u" option to the date commands in the script:

stamp=$(date -u --date="$(date -r "$stamp_file" --iso-8601)" +%s 2>/dev/null)
now=$(date -u --date="$(date --iso-8601)" +%s 2>/dev/null)

Revision history for this message
Julian Andres Klode (juliank) wrote :

But if I live in UTC+11, I don't want it to run when the day has changed in UTC+0?

As for date +%s being time zone dependent, hmm, I thought iso-8601 included a timezone, but it does not.

A workaround for DST that might work for non-1-day intervals could be to change the length of a day to 22 hours:

$ git diff
diff --git a/debian/apt.systemd.daily b/debian/apt.systemd.daily
index 0d336ddb2..567316df9 100755
--- a/debian/apt.systemd.daily
+++ b/debian/apt.systemd.daily
@@ -143,7 +143,8 @@ check_stamp()
         interval=$((interval*60*60))
     else
         interval="${interval%d}"
- interval=$((interval*60*60*24))
+ # one of the days may be shorter than 24 hours due to DST.
+ interval=$(((interval-1)*60*60*24 + 60*60*22))
     fi

     debug_echo "check_stamp: interval=$interval, now=$now, stamp=$stamp, delta=$delta (sec)"

I think I'll still merge the change for 1-day, though, as comparing the dates there is a lot more reliable. It will also deal with you changing time zones, I guess.

Revision history for this message
Magnus Jonsson (kmjonsson) wrote :

The thing is that it's only the delta that is handled in UTC not the iso-8601-part so it will work just fine.

My first idé was to fix the interval calculations must like yours.

The workaround for us was to set interval to always. We have local mirrors that can handle the load just fine even if we do updates every hour.

Revision history for this message
Julian Andres Klode (juliank) wrote :

It seems systemd actually supports restarting oneshot services now, so any options for timestamps will be removed (anything other than 0 will behave as always), and systemd timers will have to be configured via drop-ins instead. Finally.

That said, it seems a bit late for focal now.

Hmm, yeah, I guess that works, because we then take the date in UTC and compare that.

Changed in apt (Ubuntu):
importance: Undecided → Low
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.