apt clients stuck on parallel access

Bug #1983856 reported by Alexander Gaengel
This bug affects 20 people
Affects: apt-cacher-ng (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Ubuntu 22.04.1 LTS; apt-cacher-ng 3.7.4-1build1

If multiple clients (tested: 10-20) access apt-cacher-ng simultaneously with the very same requests (orchestrated by Ansible here), one or more of the apt-get clients seem to hang indefinitely. I don't see any related logs on the client side, only the processes running without consuming CPU time:

_apt 82650 82271 0 12:55 ? 00:00:00 /usr/lib/apt/methods/http
_apt 82651 82271 0 12:55 ? 00:00:00 /usr/lib/apt/methods/http
_apt 82653 82271 0 12:55 ? 00:00:00 /usr/lib/apt/methods/gpgv
_apt 82808 82271 0 12:55 ? 00:00:00 /usr/lib/apt/methods/store
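A minimal sketch for spotting such helpers on a client, assuming the `ps -ef` column layout shown above (UID PID PPID C STIME TTY TIME CMD). The function name `idle_apt_methods` is made up for illustration, and zero CPU time alone does not prove a hang; it only narrows down which processes to look at.

```python
# Sketch: list apt method helpers that have accumulated no CPU time.
APT_METHODS_DIR = "/usr/lib/apt/methods/"


def idle_apt_methods(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, command) for apt method processes whose TIME is 00:00:00."""
    idle = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the command keeps its arguments.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        _uid, pid, _ppid, _c, _stime, _tty, cpu_time, cmd = fields
        if not pid.isdigit():
            continue  # header line or malformed input
        if cmd.startswith(APT_METHODS_DIR) and cpu_time == "00:00:00":
            idle.append((int(pid), cmd))
    return idle


sample = """\
_apt 82650 82271 0 12:55 ? 00:00:00 /usr/lib/apt/methods/http
_apt 82808 82271 0 12:55 ? 00:00:00 /usr/lib/apt/methods/store
"""
print(idle_apt_methods(sample))
```

On a real client the input would be the text printed by `ps -ef`.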

On the apt-cacher-ng side, I see the following in apt-cacher.err:

Mon Aug 8 12:55:34 2022|Failure to move file /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease out of the way or cannot create /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease.1659956134 - errno: File exists
Mon Aug 8 12:55:34 2022|Error creating file item for ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease -- check file permissions!

The file permissions are _not_ the problem; it seems to be some kind of race condition?
I don't even know if it's related, because I get such error messages even on a successful run without stuck clients:

Mon Aug 8 13:28:57 2022|/data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy/InRelease.1659958137 storage error [Checked size beyond EOF], check file AND directory permissions, last errno: File exists
Mon Aug 8 13:28:57 2022|/data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease.1659958137 storage error [Checked size beyond EOF], check file AND directory permissions, last errno: File exists
Mon Aug 8 13:28:57 2022|/data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-backports/InRelease.1659958137 storage error [Checked size beyond EOF], check file AND directory permissions, last errno: File exists
Mon Aug 8 13:28:57 2022|Failure to move file /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease out of the way or cannot create /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease.1659958137 - errno: File exists
Mon Aug 8 13:28:57 2022|Error creating file item for ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease -- check file permissions!
Mon Aug 8 13:28:57 2022|Failure to move file /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease out of the way or cannot create /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease.1659958137 - errno: File exists
Mon Aug 8 13:28:57 2022|Error creating file item for ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease -- check file permissions!
Mon Aug 8 13:28:57 2022|Failure to move file /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-backports/InRelease out of the way or cannot create /data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-backports/InRelease.1659958137 - errno: File exists
Mon Aug 8 13:28:57 2022|Error creating file item for ubuntu.mirror.lrz.de/ubuntu/dists/jammy-backports/InRelease -- check file permissions!
Mon Aug 8 13:28:57 2022|/data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/focal/InRelease.1659958137 storage error [Checked size beyond EOF], check file AND directory permissions, last errno: Success
Mon Aug 8 13:28:58 2022|/data/apt-cacher/ubuntu.mirror.lrz.de/ubuntu/dists/jammy-updates/InRelease.1659958138 storage error [Checked size beyond EOF], check file AND directory permissions, last errno: File exists
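The numeric suffix in the InRelease.<number> names looks like a Unix timestamp in whole seconds: 1659956134 decodes to 10:55:34 UTC on 8 Aug 2022, which would match the 12:55:34 log line if the server clock is on UTC+2 (an assumption). The sketch below is not apt-cacher-ng's code. It only illustrates one way the "File exists" errno could arise, assuming the temporary name is derived from a one-second timestamp and created exclusively: the second request within the same second then fails with EEXIST. The helper names are hypothetical.

```python
import errno
import os
import tempfile
from datetime import datetime, timezone


def decode_suffix(name: str) -> datetime:
    """Interpret the numeric suffix of e.g. 'InRelease.1659956134' as Unix seconds (UTC)."""
    return datetime.fromtimestamp(int(name.rsplit(".", 1)[1]), tz=timezone.utc)


def create_exclusive(path: str) -> int:
    """Create path only if it does not exist yet; return 0, or the errno on failure."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


print(decode_suffix("InRelease.1659956134"))

with tempfile.TemporaryDirectory() as cache_dir:
    # Two "requests" computing the same second-resolution name.
    target = os.path.join(cache_dir, "InRelease.1659956134")
    first = create_exclusive(target)
    second = create_exclusive(target)
    print(first, errno.errorcode.get(second))
```

Whether apt-cacher-ng actually works this way would have to be checked against its source.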

Tags: jammy

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in apt-cacher-ng (Ubuntu):
status: New → Confirmed

Christian Rohmann (christian-rohmann) wrote :

I can confirm this issue for parallel calls of e.g. "/usr/bin/apt update" on 15 clients, which then show:

--- cut ---
[...]
W: Failed to fetch http://de.archive.ubuntu.com/ubuntu/dists/focal/InRelease 503 Error with cache data, please consult apt-cacher.err [IP: xx.xx.xx.xx 3142]
W: Some index files failed to download. They have been ignored, or old ones used instead.
[...]
--- cut ---

Robert Hrovat (robi-hipnos) wrote :

It happens even with fewer clients. I run this with 5-10 at the same time and get these errors on lots of them.

Mathieu MD (mathieu.md) wrote :

I'm afraid that even with a single client it seems to randomly log this line:
  storage error [Checked size beyond EOF], check file AND directory permissions, last errno: File exists

I'm deleting the failed temporary files:
  sudo find /var/cache/apt-cacher-ng/ -name 'InRelease.[0-9]*' -delete
But it only helps a bit, and not reliably, for newly started apt-get client processes.
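For anyone who wants to look before deleting, here is a small dry-run sketch that only lists the leftover files. It assumes the same cache root as the find command above; the function name is made up. Note that the regex is slightly stricter than the `InRelease.[0-9]*` glob: it requires the whole suffix to be digits.

```python
import re
from pathlib import Path

# Matches names such as InRelease.1659958137, but not InRelease or InRelease.gpg.
TEMP_NAME = re.compile(r"^InRelease\.\d+$")


def leftover_inrelease_files(cache_root: str) -> list[Path]:
    """Return timestamp-suffixed InRelease files below cache_root, sorted by path."""
    root = Path(cache_root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("InRelease.*")
        if p.is_file() and TEMP_NAME.match(p.name)
    )
```

Calling `leftover_inrelease_files("/var/cache/apt-cacher-ng")` returns the candidates without touching them; reading the cache directory may need root.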

When a client fails to update its cache, it says:
  500 Cache Error, check apt-cacher.err [IP : x.x.x.x 3142]

However, that's quite an annoying bug, because it breaks the clients' unattended upgrades: their apt-get process waits forever, which blocks any other apt-get! If not supervised, we would miss security updates without even knowing it!
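One possible stopgap on the client side is to run apt-get under a hard deadline so that a hung run cannot block later ones forever. This is only a sketch and not something tested against this bug: `run_with_deadline` is a made-up helper, and `subprocess.run` kills only the direct child on timeout, so apt's method helpers may need separate cleanup.

```python
import subprocess
import sys
from typing import Optional


def run_with_deadline(cmd: list[str], seconds: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it exceeded the deadline."""
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the direct child here.
        return None


# Demo with a harmless child: it would sleep for 30 s but is stopped after 1 s.
print(run_with_deadline([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

A real call might look like `run_with_deadline(["apt-get", "update"], 600)`, where the 600 seconds is an arbitrary example value.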

nicomen (mendoza-pvv) wrote :

Confirmed. This basically started happening when upgrading to Jammy. Happens with just two concurrent clients also.

Walter (wdoekes) wrote :

This is also a bug in apt-get, as it should not stall forever:
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/2003851

Ken Sharp (kennybobs)
tags: added: jammy

Mathieu MD (mathieu.md) wrote :

What alternatives for APT caching are you using in the meantime for servers which cannot reach the Internet directly?

Walter (wdoekes) wrote :

apt-cacher-ng version 3.7 from Ubuntu/Jammy was very troublesome.

We're now using 3.6.4-1 (or newer? but 3.6.x), based on a Debian/Bullseye image.

This alleviated the problem to a very high degree (or maybe even 100%).

Walter (wdoekes) wrote :

P.S. If you tag affects-me-too on 2003851, maybe apt can become a little sturdier in the face of intermittent network problems, which are bound to happen sometimes.

Mathieu MD (mathieu.md) wrote :

Thanks Walter. I had already marked your #2003851 as affecting me: I ran into it too (see #4 above), and it is indeed scary.

OK for Debian+ACNG3.6, but I wondered if there were alternatives to ACNG? Scalded cat fears cold water...

Guenther Grill (guenthgr) wrote :

We have the same issue and use 3.6.4-1

H4xor (jonathan-selea) wrote :

I confirm that this issue persists.
When installing a package on multiple machines, for example with Ansible, this error is triggered.
Is there any workaround at the moment? Or am I forced to look for alternatives, for example squid?

EDIT: I have actually installed squid on the same machine to provide access to apt packages as a temporary solution; I just set its listen port to the one apt-cacher-ng was using.

I plan to keep it that way until this issue is fixed.

Mathieu MD (mathieu.md) wrote :

@guenthgr Do you mean that 3.6.4-1 has, for you, the same bug as 3.7.4-1build1?

Koen Roggemans (koen-roggemans) wrote :

Same problem here with 3.7.4 on Ubuntu 22.04 server
We try to keep 1500 laptops up-to-date ...
I just upgraded from 3.3.1 to get rid of a blocking bug there (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=986749)... :-(

Guenther Grill (guenthgr) wrote :

@Mathieu: Yes, I meant that 3.6.x seems to have the same issue as 3.7.x

Walter (wdoekes) wrote :

After downgrading apt-cacher-ng from 3.7.x to 3.6.x in February we've hardly had any stalling apt anymore (a handful at most). Previously it was several a week (over hundreds of machines).

The change from 3.7.x (3.7.4-1build1, jammy) back to 3.6.x (3.6.4-1, bullseye) definitely improves the situation.

Dominic Davis-Foster (domdf) wrote :

I've been seeing this too but only from one machine and only for archive.ubuntu.com (gb.archive.ubuntu.com is fine, as are all other upstream repositories)

apt-cacher-ng 3.7.4 on Ubuntu 22.04
