The unminimize command fails to reinstall packages with missing files in the `share` and `locale` directories

Bug #1996489 reported by Naomi Rose
This bug affects 7 people
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| cloud-images | Fix Released | Undecided | Unassigned | |
| livecd-rootfs (Ubuntu) | Fix Released | Undecided | Utkarsh Gupta | |
| Jammy | New | Undecided | Utkarsh Gupta | |
| Lunar | New | Undecided | Utkarsh Gupta | |

Bug Description

I am building a custom Docker image based on:

    https://github.com/microsoft/vscode-dev-containers/tree/main/containers/ubuntu

I added a line that runs `yes | unminimize`:

    https://github.com/nomirose/devcontainer/commit/1ba51c651c06b9ce71c73e7bf016939d989cfa2c#diff-4c1f94864a9642897e2fa6c1d532830b6d1a2ba9d6b6f2f149178dde32cf0e77R10-R11

Here is the build log:

https://github.com/nomirose/devcontainer/actions/runs/3451129967/jobs/5760151170#step:6:1565

And here are the errors from the `unminimize` command:

    #9 47.54 Reinstalling packages with system documentation in /usr/share/doc/ ..
    #9 48.76 dpkg-query: error: --search needs at least one file name pattern argument
    #9 48.76
    #9 48.76 Use --help for help about querying packages.
    #9 48.80 Reading package lists...
    #9 49.49 Building dependency tree...
    #9 49.62 Reading state information...
    #9 49.76 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
    #9 49.76 Restoring system translations...
    #9 50.97 dpkg-query: error: --search needs at least one file name pattern argument
    #9 50.97
    #9 50.97 Use --help for help about querying packages.
    #9 51.01 Reading package lists...
    #9 51.71 Building dependency tree...
    #9 51.85 Reading state information...
    #9 52.00 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
    #9 53.18 Documentation has been restored successfully.
    #9 DONE 54.3s

Specifically, the error is:

    dpkg-query: error: --search needs at least one file name pattern argument

I copied `/usr/local/sbin/unminimize` to my local directory to see if I could debug what was going on.

When looking for packages with missing files in the `/usr/share/man/` directory, the script does this:

    dpkg -S /usr/share/man/ |sed 's|, |\n|g;s|: [^:]*$||' | DEBIAN_FRONTEND=noninteractive xargs apt-get install --reinstall -y

However, for the `/usr/share/doc/` directory, the script does this:

    dpkg --verify --verify-format rpm | awk '/..5...... \/usr\/share\/doc/ {print $2}' | sed 's|/[^/]*$||' | sort |uniq \
        | xargs dpkg -S | sed 's|, |\n|g;s|: [^:]*$||' | uniq | DEBIAN_FRONTEND=noninteractive xargs apt-get install --reinstall -y

The script runs the same command for the `/usr/share/locale/` directory.

In both cases, no packages ever make it to the `xargs dpkg -S` command, so the script throws an error. However, because the error occurs in a pipe, it is lost, and the script continues despite the `set -e` at the start (which is a bug in itself).

However, with a little experimentation, I found that the original command for `/usr/share/man/` works perfectly for the `/usr/share/doc/` and `/usr/share/locale/` directories.

Why not run the same command for all three directories? You could even wrap it in a function.

However, you could improve performance by combining all three into a single command.

The `unminimize` script notes:

    # Reinstallation takes place in two steps because a single dpkg --verified
    # command generates very long parameter list for "xargs dpkg -S" and may go
    # over ARG_MAX. Since many packages have man pages the second download
    # handles a much smaller amount of packages.

I understand the concern about hitting `ARG_MAX`, but I think a better solution to that problem would be to use `xargs -n` to set a limit on the maximum number of arguments. This change would allow `xargs` to run `dpkg -S` as many times as needed (to avoid hitting `ARG_MAX`).
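As a rough sketch of the `xargs -n` idea, the snippet below uses `echo` as a stand-in for `dpkg -S` so that it runs on any system. The paths and the batch size of 2 are arbitrary choices for illustration and are not taken from the script.

```shell
# Stand-in for `dpkg -S`: echo prints each batch of arguments on its own line.
# The three paths and the batch size of 2 are illustrative only.
batches=$(printf '%s\n' /usr/share/doc/a /usr/share/doc/b /usr/share/doc/c \
    | xargs -n 2 echo)

# Three inputs with at most two arguments per run gives two invocations.
printf '%s\n' "$batches"
```

In the real pipeline the same shape would apply with `dpkg -S` in place of `echo` and a larger batch size.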

Doing it this way would also speed up the script.

There are many packages with missing files in more than one minimized directory, meaning that as it stands, even if the bug were fixed, some packages would be reinstalled twice, maybe even three times.

(I have seen this for myself because I essentially wrote a shorter version of the script for my own use.)
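To illustrate the deduplication point, here is a small sketch that feeds made-up `dpkg -S` style lines through the same `sed` expression the script uses and then through `sort -u`. The package names `foo`, `bar`, and `baz` are hypothetical, and the snippet assumes GNU `sed`, where `\n` in the replacement produces a newline.

```shell
# Hypothetical `dpkg -S` style output for the three directories.
# Each line has the form "pkg[, pkg...]: path"; the package names are made up.
sample='foo, bar: /usr/share/man
bar, baz: /usr/share/doc
foo: /usr/share/locale'

# Split the package list onto separate lines and drop the ": path" suffix
# (same sed expression as in the script), then list each package once.
packages=$(printf '%s\n' "$sample" | sed 's|, |\n|g;s|: [^:]*$||' | sort -u)
printf '%s\n' "$packages"
```

Without the `sort -u` step, `foo` and `bar` would each be passed to `apt-get install --reinstall` twice.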

John Chittum (jchittum) wrote :

Thank you for reporting the issue and giving suggestions in great detail! It is also very timely, as another script maintainer and I were just having conversations about the minimize/unminimize flow.

I've added `livecd-rootfs` package for tracking, as the script lives within those build scripts. We'll take some time to dig into the suggestions and issues and come back with more info.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in livecd-rootfs (Ubuntu):
status: New → Confirmed
Ferenc Wágner (wferi) wrote :

I think the main problem with the unminimize script in jammy and later is that dpkg 1.21 changed the way it reports missing files: instead of reporting a checksum failure (..5......) it explicitly says "missing" since https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=963087 was implemented, breaking the unminimize logic entirely, as demonstrated by /usr/share/doc/libprocps8/NEWS.Debian.gz still being missing at the end of

    $ docker run --rm -it ubuntu:jammy sh -c 'yes | unminimize; dpkg --verify libprocps8'

This affects all packages having anything besides their copyright and changelog under /usr/share/doc but having no man pages nor localization.

The above also explains why unminimize uses different logic for different files: the changelog and copyright files are always kept around, while manuals and localization files are excluded entirely.

The ugly but harmless errors from dpkg -S could be avoided by simply using the --no-run-if-empty option of xargs.
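A quick way to see the difference, assuming GNU `xargs` (as shipped in Ubuntu), is to feed it empty input with and without the option. The snippet uses `echo ran` as a stand-in for `dpkg -S`.

```shell
# With empty input, GNU xargs still runs the command once by default.
without_flag=$(printf '' | xargs echo ran)

# -r is the short form of --no-run-if-empty: the command is skipped entirely.
with_flag=$(printf '' | xargs -r echo ran)

printf 'without flag: [%s]\nwith flag: [%s]\n' "$without_flag" "$with_flag"
```

With `dpkg -S` in place of `echo ran`, the default behaviour is what produces the "needs at least one file name pattern argument" error when no paths reach `xargs`.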

xargs should automatically take care of not exceeding the argument length limit of the system, so I assume the comment mentioning that is stale.

kalvdans (4-launchpad-kalvdans-no-ip-org) wrote :

> However, because the error occurs in a pipe, it is lost, and the script continues despite the `set -e` at the start (which is a bug in itself).

It is not a bug: a POSIX shell is supposed to behave that way. In bash, you can `set -o pipefail` to get the wanted behaviour.
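A minimal demonstration of the difference, run through `bash -c` so that it does not depend on the invoking shell; `false | true` stands in for a pipeline whose first stage fails:

```shell
# Without pipefail, the pipeline's exit status is that of the last command
# (true), so `set -e` does not stop the script and the echo runs.
plain=$(bash -c 'set -e; false | true; echo continued')

# With pipefail, the failing `false` makes the whole pipeline fail, and
# `set -e` aborts before the echo runs. `|| true` keeps this demo going.
strict=$(bash -c 'set -e -o pipefail; false | true; echo continued' || true)

printf 'plain: [%s]\nstrict: [%s]\n' "$plain" "$strict"
```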

Utkarsh Gupta (utkarsh) wrote :

The fix in livecd-rootfs has landed via https://launchpad.net/ubuntu/+source/livecd-rootfs/23.10.30. I've fixed the regex to do the right thing w.r.t the current output dpkg -V emits.

Marking this as done. \o/

Changed in livecd-rootfs (Ubuntu):
status: Confirmed → Fix Released
Philip Roche (philroche) wrote :

I don't think we can mark this as Done yet. Only mantic has been fixed.

Utkarsh Gupta (utkarsh)
Changed in cloud-images:
status: New → Fix Released
Utkarsh Gupta (utkarsh)
Changed in livecd-rootfs (Ubuntu):
assignee: nobody → Utkarsh Gupta (utkarsh)
Changed in livecd-rootfs (Ubuntu Jammy):
assignee: nobody → Utkarsh Gupta (utkarsh)
Changed in livecd-rootfs (Ubuntu Lunar):
assignee: nobody → Utkarsh Gupta (utkarsh)