Attached patch addresses the reference loop in apt_inst.DebFile, ensuring that no longer referencing the object will release it immediately, as opposed to when the cycle GC runs.
Subject: [PATCH] apt_inst.DebFile: Avoid reference cycle with control,data
members
apt_inst.DebFile provides two members `data` and `control` for
easy access to those tarballs. Each of those members stores a
reference to the DebFile as its owner:
v-----------------\
control ----\ | -> deb -|
data ----/ | ^-----------------/
This means that whenever a DebFile is successfully constructed,
and no longer needed, it won't be collected until the GC runs,
which is bad, as the DebFile holds an open FileFd.
Introduce a __FileFd wrapper that holds the FileFd and becomes
the owner of both control and data, and replaces the direct use
of the FileFd in ArArchive/DebFile:
v-----------------------------\
control ----\ \ -> __FileFd <- deb -|
data ----/ / ^-----------------------------/
This avoids the reference cycle, ensuring the memory and file
descriptor are released by the reference counter as soon as
the reference count drops to 0.
A future version should move `apt_inst.__FileFd` to `apt_pkg.FileFd`
and expose all the methods, such that people can make use of FileFd's
extensive compression support.
We have a similar cycle in TagFile that we have yet to address,
the problem there is arguably more frustrating, as the buffer
I believe is stored inside the TagFile, and that's really shared
between the TagSection objects.
This is related to LP: #1899193 and CVE-2020-27351, but an additional
hardening measure - the fix for those bugs was for more direct leaks.
Attached patch addresses the reference loop in apt_inst.DebFile, ensuring that no longer referencing the object will release it immediately, as opposed to when the cycle GC runs.
Subject: [PATCH] apt_inst.DebFile: Avoid reference cycle with control,data
members
apt_inst.DebFile provides two members `data` and `control` for
easy access to those tarballs. Each of those members stores a
reference to the DebFile as its owner:
control ----\ |
data ----/ |
This means that whenever a DebFile is successfully constructed,
and no longer needed, it won't be collected until the GC runs,
which is bad, as the DebFile holds an open FileFd.
Introduce a __FileFd wrapper that holds the FileFd and becomes
the owner of both control and data, and replaces the direct use
of the FileFd in ArArchive/DebFile:
control ----\ \
data ----/ /
This avoids the reference cycle, ensuring the memory and file
descriptor are released by the reference counter as soon as
the reference count drops to 0.
A future version should move `apt_inst.__FileFd` to `apt_pkg.FileFd`
and expose all the methods, such that people can make use of FileFd's
extensive compression support.
We have a similar cycle in TagFile that we have yet to address,
the problem there is arguably more frustrating, as the buffer
I believe is stored inside the TagFile, and that's really shared
between the TagSection objects.
This is related to LP: #1899193 and CVE-2020-27351, but an additional
hardening measure - the fix for those bugs was for more direct leaks.