soyuz upload system pays attention to ftp sessions
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
txpkgupload | Triaged | Low | Unassigned |
Bug Description
affects /products/soyuz
done
I am informed that soyuz, when handling incoming uploads, assumes that
all of the components of an upload (the .changes and the files listed
in it) will appear in a single FTP session. This is a violation of
the implied semantics of FTP, and can cause practical problems.
For example, if you use dupload but your upload fails (e.g. due to
network problems) after some files have transferred successfully,
dupload will record success for those files but soyuz will delete
them. If you then rerun dupload, it will upload only the files which
were not transferred the first time, while the already-uploaded files
will have been deleted by soyuz, so the upload can never complete.
As another example, you might reasonably upload the different parts of
an upload from different systems to save bandwidth on small links.
(Often the .orig.tar.gz is very large.)
As a third example, you might be behind an application relay (web
proxy) which starts a new FTP connection for each transfer. That's
obviously not ideal and is slow and wasteful but it's not demonstrably
wrong.
The correct solution is for soyuz to keep files hanging around rather
than giving each new upload connection a new blank directory. Clashes
between files of different names can be resolved in favour of the most
recent, if each target distribution or namespace has a separate upload
directory. Races can be avoided by careful programming.
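A minimal sketch of the behaviour proposed above (all names here are hypothetical, not soyuz's actual upload-handling code): incoming files land in a persistent per-distribution directory, a clash with an existing file of the same name is resolved in favour of the most recent upload, and processing starts only once every file named in the .changes is present, however many FTP sessions that takes.

```python
import os
import shutil


def accept_file(upload_root, distro, name, tmp_path):
    """Store an uploaded file in the distro's persistent upload directory.

    A clash with an existing file of the same name is resolved in
    favour of the most recent: the new file simply replaces the old.
    """
    target_dir = os.path.join(upload_root, distro)
    os.makedirs(target_dir, exist_ok=True)
    shutil.move(tmp_path, os.path.join(target_dir, name))


def upload_complete(upload_root, distro, changes_files):
    """True once every file listed in the .changes has arrived,
    regardless of how many sessions delivered them."""
    target_dir = os.path.join(upload_root, distro)
    return all(os.path.exists(os.path.join(target_dir, f))
               for f in changes_files)
```

With this layout, a dupload rerun that sends only the missing files completes the upload instead of finding an empty directory.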
Ian.
Changed in soyuz:
status: Needs Info → Confirmed
Changed in soyuz:
assignee: Celso Providelo (cprov) → nobody
tags: added: poppy
affects: launchpad → txpkgupload
tags: removed: poppy
information type: Public → Private
description: updated
information type: Private → Public
I agree with some of your points, and we do have a plan to rearrange the directory structure on the server to allow 'lazy' uploads and also to allow 'Personal Package Archives' uploads to be landed. Something like:
{{{
/distros/<distro_name>/
/people/<person_name>/distros/<distro_name>/
}}}
Each directory beyond those paths will be processed contextually.
Following your suggestion, I think we need to support a single control file inside the upload directory which controls its processing; of course, this needs support on the 'dput' side. It would be something like:
{{{
/distros/ubuntu/<upload_directory_name>/<control_file> containing:
process = False
}}}
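A sketch of how the upload processor might honour such a control file (the filename, the `key = value` format, and the `process` flag are assumptions, not the actual txpkgupload implementation): a directory whose control file says `process = False` is skipped until the client flips the flag.

```python
import os


def should_process(upload_dir, control_name=".control"):
    """Return True unless the directory's control file says otherwise.

    The assumed format is simple 'key = value' lines; a missing
    control file means the upload is ready for processing.
    """
    path = os.path.join(upload_dir, control_name)
    if not os.path.exists(path):
        return True
    settings = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and "=" in line:
                key, _, value = line.partition("=")
                settings[key.strip()] = value.strip()
    # Process unless the client has explicitly deferred it.
    return settings.get("process", "True").lower() != "false"
```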
I don't know how to define a reliable way to name the upload directory, since the name will be needed on the 'dput' side to continue the upload later.
It also needs a way to limit abuse from accumulating large numbers of locked upload directories, perhaps by removing directories older than some age (one day would be plenty, IMO).
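The age-based cleanup could be as simple as the following sketch (the one-day threshold and directory layout are assumptions): any upload directory untouched for longer than the cutoff is removed, so abandoned or abusive half-finished uploads cannot pile up forever.

```python
import os
import shutil
import time


def prune_stale_uploads(upload_root, max_age_seconds=24 * 3600):
    """Remove upload directories untouched for longer than max_age_seconds."""
    now = time.time()
    removed = []
    for name in sorted(os.listdir(upload_root)):
        path = os.path.join(upload_root, name)
        if not os.path.isdir(path):
            continue
        # A directory's mtime updates whenever a file is added to it,
        # so it is a cheap proxy for "last upload activity".
        if now - os.path.getmtime(path) > max_age_seconds:
            shutil.rmtree(path)
            removed.append(name)
    return removed
```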
Anyway, the current prototype has this identified problem; let's discuss the available mid-term solutions.