Comment 2 for bug 1118469

Jeremy Stanley (fungi) wrote :

I think it wouldn't be too hard to push this a little further. We already agree that we don't trust workers which run code from arbitrary projects, and building an sdist tarball requires running such code. By extension, we should not implicitly trust the tarballs those workers produce. I think an ideal solution would be:

0. untrusted worker builds the sdist tarball as it does currently, based on the usual triggers (git tag, et cetera)

1. trusted worker checks out the associated git tag, retrieves and untars the tarball, copies the .git directory from the checkout into the directory where the tarball was unpacked, and confirms that git reports no changes (note this means we may need to list a couple of sdist-generated files in .gitignore which we currently do not, but that seems like a good idea anyway)

2. if the validation succeeds, use gpg to generate a detached signature of the tarball with a locally-installed key specific to that trusted worker

3. upload the detached signature into the same directory the tarball is served from, attesting to its validity
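
As a rough illustration of steps 1 and 2, a vetting script might look something like the Python sketch below. Everything named here is hypothetical (the function names, the key ID parameter, the assumption that the sdist unpacks into a single top-level directory, and the choice of an armored .asc signature); it only shows the shape of the check, not a finished tool.

```python
import shutil
import subprocess
import tarfile
from pathlib import Path


def unpack_sdist(tarball: Path, dest: Path) -> Path:
    """Unpack the tarball into dest and return its single top-level directory."""
    with tarfile.open(tarball) as tar:
        # The tarball comes from an untrusted worker, so prefer the safer
        # "data" extraction filter where this Python version provides it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    entries = list(dest.iterdir())
    # Assumption: an sdist unpacks to exactly one "name-version" directory.
    if len(entries) != 1 or not entries[0].is_dir():
        raise ValueError("expected exactly one top-level directory")
    return entries[0]


def status_is_clean(porcelain_output: str) -> bool:
    """True if `git status --porcelain` printed nothing (no changes)."""
    return porcelain_output.strip() == ""


def matches_tag(checkout: Path, tarball: Path, scratch: Path) -> bool:
    """Step 1: graft .git from the tag checkout onto the unpacked tarball."""
    unpacked = unpack_sdist(tarball, scratch)
    shutil.copytree(checkout / ".git", unpacked / ".git")
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=unpacked, check=True, capture_output=True, text=True,
    )
    return status_is_clean(result.stdout)


def detach_sign_command(tarball: Path, key_id: str) -> list[str]:
    """Step 2: gpg command writing an armored signature to <tarball>.asc."""
    return [
        "gpg", "--batch", "--yes", "--local-user", key_id,
        "--armor", "--output", f"{tarball}.asc",
        "--detach-sign", str(tarball),
    ]
```

The trusted worker would only run the command from detach_sign_command() (for example via subprocess.run(..., check=True)) when matches_tag() returns True. Since git status --porcelain also lists untracked files, any file the sdist build generates has to be covered by .gitignore for the check to pass.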

This mechanism provides additional assurance beyond a simple checksum file. If someone is going to compromise a tarball, they're most likely to do it by MitM'ing the download (in which case they can just modify the checksum file in flight as well) or by altering it at rest where it's published (and can again do the same to the checksum published with it). Signing the file not only guards against these (because the signature cannot be forged without the trusted worker's private key), but also against compromise of the aforementioned untrusted worker which built the tarball and generated the checksum.

If we still want to provide checksums for convenience (inclusion in release announcements and the like), we can do that in step 2 and sign the checksum list with the same key while we're at it, then upload it in step 3 along with the signature of the tarball. This provides similar assurances that a checksum has not been falsified.
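
For the checksum list, something along these lines would probably do. This is only a sketch: the choice of SHA-256, the sha256sum-style line format, and the function names are my assumptions, not anything we have agreed on.

```python
import hashlib
from pathlib import Path


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_list(paths) -> str:
    """One '<hex digest>  <file name>' line per file, sorted by path."""
    return "".join(f"{sha256_hex(p)}  {p.name}\n" for p in sorted(paths))
```

The resulting text can be written to a file and signed with the same key in step 2, so the list is covered by a signature just as the tarball is.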

To expand on this, if we want to protect against compromise of the trusted worker, the script it uses to vet tarballs can be published so that anyone can easily run it independently. Further, we could create a release-sigs branch in each project where individuals are expected to upload detached signatures of tarballs they've vetted, and then code review those such that CI automation can run check/gate tests to confirm each signature does actually verify the tarball correctly, uploading it to be served alongside that tarball once approved in Gerrit. Going a step further, our upload jobs (for example PyPI uploads) could be triggered not off the signed git tag, but instead off an approved signature upload made with a key in a predetermined project-specific keyring, and then include that signature in the upload (also revalidating it at upload time for sanity).
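
The check/gate test for an uploaded signature could be as small as the following sketch, which assumes the project keyring is available to the job as a file and uses gpgv so that only keys in that keyring are accepted. The function names and paths are hypothetical.

```python
import subprocess
from pathlib import Path


def verify_command(keyring: Path, signature: Path, tarball: Path) -> list[str]:
    """gpgv invocation checking a detached signature against one keyring."""
    # Pass the keyring as a path containing a slash; gpgv looks up a bare
    # file name in its home directory instead of the current directory.
    return ["gpgv", "--keyring", str(keyring), str(signature), str(tarball)]


def signature_verifies(keyring: Path, signature: Path, tarball: Path) -> bool:
    """True only if gpgv exits 0, i.e. the signature is good and from the keyring."""
    result = subprocess.run(
        verify_command(keyring, signature, tarball),
        capture_output=True,
    )
    return result.returncode == 0
```

The same function could be reused at upload time for the revalidation step.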