ISourcePackagePublishingHistory.changes_file_text is a Bytes field exported as text, which causes encoding errors

Bug #351725 reported by Leonard Richardson
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: Julian Edwards

Bug Description

Changes files are stored in the librarian, which means they could conceivably be binary files. cprov says, "changesfiles are always text, but we have a hard time figuring out which encoding it uses during upload processing." So changes files might be stored in the librarian with an arbitrarily strange encoding.

ISourcePackagePublishingHistory.changes_file_text is exported as a Text field. The lazr.restful marshaller for Text assumes you can call unicode() on a string and get Unicode. This works for all other Text fields because their data comes from text fields in the database. But if a changes file uses an encoding that's not compatible with UTF-8, unicode() will fail, and you'll get an OOPS like OOPS-1184EA10.
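
The failure mode can be reproduced in isolation. The following is a minimal sketch in modern Python, not lazr.restful's actual marshaller: naive_marshall is a hypothetical stand-in that decodes bytes as UTF-8, and the sample changes-file line is invented for illustration.

```python
# Hypothetical changes-file content: contains "é" encoded as Latin-1, not UTF-8.
latin1_bytes = "Maintainer: Café Owner".encode("latin-1")


def naive_marshall(value: bytes) -> str:
    """Stand-in for a Text marshaller that assumes UTF-8-compatible input."""
    return value.decode("utf-8")


def fails_to_decode(value: bytes) -> bool:
    """Return True if the naive marshaller raises on this input."""
    try:
        naive_marshall(value)
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    print(fails_to_decode(b"Maintainer: Plain ASCII"))
    print(fails_to_decode(latin1_bytes))
```

Plain ASCII content decodes without trouble, which is why the problem only shows up for the subset of uploads that contain non-ASCII bytes in some other encoding.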

There are two solutions. One is to coerce uploaded changes files to UTF-8 before storing them in the librarian. The other is to define a BytesPublishableAsText field (subclass of Bytes, which is what we use for librarian files) and do the coercion in the marshaller. In either case an encoding sniffing library like UnicodeDammit (part of Beautiful Soup) would be helpful.
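
Either option comes down to the same coercion step, applied at a different point. The sketch below uses only the standard library and a fixed list of candidate encodings rather than a sniffing library such as UnicodeDammit, so it is a simpler substitute for what is suggested above. The names coerce_to_text and coerce_to_utf8 and the candidate list are assumptions made for illustration, not Launchpad or lazr.restful code.

```python
def coerce_to_text(raw: bytes, candidates=("utf-8", "windows-1252")) -> str:
    """Decode bytes by trying each candidate encoding in order.

    This is what a marshaller for a hypothetical BytesPublishableAsText
    field could do before handing the value to the web service.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a code point, so this last resort never
    # raises, although the result may not be what the uploader intended.
    return raw.decode("latin-1")


def coerce_to_utf8(raw: bytes) -> bytes:
    """Normalise bytes to UTF-8, e.g. before storing an uploaded file."""
    return coerce_to_text(raw).encode("utf-8")


if __name__ == "__main__":
    sample = "Changed-By: José".encode("latin-1")
    print(coerce_to_text(sample))
    print(coerce_to_utf8(sample))
```

Coercing at upload time keeps the stored data uniform, while coercing in the marshaller leaves the original bytes untouched in the librarian. A fixed candidate list will guess wrong for some inputs, which is the case a real sniffing library is meant to handle better.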

Tags: lp-soyuz
Changed in soyuz:
assignee: nobody → julian-edwards
importance: Undecided → High
milestone: none → 2.2.4
status: New → Triaged
Celso Providelo (cprov) wrote :

'source_package_publishing.changes_file_text' was temporarily removed from the API for 2.2.3, which solves the issue for this cycle (the code is available for testing on staging.launchpad.net).

We will fix the encoding problem and re-export it in 2.2.4.

Changed in soyuz:
status: Triaged → In Progress
Diogo Matsubara (matsubara) wrote : Bug fixed by a commit

Fixed in devel r8247.

Changed in soyuz:
status: In Progress → Fix Committed
Changed in soyuz:
status: Fix Committed → Fix Released