Coding system bug - Julian Bradfield

Bug #716041 reported by Uday Reddy
Affects: VM
Status: Fix Released
Importance: High
Assigned to: Uday Reddy
Milestone: none listed

Bug Description

Julian Bradfield reports (vm.info, 2011-1-1)

OK, here's the first real bug. The symptoms are corruption of data
when converting mime objects to other types. It's something I
encountered a while ago in old vm, and fixed in mine, but I'm not sure
what design of fix is correct, so I'll describe it rather than just
sending in a patch.

When a mime object is type-converted, it's fed to an external
program. The write to that program is (following an earlier partial
discovery of this problem by me) done in binary.

However, the read *from* the program is done in the default coding
system, probably for most people utf-8. That seems right for text, but
not if you're converting to another binary object.
However...it's not even right for text, because the output of the program
is decoded again by vm later on when it parses the new mime part
that's been created by conversion.
So the output should be read in binary, regardless of what it is.
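
The point above can be illustrated outside Emacs. What follows is a minimal Python sketch, not VM's Emacs Lisp code: the helper name `run_converter` is hypothetical, and a small child process that echoes its input stands in for a real converter program. It suggests that when both the write to the program and the read from it are done in binary, arbitrary bytes survive the round trip unchanged, so any decoding can be left to a single later step.

```python
import subprocess
import sys

# Stand-in for an external converter: copies stdin to stdout byte for byte.
ECHO_CHILD = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"


def run_converter(data: bytes) -> bytes:
    """Feed raw bytes to the child process and return its raw output bytes."""
    result = subprocess.run(
        [sys.executable, "-c", ECHO_CHILD],
        input=data,              # written to the child's stdin as bytes
        stdout=subprocess.PIPE,  # captured as bytes: no text=True, no encoding
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Mixed payload: bytes that are not valid UTF-8, followed by UTF-8 text.
    payload = b"\xff\xd8\xff\xe0 caf\xc3\xa9"
    print(run_converter(payload) == payload)
```

Because nothing is decoded at the pipe, the same helper would serve for text and binary output alike; the choice of charset is deferred to whoever parses the result.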

The question then is what to do in the explicit decoding; or rather
what charset the type-conversion should assign to the new mime part,
so that decoding is done correctly. At present, no charset is
assigned, so the decoding is done in binary; which is wrong if what
the conversion program output was utf-8 text, and so your
type-converted stuff gets corrupted. (In my case, converting msword to
text.)
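
The corruption described above is easy to reproduce in isolation. As a hedged illustration in Python (not VM code), Latin-1 is used here as a stand-in for a "binary" decoding, since it maps each byte to exactly one character; the function names are hypothetical.

```python
def decode_as_binary(raw: bytes) -> str:
    """Treat each byte as one character, as a byte-for-byte decoding would."""
    return raw.decode("latin-1")


def decode_as_utf8(raw: bytes) -> str:
    """Decode using the charset the converter actually wrote."""
    return raw.decode("utf-8")


# Bytes a converter running in a UTF-8 locale would emit for "caf\u00e9".
converter_output = "caf\u00e9".encode("utf-8")

if __name__ == "__main__":
    # ascii() keeps the demo printable on any terminal encoding.
    print(ascii(decode_as_binary(converter_output)))  # two characters for one
    print(ascii(decode_as_utf8(converter_output)))    # the intended text
```

The accented letter occupies two bytes in UTF-8, so a byte-for-byte decoding yields two unrelated characters in its place, which is the kind of damage seen in the converted text.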

My solution to this was to allow the coding-system to be specified
explicitly as an extra element of the elements of
vm-mime-type-converter-alist.

However, I'm not sure that's correct. Arguably, the coding system
should simply be the default for text output types, and binary for
others, because any program outputting text is going to pick up the
Unix locale, and any program not outputting text isn't.

If that's agreed, I'll do that and send along a patch.
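
The rule proposed above (the locale default for text output types, binary for everything else) could look roughly like this. It is a Python sketch of the decision only; `coding_for_output` and `decode_output` are hypothetical names, and the patch that was eventually applied to VM is not shown in this report.

```python
import locale
from typing import Optional, Union


def coding_for_output(mime_type: str) -> Optional[str]:
    """Return the encoding to decode converter output with, or None for binary."""
    major = mime_type.split("/", 1)[0].strip().lower()
    if major == "text":
        # Assumption: a text-producing program follows the user's locale.
        return locale.getpreferredencoding(False)
    return None


def decode_output(raw: bytes, mime_type: str) -> Union[str, bytes]:
    """Decode text output once; hand back other output as untouched bytes."""
    coding = coding_for_output(mime_type)
    if coding is None:
        return raw
    return raw.decode(coding, errors="replace")
```

The trade-off is the one raised in the report: this needs no extra per-converter configuration, but it relies on text-producing programs honouring the locale.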

Uday Reddy (reddyuday) wrote :

> However, I'm not sure that's correct. Arguably, the coding system
> should simply be the default for text output types, and binary for
> others, because any program outputting text is going to pick up the
> Unix locale, and any program not outputting text isn't.
>
> If that's agreed, I'll do that and send along a patch.

This solution sounds right to me. Please do send me a patch whenever
you are ready.

Uday Reddy (reddyuday) wrote :

Julian sends a patch:

I think this patch is probably right, but it should be checked on
FSFmacs.

Uday Reddy (reddyuday) wrote :

Applied the patch to revision 1041 of trunk, and revision 773 of the 8.1.x branch.

Changed in vm:
status: Triaged → In Progress
Uday Reddy (reddyuday)
Changed in vm:
status: In Progress → Fix Committed
Uday Reddy (reddyuday) wrote :

Fixed in revision 1051 of trunk and revision 778 of the 8.1.x branch.

Bug 717505 points to other similar problems that might continue to exist.

Uday Reddy (reddyuday)
tags: added: 7.19
Uday Reddy (reddyuday)
Changed in vm:
status: Fix Committed → Fix Released
Uday Reddy (reddyuday)
no longer affects: vm/8.1.x