Legacy encodings ID3 tags support
Bug #223547 reported by
Jiahua Huang
This bug report is a duplicate of:
Bug #135985: media file content and filename encoding is not consistient.
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Exaile | Confirmed | Medium | Unassigned |
Bug Description
Currently, ID3v1 tags are assumed to be encoded in UTF-8 (or Latin-1), but
many MP3s use a legacy charset such as gb18030 (or big5, euc-jp, euc-kr), and their tags are unreadable unless that charset is used,
especially for MP3s obtained from P2P programs.
So Exaile needs to guess the encoding and convert ID3 tags from legacy encodings to Unicode.
I am also suffering from this bug, and strongly suggest the patch provided by Jiahua Huang be adopted.
Most MP3 files (about 90%+, without exaggeration) in mainland China still have tags encoded in gb18030, so this is a critical bug affecting all Chinese users. We Chinese users have been trying hard to find a music player for Linux that supports legacy encodings, but have failed. This is even the most important reason why my friends say Linux has poor support for the Chinese language.
Besides, Huang's patch doesn't affect correctly encoded UTF-8 tags; it only adds support for legacy encodings. The patch also works for Exaile 0.2.99 if you add the code block to xl/metadata/_id3.py before "class ID3Format", like:
...
from mutagen import id3
_unicode = unicode
def unicode(string, encoding='utf8', errors='strict'):
    try:
        string = string.decode('utf8').encode('iso8859-1')
    except:
        return _unicode(string)
    for enc in ('utf8', 'gb2312', 'big5', 'gb18030', 'big5hkscs', 'euc-jp', 'euc_kr', 'cp1251', 'utf16'):
        try:
            return string.decode(enc)
        except:
            pass
    return string
class ID3Format(BaseFormat):
...
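For anyone reading this on Python 3, where the `unicode` builtin and `str.decode` no longer exist, the same idea can be sketched roughly as below. This is not part of Huang's patch or of Exaile: the helper name `fix_legacy_tag` and the shortened candidate list are made up for illustration, and it assumes the tag text has already been decoded as Latin-1 by the tag reader.

```python
# Candidate encodings, tried in order (a shortened, illustrative subset of the
# list used in the patch above).
CANDIDATE_ENCODINGS = ("utf-8", "gb18030", "big5", "euc_jp", "euc_kr", "cp1251")


def fix_legacy_tag(text, encodings=CANDIDATE_ENCODINGS):
    """Hypothetical helper: undo a wrong Latin-1 decode of a legacy-encoded tag."""
    try:
        # Latin-1 maps every byte to one character, so this recovers the raw bytes.
        raw = text.encode("latin-1")
    except UnicodeEncodeError:
        # Characters outside Latin-1 mean the text was not mis-decoded this way.
        return text
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # No candidate matched, so keep the text as it was.
    return text


# A gb18030 tag that a reader wrongly decoded as Latin-1:
garbled = "中文".encode("gb18030").decode("latin-1")
print(fix_legacy_tag(garbled))  # → 中文
```

Like the patch, this is only a guess based on which decoder succeeds first, so the order of the candidates matters, and genuine Latin-1 text containing accented letters can be misread as one of the later encodings.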