dash interpreter doesn't handle some Unicode characters correctly
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
dash (Ubuntu) | Confirmed | Undecided | Unassigned | |
Bug Description
Binary package hint: dash
The bug occurs, in particular, with the Cyrillic letters "ш" and "с".
To reproduce, open a terminal, cd to /tmp (or anywhere else; the location doesn't matter), and type:
$ dash -c "name=прстуфхцч
$ ls
One would expect a file named прстуфхцчшщ to be created, with contents matching its name. Instead, ls shows a file named пр?туфхцч?щ. The file itself contains the correct string прстуфхцчшщ.
The wrong filename hexdumps to the following:
0000000 bfd0 80d1 d1d1 d182 d183 d184 d185 d186
0000010 d187 89d1 000a
On the other hand, the correct filename would hexdump to:
0000000 bfd0 80d1 81d1 82d1 83d1 84d1 85d1 86d1
0000010 87d1 88d1 89d1 000a
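The difference between the two dumps can be made explicit with a short script. This is a minimal sketch, assuming the dumps come from hexdump's default format (16-bit little-endian words, so each word shows its two bytes swapped). The helper names `words_to_bytes` and `dropped_bytes` are hypothetical and chosen only for illustration.

```python
# Hexdump words copied from the report, with the offset column removed.
WRONG = "bfd0 80d1 d1d1 d182 d183 d184 d185 d186 d187 89d1 000a"
RIGHT = "bfd0 80d1 81d1 82d1 83d1 84d1 85d1 86d1 87d1 88d1 89d1 000a"


def words_to_bytes(dump):
    """Convert 16-bit little-endian hexdump words back to the original bytes."""
    raw = b"".join(int(word, 16).to_bytes(2, "little") for word in dump.split())
    # Strip the zero padding byte and the trailing newline present in both dumps.
    return raw.rstrip(b"\x00").rstrip(b"\n")


def dropped_bytes(right, wrong):
    """List (offset, byte) pairs present in `right` but missing from `wrong`.

    Assumption: `wrong` equals `right` with some bytes deleted, so a single
    left-to-right pass is enough.
    """
    dropped = []
    j = 0
    for i, byte in enumerate(right):
        if j < len(wrong) and wrong[j] == byte:
            j += 1
        else:
            dropped.append((i, byte))
    return dropped


for offset, byte in dropped_bytes(words_to_bytes(RIGHT), words_to_bytes(WRONG)):
    print(f"offset {offset}: 0x{byte:02x}")
```

Under these assumptions the script reports exactly two missing bytes, 0x81 and 0x88. They are the second bytes of the UTF-8 encodings of "с" (d1 81) and "ш" (d1 88); the leading d1 byte of each letter survives.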
If bash is used instead of dash, the problem is not present.
The bug is present on both Jaunty x86 and Jaunty x86_64.
The bug is significant because dash is the default "sh" interpreter on these systems and is used by the system("...") function. In particular, I found the bug while debugging my automatic Python file-converting script, which failed on files with Cyrillic names containing "с" and "ш".
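For scripts like the one described above, one possible workaround is to avoid going through "sh" at all. The sketch below assumes Python 3 on a POSIX system. It passes the arguments to `subprocess.run` as a list, so no shell ever parses the filename. The `echo_argument` helper and the child command are hypothetical stand-ins for the real converter invocation, not part of the original script.

```python
import subprocess
import sys


def echo_argument(name):
    """Pass `name` to a child process without a shell; return the bytes it received."""
    # Hypothetical child: it writes its first argument back, byte for byte.
    child_code = "import os, sys; sys.stdout.buffer.write(os.fsencode(sys.argv[1]))"
    result = subprocess.run(
        # A list of arguments means no "sh -c" is involved.
        # The name is encoded explicitly so the result does not depend on the locale.
        [sys.executable, "-c", child_code, name.encode("utf-8")],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Calling `echo_argument("прстуфхцчшщ")` should return the UTF-8 bytes of the name unchanged, because the argument reaches the child directly through the exec call rather than through dash.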
-------
ivze@ubuntu-
Description: Ubuntu 9.04
Release: 9.04
Dash package version: 0.5.4-12ubuntu2
Further tests revealed that the same issue occurs with the capital letter "Ё".
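Comparing the two hexdumps in the report, the bytes missing from the wrong filename are 0x81 and 0x88. The sketch below assumes the names are UTF-8 encoded (consistent with the dumps) and checks which of the letters mentioned contain one of these bytes. `SUSPECT_BYTES` and `is_affected` are hypothetical names. The script only illustrates a pattern in the reported data and does not identify the cause inside dash.

```python
import unicodedata

# Bytes that are missing from the wrong filename in the hexdumps.
SUSPECT_BYTES = {0x81, 0x88}


def is_affected(letter):
    """Return True if the UTF-8 encoding of `letter` contains a suspect byte."""
    return any(byte in SUSPECT_BYTES for byte in letter.encode("utf-8"))


# Letters from the test string in the report, plus the capital letter IO.
for letter in "прстуфхцчшщЁ":
    status = "affected" if is_affected(letter) else "ok"
    print(unicodedata.name(letter), letter.encode("utf-8").hex(), status)
```

For these letters the check flags only "с" (d1 81), "ш" (d1 88) and "Ё" (d0 81), which matches the three letters reported as broken.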