You are correct. It is beyond my reach to make that assumption for everyone else
on the planet. That's why I restricted myself to doing so in a Fedora Core RFE,
in Fedora bugzilla.
In the context of _Fedora_ it's perfectly reasonable to label those who refuse
to use UTF-8 as Luddites. You just have to look at the quality of the
alternative 'solution' which was proposed -- hacking all the RPM formats from
specfile through to the database to tag data in random formats instead of just
storing it in a consistent encoding in the first place.
Since you persist in trolling the Fedora bugzilla and talking about non-Fedora
issues, I suppose I might as well capitulate and discuss it...
There's no excuse for avoiding UTF-8 in RPM internals, even outside the context
of Fedora. That would really be pointless -- there's certainly no need to
'extend' its file formats when we can just store data in UTF-8, which can
represent the older encodings.
We can quite happily fix rpmq to convert from UTF-8 to the current locale in its
output, and fix rpmbuild to convert _to_ UTF-8 from the current locale when
reading the specfile. Although we certainly wouldn't want the latter in Fedora
-- if I check out the current libxml2/devel branch from CVS and attempt to build
it, for example, it should _fail_. It certainly shouldn't use _my_ locale (and
it'd fail anyway because of course my locale is UTF-8).
You'd need a way to handle existing RPM databases, which may contain random data
in unknown encodings. Probably an 'rpm --rebuilddb --oldcharset=FOO' on RPM
upgrade? This isn't a new problem _anyway_ since an existing RPM database
without either a consistent charset or charset tagging is just line noise.
And of course you might have to call it 'RPM-CHARSET' instead of 'UTF-8' to
appease those who have religious objections to UTF-8.
You are correct. It is beyond my reach to make that assumption for everyone else
on the planet. That's why I restricted myself to doing so in a Fedora Core RFE,
in Fedora bugzilla.
In the context of _Fedora_ it's perfectly reasonable to label those who refuse
to use UTF-8 as Luddites. You just have to look at the quality of the
alternative 'solution' which was proposed -- hacking all the RPM formats from
specfile through to the database to tag data in random formats instead of just
storing it in a consistent encoding in the first place.
Since you persist in trolling the Fedora bugzilla and talking about non-Fedora
issues, I suppose I might as well capitulate and discuss it...
There's no excuse for avoiding UTF-8 in RPM internals, even outside the context
of Fedora. That would really be pointless -- there's certainly no need to
'extend' its file formats when we can just store data in UTF-8, which can
represent the older encodings.
We can quite happily fix rpmq to convert from UTF-8 to the current locale in its
output, and fix rpmbuild to convert _to_ UTF-8 from the current locale when
reading the specfile. Although we certainly wouldn't want the latter in Fedora
-- if I check out the current libxml2/devel branch from CVS and attempt to build
it, for example, it should _fail_. It certainly shouldn't use _my_ locale (and
it'd fail anyway because of course my locale is UTF-8).
You'd need a way to handle existing RPM databases, which may contain random data
in unknown encodings. Probably an 'rpm --rebuilddb --oldcharset=FOO' on RPM
upgrade? This isn't a new problem _anyway_ since an existing RPM database
without either a consistent charset or charset tagging is just line noise.
And of course you might have to call it 'RPM-CHARSET' instead of 'UTF-8' to
appease those who have religious objections to UTF-8.