Comment 22 for bug 42264

Alexander Schulze (schulze) wrote : Re: locale dependant segfault for dd

As Laurent already mentioned, the problem disappears when a "#, c-format" line is added to the PO file in front of the problematic entry. I think this is the key to understanding what is really going wrong here.

It seems that running msgunfmt followed by msgfmt does not necessarily yield the same MO file again, as one might expect. If C99 format macros like PRIuMAX are used, msgfmt creates an invalid MO (see below) unless "c-format" is specified explicitly, and this leads to the crash. In my opinion, either msgunfmt should be modified to output "#, c-format" lines for entries that use system-dependent macros like PRIuMAX (which should be a rather simple fix), or (better, IMO, as this is the real cause of the problem) msgfmt's heuristic to detect C printf strings should be modified so that it does not require "#, c-format" lines in these cases. As I understand the manual, "#, c-format" should only indicate that the programs should do *additional checks*; it should not *change the behaviour* that dramatically if the strings are valid printf formats.

To me, this looks like a bug in the gettext package, and it seems to be still unfixed in the newest available version. I recommend forwarding the bug report upstream.

Note that this problem may also affect other packages' MO files and may lead to crashes in many other programs. Perhaps the severity of the bug should be changed, especially as there is a simple workaround that works quite well here: just do a msgunfmt-msgfmt cycle on each MO file, but insert a simple (sed?) script in between that prefixes all records with "#, c-format" lines. So far I have seen no negative side effects, and the problems are gone. This workaround could be applied to all MO files in all language packs until a fixed gettext becomes available, and requires only a minimal amount of resources (CPU time etc.).
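The filter step of this workaround could be sketched in Python roughly as follows. This is only an illustration, not the script I actually used: the function name add_c_format_flags is made up, it assumes that msgunfmt separates entries by blank lines, and it leaves the header entry (the one with the empty msgid) unflagged, which is my own choice.

```python
import re

FLAG = "#, c-format"


def _is_header(lines):
    """True if the entry is the PO header: msgid "" with no continuation lines."""
    for i, line in enumerate(lines):
        if line.startswith("msgid "):
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            return line.strip() == 'msgid ""' and not nxt.startswith('"')
    return False


def add_c_format_flags(po_text):
    """Prefix every non-header entry of PO text with a '#, c-format' line.

    Assumption: entries are separated by blank lines, as in msgunfmt output.
    Entries that already carry a c-format (or no-c-format) flag are left alone.
    """
    entries = re.split(r"\n[ \t]*\n", po_text.strip("\n"))
    result = []
    for entry in entries:
        lines = entry.split("\n")
        has_msgid = any(l.startswith("msgid ") for l in lines)
        flagged = any(l.startswith("#,") and "c-format" in l for l in lines)
        if has_msgid and not flagged and not _is_header(lines):
            # Flags must precede the first keyword line (msgctxt or msgid).
            idx = next(
                i for i, l in enumerate(lines)
                if l.startswith(("msgctxt ", "msgid "))
            )
            lines.insert(idx, FLAG)
        result.append("\n".join(lines))
    return "\n\n".join(result) + "\n"
```

The function would sit between the two tools: write the msgunfmt output to a PO file, pass its text through add_c_format_flags, and feed the result to msgfmt.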

Comparing Ubuntu's MO files with those of Debian stable (or SuSE), I find that the binary content of the MOs in these distributions matches what I get when running msgfmt on the POs fixed with "#, c-format". That is, %<PRIuMAX> does not appear inside the strings, at least not in the translations; it is replaced by a single % instead, and PRIuMAX appears as a system-dependent string of its own, which is not the case for Ubuntu's MOs. These distributions seem to generate the POs with xgettext or similar, so that all c-format hints are available and used. I'm not quite sure what the Ubuntu translation teams use, but that may be the difference that explains why this hits only Ubuntu (as far as I know and could test here).
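A rough way to tell the two kinds of MO files apart might look like the sketch below. It is a heuristic only, and the function name inspect_mo is invented for illustration. It relies on two assumptions: that the GNU MO header starts with the magic number 0x950412de followed by a revision word whose low 16 bits are the minor revision (which, as far as I understand the format, is 1 when system-dependent segments are present), and that a literal "%<PRI" inside the file indicates the unexpanded form described above.

```python
import struct

MO_MAGIC = 0x950412DE  # GNU MO magic number


def inspect_mo(data):
    """Return the minor revision and whether a literal '%<PRI' occurs.

    Heuristic sketch: a minor revision of 0 together with a literal
    '%<PRI...>' in the file would suggest the unexpanded form.
    """
    if len(data) < 8:
        raise ValueError("too short to be an MO file")
    # The magic number tells us the byte order the file was written in.
    if struct.unpack("<I", data[:4])[0] == MO_MAGIC:
        order = "<"
    elif struct.unpack(">I", data[:4])[0] == MO_MAGIC:
        order = ">"
    else:
        raise ValueError("not a GNU MO file")
    revision = struct.unpack(order + "I", data[4:8])[0]
    return {
        "minor_revision": revision & 0xFFFF,
        "literal_pri_macro": b"%<PRI" in data,
    }
```

Reading a file in binary mode and passing its bytes to inspect_mo would be enough for a quick survey of a language pack.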