I've been trying a couple of possibilities. I think the changes to gtxml are more or less ready to be committed, but I'd like your opinions on whether these are real errors or not, and on whether the behaviour should be modified a bit.

Okay, there are two options:

Either we check, for each tag in the msgstr, whether that tag is included in the corresponding msgid. This is somewhat strict and will complain if someone wants to use a tag in a case where it does not appear in the English version.

Or we check each tag against a collection extracted from all msgids.

The two checks produce vastly different amounts of output (as tested on the Spanish documentation). The strict one yields 2000 lines, while the other one yields 100 lines. Here are the two files:

http://www.student.dtu.dk/~ashj/opendir/gtxml-output.txt
http://www.student.dtu.dk/~ashj/opendir/gtxml-output-nodatabase.txt

Some examples:

./es/ghex-help.master.es.po, line 1339: Unrecognized element "placeholder-1" found in msgstr
------------------------------------------------------------------------------
#: C/legal.xml:28(legalnotice/para)
msgid ""
"DOCUMENT AND MODIFIED VERSIONS OF THE DOCUMENT ARE PROVIDED UNDER THE TERMS "
"OF THE GNU FREE DOCUMENTATION LICENSE WITH THE FURTHER UNDERSTANDING THAT: "
"<_:orderedlist-1/>"
msgstr ""
"EL DOCUMENTO Y LAS VERSIONES MODIFICADAS DEL MISMO SE PROPORCIONAN CON "
"SUJECIÓN A LOS TÉRMINOS DE LA GFDL, QUEDANDO BIEN ENTENDIDO, ADEMÁS, "
"QUE: "

The "placeholder-1" tag is incorrect. The "compare-to-msgid" method will discover this, but the "database" method will only discover it if the tag "placeholder-1" is not used in any msgid at all.

Here is another one:

./es/filters~blur.master.es.po, line 729: Unrecognized element "citation" found in msgstr
------------------------------------------------------------------------------
#: src/filters/blur/introduction.xml:110(para)
msgid ""
"You can find a nice explanation of the Abraham Lincoln effect at . You will see the Salvador Dali's "
"painting Gala Contemplating the Mediterranean Sea turning to "
"an Abraham Lincoln's portrait when looking at it from a distance."
msgstr ""
"Puede ver una interesante explicación, en inglés, del efecto Abraham "
"Lincoln en Bach04."

Is "citation" actually an illegal tag, or has it been chosen for some good reason?

Here is a case of a correct-looking tag ("guiicon") that is not found in any msgid, but looks as if it has been used on purpose:

./es/gnote-help.master.es.po, line 620: Unrecognized element "guiicon" found in msgstr
------------------------------------------------------------------------------
#: C/gnote-addin-timestamp.page:22(page/p)
msgid ""
"The Tools button is represented by the "
"icon. When you click the Tools icon on the toolbar present on your note, a "
"menu will appear."
msgstr ""
"El botón Herramientas se representa con el icono "
" . Cuando pulse en el icono "
"Herramientas de la barra de herramientas presente en su "
"nota, aparecerá un menú."

Any comments?
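For reference, here is a rough sketch of how the two checks differ. This is not the actual gtxml code: the function names (tag_names, check_strict, check_database), the regex-based tag extraction and the sample entries are all made up for illustration, and a real implementation would presumably parse the XML properly.

```python
import re

# Matches the name of an opening or self-closing tag such as <guiicon> or
# <_:orderedlist-1/>. Closing tags (</para>) are skipped because the first
# character after "<" must be a letter or an underscore.
TAG_RE = re.compile(r"<([A-Za-z_][\w:.-]*)")


def tag_names(text):
    """Return the set of element names that occur in text."""
    return set(TAG_RE.findall(text))


def check_strict(entries):
    """Compare-to-msgid: report msgstr tags missing from the same entry's msgid."""
    problems = []
    for msgid, msgstr in entries:
        for name in sorted(tag_names(msgstr) - tag_names(msgid)):
            problems.append((msgid, name))
    return problems


def check_database(entries):
    """Database: report msgstr tags that do not occur in any msgid at all."""
    known = set()
    for msgid, _ in entries:
        known |= tag_names(msgid)
    problems = []
    for msgid, msgstr in entries:
        for name in sorted(tag_names(msgstr) - known):
            problems.append((msgid, name))
    return problems


# Hypothetical (msgid, msgstr) pairs, invented for this sketch.
entries = [
    ("Click <guibutton>OK</guibutton>.", "Pulse <guibutton>Aceptar</guibutton>."),
    ("See <_:orderedlist-1/>", "Consulte <placeholder-1/>"),
    ("Open the menu.", "Abra el <guibutton>menú</guibutton>."),
]

for msgid, name in check_strict(entries):
    print('strict:   unrecognized element "%s" (msgid: %s)' % (name, msgid))
for msgid, name in check_database(entries):
    print('database: unrecognized element "%s" (msgid: %s)' % (name, msgid))
```

With these sample entries, the strict check flags both "placeholder-1" and the extra "guibutton" in the third entry, while the database check flags only "placeholder-1", because "guibutton" does occur in some other msgid. That is the same trade-off as in the two output files above.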