Hmm, eventually we will have to fix the framework to use unicode everywhere, but that's not feasible right now.
The rule is as follows: as a best practice, developers should write all strings in code and XML in plain English (and use the translation system for other languages). But this does not prevent them from using non-ASCII characters in some circumstances (symbols, quick prototypes/tests, etc.), so the framework/server must allow non-ASCII characters in XML.
A bug was introduced by the fix for bug 608029 because it returns one of the parameters passed to _get_source() without checking its type, causing _get_source() to return inconsistent types. This is the case both in 5.0.12+ and in trunk.
Raphael's patch is correct (because _get_source() should always return unicode and node.set() expects unicode), but not sufficient because we need to also make sure that _get_source() does return unicode in all circumstances.
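To make the inconsistency concrete, here is a minimal sketch of the failure mode. It is not the actual framework code: get_source_buggy() and the TRANSLATIONS dict are hypothetical stand-ins for _get_source() and its lookup, and the text_type alias is only there so the snippet runs on both Python 2 and Python 3.

```python
import sys

# Text type: `unicode` on Python 2 (what the framework runs on), `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

# Hypothetical stand-in for the translation lookup; stored values are always text.
TRANSLATIONS = {u"Partner": u"Partenaire"}


def get_source_buggy(source):
    """Mimic the problem: fall back to `source` without checking its type."""
    key = source.decode("utf-8") if isinstance(source, bytes) else source
    if key in TRANSLATIONS:
        return TRANSLATIONS[key]  # text
    return source                 # whatever type the caller passed in


translated = get_source_buggy(u"Partner")
# A caller passing a UTF-8 byte string with no translation available:
fallback = get_source_buggy(u"Caf\xe9".encode("utf-8"))

print(type(translated).__name__, type(fallback).__name__)
```

The first call returns text and the second returns a byte string, so a consumer that expects unicode (such as node.set()) only breaks on the fallback path, and only for non-ASCII input.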
Here is what I will do:
1. In both 5.0 and trunk: apply Raphael's patch plus jsh's suggestion, i.e. ensure _get_source() returns unicode even when it returns its 'source' parameter.
2. In trunk: eventually we should go further. _get_source() should document that it expects only unicode arguments and always returns unicode. We should add an assert verifying that the parameters are indeed unicode, and fix all callers in the framework so that they do pass unicode (this method is private and should not be used directly by addons).
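The two steps above could look roughly like the sketch below. This is an illustration under assumptions, not the real patch: ensure_unicode(), get_source_lenient() and get_source_strict() are hypothetical names, the translations are modelled as a plain dict, and byte strings are assumed to be UTF-8 encoded.

```python
import sys

# Text type: `unicode` on Python 2 (what the framework runs on), `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def ensure_unicode(value, encoding="utf-8"):
    """Return `value` as text. The UTF-8 default is an assumption."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding)
    raise TypeError("expected a string, got %r" % type(value))


def get_source_lenient(translations, source):
    """Step 1 (5.0 and trunk): always return text, even on the fallback path."""
    source = ensure_unicode(source)
    return ensure_unicode(translations.get(source, source))


def get_source_strict(translations, source):
    """Step 2 (trunk only): callers must already pass text, and we check it."""
    assert isinstance(source, text_type), "source must be unicode"
    result = translations.get(source, source)
    assert isinstance(result, text_type), "result must be unicode"
    return result


translations = {u"Partner": u"Partenaire"}
print(get_source_lenient(translations, u"Caf\xe9".encode("utf-8")) == u"Caf\xe9")
```

The lenient variant is safe to backport because it tolerates existing callers that still pass byte strings. The strict variant makes such callers fail immediately, which is why it only makes sense once the framework's own callers have been fixed.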