I have been doing some experiments, and Python's regex engine seems to support Unicode when both the pattern and the string are unicode objects and the re.U flag is passed (example 3).
$ python
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24)
>>> import re
>>> print re.search("\w+", "aaaáÁá...").group() #1
aaa
>>> print re.search(u"\w+", u"aaaáÁá...").group() #2
aaa
>>> print re.search(u"\w+", u"aaaáÁá...", re.U).group() #3
aaaáÁá
>>> print re.search("\w+", "aaaáÁá...", re.U).group() #4
aaa
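For anyone who wants to reproduce this without depending on the terminal or source-file encoding, here is a minimal sketch of the same comparison using explicit escapes. It assumes Python 2.7 or 3.3+; the helper name first_word is made up for illustration and is not part of the re module. It only covers the unicode-plus-re.U case (example 3) and a plain byte-string case (like example 1).

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import re

# Same sample text as in the session, written with escapes so the result
# does not depend on how the source file or terminal is encoded.
text = u"aaa\u00e1\u00c1\u00e1..."


def first_word(pattern, s, flags=0):
    """Hypothetical helper: return the first match of pattern in s, or None."""
    m = re.search(pattern, s, flags)
    return m.group() if m else None


# Unicode pattern + unicode string + re.U, as in example 3.
# (On Python 3, str patterns are Unicode-aware by default, so re.U is redundant there.)
unicode_match = first_word(u"\\w+", text, re.U)

# Byte pattern + UTF-8 encoded byte string, no flags, as in example 1.
bytes_match = first_word(br"\w+", text.encode("utf-8"))

print(len(unicode_match))  # 6: the accented letters are matched by \w
print(len(bytes_match))    # 3: the match stops at the first non-ASCII byte
```

Comparing lengths rather than printing the matches keeps the output ASCII-only, so the check behaves the same regardless of the console encoding.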