assertItemsEqual output is misleading when comparing unicode with str
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
testtools | New | Undecided | Unassigned | |
Bug Description
In Python's unittest, assertItemsEqual treats items which are unicode strings (e.g. u'foo') as equal to strings (e.g. 'foo') if their content is identical, whereas testtools doesn't. The difference is unfortunate and can lead to nasty surprises; which of the two behaviours is more correct is debatable, but outside the scope of this bug report. However, there is a bigger problem, which is that the output from testtools' assertItemsEqual is misleading. Take the following test:
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
#!/usr/bin/python
# Run via: python -m testtools.run assert_
import testtools
class TestAssertItems
    def test_assertItem
        expected = [
            'long value blah blah blah a',
            'long value blah blah blah b',
            'long value blah blah blah c',
            'long value blah blah blah d',
            'long value blah blah blah e',
            'long value blah blah blah f',
            'long value blah blah blah g',
            'long value blah blah blah h',
            'long value blah blah blah i',
        ]
        got = [
            u'long value blah blah blah a',
            u'long value blah blah blah b',
            u'long value blah blah blah c',
            u'long value blah blah blah d',
            u'long value blah blah blah extra',
            u'long value blah blah blah e',
            u'long value blah blah blah f',
            u'long value blah blah blah g',
            u'long value blah blah blah h',
            u'long value blah blah blah i',
        ]
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
The resulting output is:
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
Tests running...
=======
FAIL: assert_
-------
Traceback (most recent call last):
File "assert_
self.
File "/home/
return self.assertSequ
File "/home/
self.fail(msg)
File "/home/
raise self.failureExc
AssertionError: Sequences differ: ['long value blah blah blah a', 'long valu[232 chars]h i'] != [u'long value blah blah blah a', u'long va[277 chars]h i']
First differing element 5:
long value blah blah blah f
long value blah blah blah extra
Second sequence contains 1 additional elements.
First extra element 9:
long value blah blah blah i
Diff is 714 characters long. Set self.maxDiff to None to see it.
Ran 1 test in 0.005s
FAILED (failures=1)
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
which makes it look as if 'long value blah blah blah i' is in got but not in expected, even though that is not the case at all. It takes an eagle eye to spot the u'' difference on the AssertionError line.
In contrast, the output from unittest's assertItemsEqual is:
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
Tests running...
=======
FAIL: assert_
-------
Traceback (most recent call last):
File "assert_
self.
File "/usr/lib64/
self.fail(msg)
File "/usr/lib64/
raise self.failureExc
AssertionError: Element counts were not equal:
First has 0, Second has 1: u'long value blah blah blah extra'
Ran 1 test in 0.012s
FAILED (failures=1)
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
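The two message styles above can be reproduced side by side with nothing but the standard library. The following is only a sketch under some assumptions: it targets Python 3, where unittest's assertItemsEqual is named assertCountEqual; it does not use testtools at all; and it assumes, based on the truncated "return self.assertSequ" frame in the first traceback, that the confusing output comes from sorting both inputs and handing them to assertSequenceEqual. The names `_Probe` and `failure_message` are made up for illustration. Python 3 has no separate unicode type, so the u'' aspect itself is not reproduced here.

```python
import unittest


class _Probe(unittest.TestCase):
    """Throwaway TestCase so the assertion helpers can be called directly."""

    def runTest(self):
        pass


def failure_message(assertion, expected, got):
    """Return the AssertionError text raised by `assertion`, or None if it passes."""
    try:
        assertion(expected, got)
    except AssertionError as exc:
        return str(exc)
    return None


# Same shape as the report: nine expected items, plus one extra item in `got`.
expected = ['long value blah blah blah %s' % c for c in 'abcdefghi']
got = expected[:4] + ['long value blah blah blah extra'] + expected[4:]

probe = _Probe()

# Assumed testtools-like behaviour: compare the sorted sequences position by position.
seq_msg = failure_message(probe.assertSequenceEqual, sorted(expected), sorted(got))

# unittest behaviour: compare element counts, ignoring order.
count_msg = failure_message(probe.assertCountEqual, expected, got)

print(seq_msg)
print('-' * 40)
print(count_msg)
```

The positional comparison reports a "first differing element" and a "first extra element" that are both side effects of everything shifting by one after the insertion, while the count-based comparison names only the item that is actually extra.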
This issue is actually not exclusive to unicode vs. string comparisons - it applies anywhere the obtained iterable differs enough from the expected one for the diff to exceed the maxDiff threshold, at which point the diff is not shown and the remaining output is too cryptic to understand easily. However, it's particularly misleading in the unicode vs. string scenario, because there the only useful part of the output is the hard-to-spot u'' difference.
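As a workaround on the reader's side, the "Set self.maxDiff to None" hint in the output can be followed to get the full diff back. The sketch below shows the effect using Python 3's standard unittest rather than testtools, so treat it as an illustration of the maxDiff mechanism and not as a statement about testtools internals. The `_Probe` class, the `failure_message` helper and the 40-item lists are hypothetical, chosen only so the diff is long enough to be omitted by default.

```python
import unittest


class _Probe(unittest.TestCase):
    """Throwaway TestCase so assertSequenceEqual can be called directly."""

    def runTest(self):
        pass


def failure_message(case, first, second):
    """Return the AssertionError text from assertSequenceEqual, or None if it passes."""
    try:
        case.assertSequenceEqual(first, second)
    except AssertionError as exc:
        return str(exc)
    return None


# Long enough that the generated diff exceeds the default maxDiff.
expected = ['long value blah blah blah %02d' % n for n in range(40)]
got = expected[:20] + ['long value blah blah blah extra'] + expected[20:]

default_case = _Probe()   # keeps the default maxDiff, so the diff is omitted
full_case = _Probe()
full_case.maxDiff = None  # no limit, so the diff is always included

short_msg = failure_message(default_case, expected, got)
long_msg = failure_message(full_case, expected, got)

print(short_msg)
print('-' * 40)
print(long_msg)
```

With the limit lifted, the diff contains a single added line for the extra item, which is far easier to read than the positional "first differing element" summary. In a real test class the same effect comes from setting `maxDiff = None` as a class attribute.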