testtools

assertItemsEqual output is misleading when comparing unicode with str

Bug #1817800 reported by Adam Spiers on 2019-02-26

This bug affects 1 person

Affects      Status   Importance   Assigned to   Milestone
testtools    New      Undecided    Unassigned

Bug Description

In Python's unittest, assertItemsEqual treats unicode items (e.g. u'foo') as equal to str items (e.g. 'foo') when their content is identical, whereas testtools does not. The difference is unfortunate and can lead to nasty surprises; which of the two behaviours is more correct is debatable, but outside the scope of this bug report. The bigger problem is that the output from testtools.assertItemsEqual() is very misleading when comparing unicode items with str items. Here is a minimal testcase:

--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
#!/usr/bin/python

# Run via: python -m testtools.run assert_items_equal_testtools

import testtools

class TestAssertItemsEqual(testtools.TestCase):

    def test_assertItemsEqual(self):
        expected = [
            'long value blah blah blah a',
            'long value blah blah blah b',
            'long value blah blah blah c',
            'long value blah blah blah d',
            'long value blah blah blah e',
            'long value blah blah blah f',
            'long value blah blah blah g',
            'long value blah blah blah h',
            'long value blah blah blah i',
        ]
        got = [
            u'long value blah blah blah a',
            u'long value blah blah blah b',
            u'long value blah blah blah c',
            u'long value blah blah blah d',
            u'long value blah blah blah extra',
            u'long value blah blah blah e',
            u'long value blah blah blah f',
            u'long value blah blah blah g',
            u'long value blah blah blah h',
            u'long value blah blah blah i',
        ]
        #self.maxDiff = None
        self.assertItemsEqual(expected, got)
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------

The resulting output is

--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
Tests running...
======================================================================
FAIL: assert_items_equal_testtools.TestAssertItemsEqual.test_assertItemsEqual
----------------------------------------------------------------------
Traceback (most recent call last):
  File "assert_items_equal_testtools.py", line 34, in test_assertItemsEqual
    self.assertItemsEqual(expected, got)
  File "/home/adam/SUSE/cloud/OpenStack/git/nova/.tox/functional/lib/python2.7/site-packages/unittest2/case.py", line 1182, in assertItemsEqual
    return self.assertSequenceEqual(expected, actual, msg=msg)
  File "/home/adam/SUSE/cloud/OpenStack/git/nova/.tox/functional/lib/python2.7/site-packages/unittest2/case.py", line 1014, in assertSequenceEqual
    self.fail(msg)
  File "/home/adam/SUSE/cloud/OpenStack/git/nova/.tox/functional/lib/python2.7/site-packages/unittest2/case.py", line 690, in fail
    raise self.failureException(msg)
AssertionError: Sequences differ: ['long value blah blah blah a', 'long valu[232 chars]h i'] != [u'long value blah blah blah a', u'long va[277 chars]h i']

First differing element 5:
long value blah blah blah f
long value blah blah blah extra

Second sequence contains 1 additional elements.
First extra element 9:
long value blah blah blah i

Diff is 714 characters long. Set self.maxDiff to None to see it.

Ran 1 test in 0.005s
FAILED (failures=1)
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------

which makes it look like 'long value blah blah blah i' is in got but not expected, even though that is not the case at all. It requires an eagle eye to spot the u'' difference on the AssertionError line.

In contrast, the output from unittest.assertItemsEqual() is crystal clear:

Ran 1 test in 0.012s
FAILED (failures=1)
--------- 8< --------- 8< --------- 8< --------- 8< --------- 8< ---------
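
As a possible workaround until the output improves, the type difference can be made explicit by counting items keyed on both type and value. The sketch below is only an illustration: describe_item_mismatch is a hypothetical helper name, it is not part of testtools or unittest, and it assumes every item is hashable.

```python
from collections import Counter


def describe_item_mismatch(expected, got):
    """Return (missing, unexpected) as sorted lists of 'typename repr' strings.

    Hypothetical helper, not a testtools or unittest API.
    Assumes all items are hashable.
    """
    # Key each item on (type, value) so that items which look alike but
    # have different types are counted separately.
    expected_counts = Counter((type(item), item) for item in expected)
    got_counts = Counter((type(item), item) for item in got)

    # Counter subtraction keeps only the positive counts.
    missing = expected_counts - got_counts
    unexpected = got_counts - expected_counts

    def fmt(counter):
        return sorted("%s %r" % (t.__name__, v) for (t, v) in counter.elements())

    return fmt(missing), fmt(unexpected)


missing, unexpected = describe_item_mismatch(['a', 'b'], ['a', 'extra', 'b'])
print(missing)
print(unexpected)
```

Because each entry carries its type name, a unicode item on Python 2 would be listed as "unicode u'...'" next to its "str '...'" counterpart, rather than hiding behind a one-character prefix.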

This issue is not exclusive to unicode vs. str comparisons: it applies whenever the obtained iterable differs enough from the expected one to exceed the maxDiff threshold, at which point the diff is not shown and the remaining output is too cryptic to understand easily. However, it is particularly misleading in the unicode vs. str scenario, because there the only useful part of the output is the hard-to-spot u'' difference.
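
For reference, the maxDiff threshold can be seen in isolation with the standard library's unittest. This is a minimal sketch, not the testtools code path: failure_message and _Case are names made up for the illustration, and on Python 3 there is no u'' prefix, so only the truncation behaviour is reproduced.

```python
import unittest


class _Case(unittest.TestCase):
    """Throwaway case so the assert methods can be called directly."""

    def runTest(self):
        pass


def failure_message(expected, got, max_diff):
    """Return the AssertionError text from assertSequenceEqual, or None if equal."""
    case = _Case()
    case.maxDiff = max_diff  # None disables truncation of the diff
    try:
        case.assertSequenceEqual(expected, got)
    except AssertionError as exc:
        return str(exc)
    return None


expected = ['long value blah blah blah %s' % c for c in 'abcdefghi']
got = expected[:4] + ['long value blah blah blah extra'] + expected[4:]

# A deliberately small threshold forces the "Diff is N characters long" notice.
truncated = failure_message(expected, got, max_diff=80)
# With no threshold, the full line-by-line diff is appended instead.
full = failure_message(expected, got, max_diff=None)

print(truncated)
print(full)
```

Uncommenting self.maxDiff = None in the testcase above should have the same effect on the original report's output, at the cost of much longer failure messages.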
