pyexiv2

Bug #1016066
Comment #2

Hobson Lane (hobs) wrote on 2012-08-03:

Nikon image with non ascii (128) in the UserComment EXIF tag (2.1 MiB, image/jpeg)

I think this is the same old bug that you fixed before. It should rear its head on any of those previously submitted Nikon images that have binary or Unicode data in the UserComment string. I've attached one that I used to reproduce it this morning on the latest pyexiv2 for Precise Pangolin ('0.3.2').

To reproduce, run the following function on an image instance that has been loaded and read by pyexiv2 (or clone tagim from git@github.com:hobsonlane/tagim.git and run `tagim -i 'DSCN2162.JPG' --debug`):

def display_meta_str(im):
    # Map a display name to each family of metadata keys; ' XMP' is padded to align the titles.
    keysets = {'EXIF': im.exif_keys, 'IPTC': im.iptc_keys, ' XMP': im.xmp_keys}
    for name, keys in keysets.items():
        title = ' %s Data ' % name
        print '-'*30 + title + '-'*30
        for k in keys:
            # This is the line that raises UnicodeDecodeError on a non-ASCII byte.
            print u'{0}: {1}'.format(str(k),str(im[k].value))
        print '-'*(60+len(title))
    print '-'*30 + ' Comment ' + '-'*30
    print im.comment
    print '-'*(60+len(title))
    return keysets.values()

Here's the output for this particular image (with error message at end):

Image file name: 'DSCN2161.JPG'
------------------------------ IPTC Data ------------------------------
-----------------------------------------------------------------------
------------------------------  XMP Data ------------------------------
-----------------------------------------------------------------------
------------------------------ EXIF Data ------------------------------
Exif.Image.ImageDescription:           
Exif.Image.Make: NIKON
Exif.Image.Model: COOLPIX L18
Exif.Image.Orientation: 1
Exif.Image.XResolution: 300
Exif.Image.YResolution: 300
Exif.Image.ResolutionUnit: 2
Exif.Image.Software: COOLPIX L18 V1.1
Exif.Image.DateTime: 2009-07-23 22:04:56
Exif.Image.YCbCrPositioning: 2
Exif.Image.ExifTag: 230
Exif.Photo.ExposureTime: 1/60
Exif.Photo.FNumber: 14/5
Exif.Photo.ExposureProgram: 2
Exif.Photo.ISOSpeedRatings: 565
Exif.Photo.ExifVersion: 0220
Exif.Photo.DateTimeOriginal: 2009-07-23 22:04:56
Exif.Photo.DateTimeDigitized: 2009-07-23 22:04:56
Exif.Photo.ComponentsConfiguration: 
Exif.Photo.CompressedBitsPerPixel: 2
Exif.Photo.ExposureBiasValue: 0
Exif.Photo.MaxApertureValue: 3
Exif.Photo.MeteringMode: 5
Exif.Photo.LightSource: 0
Exif.Photo.Flash: 25
Exif.Photo.FocalLength: 57/10
Traceback (most recent call last):
  File "/home/hobs/bin/tagim", line 346, in <module>
    tagim.display_meta_str(im)
  File "/home/hobs/src/tagim/tg/tagim.py", line 212, in display_meta_str
    print u'{0}: {1}'.format(str(k),str(im[k].value))
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 52: ordinal not in range(128)
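
The traceback is consistent with an implicit ASCII decode: in Python 2, passing a byte str to a u'...' template's format() decodes it with the default 'ascii' codec. Below is a minimal sketch of the same failure, written for Python 3, where the decode has to be spelled out. The sample bytes and the name `try_ascii` are made up for illustration and are not taken from the attached image or from pyexiv2.

```python
# Hypothetical stand-in for a UserComment value: ASCII text plus one 0x8e byte.
raw = b"comment " + bytes([0x8E])


def try_ascii(data):
    """Return the error message from a strict ASCII decode, or None on success."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = try_ascii(raw)
print(message)
```

Any byte above 0x7f triggers the same error, which is why only some tags in the image fail.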

And here's the workaround: coerce values to unicode instead of calling str(), using a helper that decodes byte strings as UTF-8:

def unicode_noerr(s, errors='replace'):
    """
    Coerce input into a unicode (multibyte) string regardless of the type of input, without raising exceptions.

    Assumes any single-byte str is UTF-8 or ASCII.
    """
    if isinstance(s, unicode):
        return s
    elif isinstance(s, str):
        # Undecodable bytes are handled according to `errors` instead of raising.
        return s.decode('UTF-8', errors=errors)
    else:
        return unicode(s)

# Replacement for the failing line in display_meta_str:
print u'{0}: {1}'.format(unicode_noerr(k), unicode_noerr(im[k].value, errors='replace'))

Or run tagim without the `--debug` flag.
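
For anyone hitting this on Python 3, here is a rough equivalent of the helper above. It is only a sketch: the name `to_text` is hypothetical, the sample bytes are invented, and it assumes (as the original helper does) that byte strings are UTF-8 or ASCII.

```python
from fractions import Fraction


def to_text(value, errors="replace"):
    """Coerce a value to str, decoding bytes as UTF-8 (an assumption, as above)."""
    if isinstance(value, str):
        return value
    if isinstance(value, (bytes, bytearray)):
        # Undecodable bytes become U+FFFD when errors="replace".
        return bytes(value).decode("utf-8", errors=errors)
    return str(value)


# Hypothetical tag values: a byte string with a stray 0x8e, and a rational number.
comment = to_text(b"Nikon " + bytes([0x8E]))
fnumber = to_text(Fraction(14, 5))
print("{0}: {1}".format("Exif.Photo.FNumber", fnumber))
```

With errors='replace' the bad byte shows up as the U+FFFD replacement character rather than stopping the listing, so the remaining tags still print.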