
I'm having a problem with Python's str.format() when passing Unicode strings to it. This is similar to this older question, except that in my case the test code explodes on the print, not on the logging.info() call. Passing the same Unicode string object to a logging handler works fine.

This fails with the older % formatting as well as with str.format(). To make sure the string object was the problem, and not print interacting badly with my terminal, I tried assigning the formatted string to a variable before printing.

def unicode_test():
    byte_string = '\xc3\xb4'
    unicode_string = unicode(byte_string, "utf-8")
    print "unicode object type: {}".format(type(unicode_string))
    output_string = "printed unicode object: {}".format(unicode_string)
    print output_string

if __name__ == '__main__':
    unicode_test()

The string object seems to assume it's getting ASCII.

% python -V
Python 2.7.2

% python ./unicodetest.py
unicode object type: <type 'unicode'>
Traceback (most recent call last):
  File "./unicodetest.py", line 10, in <module>
    unicode_test()
  File "./unicodetest.py", line 6, in unicode_test
    output_string = "printed unicode object: {}".format(unicode_string)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf4' in position 0: ordinal not in range(128)

Making the format pattern a Unicode literal (prefixing it with u) doesn't make any difference.

output_string = u"printed unicode object: {}".format(unicode_string)

Am I missing something here? The documentation for the string object seems pretty clear that this should work as I'm attempting to use it.

  • Using your code as above but prepending printed unicode object with u works for me (Python 2.6.5 and 2.7). Is the error you are getting when you do that the same one as listed above? Commented Dec 2, 2012 at 22:45
  • Wait... you're encoding a unicode byte stream which is supposed to represent an already encoded unicode stream? What character should print above for '\xc3\xb4': ô or Ã´? Commented Dec 2, 2012 at 22:57
  • It should be ô. The encoding example was copied pretty much verbatim from the referenced older post about the logging module. Commented Dec 2, 2012 at 23:08
  • And yes, I get the same error when I prepend the string with u. Commented Dec 2, 2012 at 23:16
  • What is your default encoding? try: import sys; print sys.getdefaultencoding() Commented Dec 2, 2012 at 23:32

1 Answer

No, this should not work (can you cite the part of the documentation that says so?). It should work, however, if the formatting pattern is unicode, or with the old % formatting, which 'promotes' the pattern to unicode instead of trying to 'demote' the arguments.

>>> x = "\xc3\xb4".decode('utf-8')
>>> x
u'\xf4'
>>> x + 'a'
u'\xf4a'
>>> 'a' + x
u'a\xf4'
>>> 'a %s' % x
u'a \xf4'
>>> 'a {}'.format(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf4' in position 0: ordinal not in range(128)
>>> u'a {}'.format(x)
u'a \xf4'
>>> print u"Foo bar {}".format(x)
Foo bar ô
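
To make the 'demote' step in the session above visible, here is a minimal sketch that performs the ASCII encode explicitly instead of relying on what a byte-string pattern does implicitly in Python 2. The helper name demote_to_ascii is hypothetical and only there for illustration. The explicit encode is an approximation of the implicit conversion, not a copy of the interpreter's internals. Because the encode is spelled out, the snippet behaves the same on Python 2.7 and Python 3.

```python
# Sketch only: spells out the conversion that a byte-string pattern's
# format() attempts implicitly in Python 2.

x = b"\xc3\xb4".decode("utf-8")  # the single text character U+00F4


def demote_to_ascii(text):
    """Hypothetical helper: try the ASCII encode, return None if it fails."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return None


# U+00F4 is outside the ASCII range, so the demotion fails.
ascii_result = demote_to_ascii(x)

# With a unicode pattern nothing has to be demoted.
unicode_result = u"a {}".format(x)
```

The takeaway is the same as in the session: keep the pattern unicode, so the argument is never pushed through the ASCII codec.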

Edit: The print line may not work for you if the unicode string can't be encoded using your console's encoding. For example, on my Windows console:

>>> import sys
>>> sys.stdout.encoding
'cp852'
>>> u'\xf4'.encode('cp852')
'\x93'

On a UNIX console this may be related to your locale settings. It will also fail if you redirect output (like when using | in a shell). Most of these issues have been fixed in Python 3.
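
If you want to check up front whether a string will survive the console's encoding, something along these lines may help. It is a rough sketch: the helper names encodable and safe_bytes are made up for this example, and falling back to 'ascii' when sys.stdout.encoding is missing or None is an assumption meant to mimic the redirected-output case.

```python
import sys


def encodable(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def safe_bytes(text, encoding=None):
    """Encode text for output, replacing characters the codec lacks."""
    # Assumption: use ASCII when stdout reports no encoding (e.g. a pipe).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "replace")


ok_on_cp852 = encodable(u"\xf4", "cp852")   # the Windows console above
ok_on_ascii = encodable(u"\xf4", "ascii")   # the redirected-output case
fallback = safe_bytes(u"\xf4", "ascii")     # unencodable char becomes '?'
```

Replacing unencodable characters loses information, so this is only suitable for diagnostic output, not for data you need to round-trip.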


2 Comments

@mpounsett: Well, as you can see in the console session I posted, u'Whatever {}'.format(u'\xf4') works, so you may want to recheck your code. Is the error exactly the same? Does it happen in the same line or is it more like: ideone.com/Z3y5Kg ?
Hrm.. I had thought the error was exactly the same, but on a recheck I see it actually migrates to the print statement.
