Encoding issue when reading file in Python

Question

I have a file containing

    foo = "Gro\xdfbritannien"

I'm reading it with the following code, but it always prints the original text, including the literal \x escape:

    import codecs
    f = codecs.open('myfile', 'r', 'utf8')
    for line in f:
      print line
      print line.encode('utf-8')
      print line.decode('utf-8')

I can't see how to display the properly decoded text, the way it appears when I run

    >>> print u'Gro\xdfbritannien'
    Großbritannien

Any hint would be appreciated!

If your file literally has a quoted string with a backslash and an x in it, you'll need to parse the string literal with something like decode('string-escape').
– user2357112, Commented Feb 13, 2014 at 9:10

Tim Pietzcker · Accepted Answer · 2014-02-13 09:18:18Z

4

When your file contains the line

    foo = "Gro\xdfbritannien"

it contains an actual backslash character, followed by x, d and f. So if that line is read into a Python string, it is read as

    'foo = "Gro\\xdfbritannien"'

(and since those are all ASCII characters, it doesn't matter if you open it with the utf-8 codec or not).
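As a minimal sketch of that point (written for Python 3, which is an assumption, since the question uses Python 2), the snippet below builds the bytes as they would sit on disk and checks that the choice of codec makes no difference. The variable names are only illustrative.

```python
# The bytes as they would sit on disk: a real backslash followed by "xdf".
raw = b'foo = "Gro\\xdfbritannien"'

# Every byte is below 128, so UTF-8 and Latin-1 decode it identically.
as_utf8 = raw.decode("utf-8")
as_latin1 = raw.decode("latin-1")

print(as_utf8 == as_latin1)   # the codec choice does not matter here
print(as_utf8[10:14])         # the four literal characters \, x, d, f
print(repr(as_utf8))          # repr doubles the backslash, as shown above
```

No decoding step has interpreted the escape yet; the text still holds four separate characters where the ß should be.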

So you need to decode it first using the string_escape codec:

    >>> foo.decode("string_escape")
    'Gro\xdfbritannien'

and then decode it to the correct Unicode object

    >>> _.decode("latin1")
    u'Gro\xdfbritannien'

which you can then print

    >>> print _
    Großbritannien
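The steps above are Python 2 only: the string_escape codec and str.decode do not exist in Python 3. A rough Python 3 equivalent, offered as a sketch rather than a drop-in replacement, could use the unicode_escape codec or let ast.literal_eval parse the quoted literal. The names line, value, decoded and parsed are illustrative, and the simple split on "=" assumes the line has exactly the shape shown in the question.

```python
import ast

# What reading the file yields: a literal backslash, then "xdf".
line = 'foo = "Gro\\xdfbritannien"'

# Take the part after "=", still wrapped in double quotes.
quoted = line.split("=", 1)[1].strip()

# Route 1: strip the quotes and undo the escapes with unicode_escape.
# encode("ascii") raises if the text holds non-ASCII characters, which
# is deliberate: unicode_escape would misread such input.
value = quoted.strip('"')
decoded = value.encode("ascii").decode("unicode_escape")

# Route 2: let Python parse the quoted string literal itself.
parsed = ast.literal_eval(quoted)

print(decoded == parsed)   # both routes give the same 14-character string
print(len(decoded))
```

On a terminal that can display it, print(decoded) shows Großbritannien. The literal_eval route also handles other escape forms and both quote styles, but it only accepts input that is a valid Python literal.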

edited Feb 13, 2014 at 9:18

answered Feb 13, 2014 at 9:12

apassant Over a year ago

Thanks - works perfectly with print line.decode("string_escape").decode("latin1")
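For readers on Python 3, a per-line version of that one-liner might look like the sketch below. It is an assumption-laden example, not the poster's code: it writes a throwaway file with tempfile so that it is self-contained, opens it in binary mode, and relies on unicode_escape treating unescaped bytes as Latin-1, which mirrors the latin1 step.

```python
import os
import tempfile

# Create a throwaway file holding the literal backslash sequence.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="ascii"
) as tmp:
    tmp.write('foo = "Gro\\xdfbritannien"\n')
    path = tmp.name

decoded_lines = []
with open(path, "rb") as f:              # binary mode: each line is bytes
    for raw_line in f:
        # unicode_escape interprets \xdf and maps other bytes as Latin-1.
        text = raw_line.decode("unicode_escape")
        decoded_lines.append(text.rstrip("\r\n"))

os.remove(path)
print(len(decoded_lines[0]))             # 4 characters shorter than the raw line
```

This only suits files whose unescaped bytes are ASCII or Latin-1; a file that mixes real UTF-8 text with backslash escapes would need different handling.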

UnZike · Answer · 2014-02-13 09:20:44Z

-1

This has nothing to do with codecs. You should do it like this, given 'foo = "Gro\xdfbritannien"':

    >>> print u'Gro\\xdfbritannien'
    Gro\xdfbritannien

answered Feb 13, 2014 at 9:20
