
This is probably something very simple and I know that there are tons of similar cases like mine here on SO, but I just can't figure out how to fix this. I'm still rather new to Python.

Problem

I have a JSON file (expr.json) with the following contents:

{
    "vowel": "a|e|i|o|u|y|ä|ö",
    "consonant": "b|c|d|f|g|h|j|k|l|m|n|p|r|s|š|t|v|z|ž"
}

I want to read the file and parse its contents using Python's json module. I want to compile the values of the keys using re.compile later. Here is my code (main.py):

#!/usr/bin/python
# vim: set fileencoding=utf-8 :

import json

myfile = open('expr.json')
data = myfile.read()
myfile.close()

json_data = json.loads(data)
print json_data    # {u'consonant': u'b|c|d|f|g|h|j|k|l|m|n|p|r|s|\u0161|t|v|z|\u017e', u'vowel': u'a|e|i|o|u|y|\xe4|\xf6'}

But when I try to access 'vowel':

print json_data['vowel']

I get the following error message:

Traceback (most recent call last):
  File "/path to main.py", line 11, in <module>
    print json_data['vowel']
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 12: ordinal not in range(128)
[Finished in 0.1s with exit code 1]

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 25: ordinal not in range(128)

What I have tried

1) Encoding the string before calling json.loads using data.encode('utf-8') => Still the same error message

2) Replacing the characters that cause the error (ä, ö) with their escaped versions (\u00E4, \u00F6) => No error, but when I compile them using re.compile they do not work as expected (the pattern does not match the escaped characters)

3) Escaping the characters using a double backslash \\ => Still the same error message


I am using Python 2.7.2 on Mac OS X. My editor is Sublime Text 2, and I've read the values from my editor's built-in console. I come from the world of JavaScript, where I don't have the same problem.

Thank you in advance and I'm terribly sorry if my question is duplicate!

Edit 1: Added the full error message given by Sublime Text's console.

  • That's not the full code that gives you that error. Commented Jun 12, 2013 at 19:36
  • Does Sublime Text have some sort of built-in Python interpreter which you're using to run the code? Commented Jun 12, 2013 at 19:52
  • Thanks to everyone who helped me with this problem! :) The solution turned out to be something very different from what I initially thought. Commented Jun 12, 2013 at 20:02

1 Answer


If you try

print repr(json_data['vowel'])

you'll see that the value itself is loaded correctly, i.e., the problem is not json but printing Unicode. Try

print u"\xe4"

It should produce the same UnicodeEncodeError. Configure your editor to allow printing Unicode from Python. You could try setting the PYTHONIOENCODING=utf-8 environment variable for the editor's built-in console (or set it to whatever encoding the console uses).
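To see why print fails in one console and works in another, you can check which codecs can represent the character at all. This is only a minimal sketch: the helper name can_encode is made up for illustration, and the value of sys.stdout.encoding depends on your console (it may even be None).

```python
# -*- coding: utf-8 -*-
import sys


def can_encode(text, encoding):
    """Return True if `text` can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


vowel = u"\xe4"  # the character from the traceback

print(can_encode(vowel, "ascii"))   # False: this is what an ASCII-only console hits
print(can_encode(vowel, "utf-8"))   # True
print(sys.stdout.encoding)          # whatever your console reports
```

If the last line reports an ASCII-only encoding (or None), printing u"\xe4" is expected to fail there regardless of how the JSON was loaded.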

Unrelated to your issue, you could slightly simplify loading the UTF-8 encoded JSON file:

import json

with open("expr.json", "rb") as file:
    json_data = json.load(file)
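For the re.compile step mentioned in the question, here is a rough sketch under a couple of assumptions: the JSON text is inlined as a string instead of being read from expr.json so that the snippet runs on its own, and the non-ASCII vowels are written as \u escapes, as in the second attempt. json.loads decodes those escapes into the same Unicode string that the literal characters would give, so the compiled pattern should match ä as long as the text being matched is also a Unicode string.

```python
# -*- coding: utf-8 -*-
import json
import re

# Same vowel pattern as in expr.json, but with JSON \u escapes (inlined for illustration)
raw = u'{"vowel": "a|e|i|o|u|y|\\u00e4|\\u00f6"}'

json_data = json.loads(raw)

# re.UNICODE makes the pattern Unicode-aware on Python 2
vowel = re.compile(json_data["vowel"], re.UNICODE)

# Match against a Unicode string, not a byte string
match = vowel.match(u"\xe4iti")

# Use repr so the result can be inspected even in an ASCII-only console
print(repr(match.group(0)))
```

Using repr here sidesteps the console encoding entirely, which helps separate matching problems from printing problems.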

2 Comments

Ugh, you are right! Running print u"\xe4" does raise UnicodeEncodeError. So the problem is not with my code but with the editor? It has been driving me nuts! Thank you!
I tried running print u"\xe4" in the terminal and it gave me exactly the expected result! Thank you for solving this for me! It would have taken me ages to solve it by myself, desperately trying to fix my code =P
