Send a non-ASCII POST request in Python?

Question

I'm trying to send a POST request to a web app using the mechanize module (itself a wrapper around urllib2). When I send the request, I get UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128). I tried unicode(string), unicode(string, encoding="utf-8"), unicode(string).encode(), etc., but nothing worked: each attempt raised either the error above or TypeError: decoding Unicode is not supported.

I looked at the other SO answers to similar questions, but none helped.

Thanks in advance!

EDIT: Example that produces an error:

>>> prda = "šđćč"  # valid UTF-8 characters
>>> prda
'\xc5\xa1\xc4\x91\xc4\x87\xc4\x8d'
>>> print prda
šđćč
>>> prda.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)
>>> unicode(prda)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)

It would help if you showed a small, self-contained example that produces the error.
– ekhumoro, Commented Jan 7, 2012 at 23:46

Laurence Gonsalves · Accepted Answer · 2012-01-07 23:52:57Z

9

I assume you're using Python 2.x.

Given a unicode object:

myUnicode = u'\u4f60\u597d'

encode it using utf-8:

mystr = myUnicode.encode('utf-8')

Note that you need to specify the encoding explicitly. By default it'll (usually) use ascii.
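As a rough sketch of the point above (not part of the original answer), the snippet below encodes the same unicode object with an explicit codec and shows what goes wrong with ascii. The variable names are illustrative, and 'ascii' is passed explicitly here to stand in for the Python 2 default, since Python 3 defaults to utf-8.

```python
my_unicode = u'\u4f60\u597d'            # two code points

# Explicit codec: text -> UTF-8 bytes (three bytes per character here).
my_bytes = my_unicode.encode('utf-8')
print(repr(my_bytes))
print(len(my_bytes))                    # 6

# The ascii codec cannot represent these characters, so encoding fails.
try:
    my_unicode.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)

# Decoding with the same codec gets the original text back.
round_trip = my_bytes.decode('utf-8')
print(round_trip == my_unicode)
```

The important part is that the same codec name is used in both directions.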

1 Comment

Bo Milanovich Over a year ago

Thanks for the reply. How would I go about converting it to a unicode object if I have a string variable (instead of a string literal)? The assignment is buried too deep in the code for me to simply add the u' prefix where the variable is assigned.

ekhumoro · Accepted Answer · 2012-01-08 01:24:09Z

In your example, you use a non-unicode string literal containing non-ascii characters, which results in prda becoming a byte string.

To achieve this in the interactive shell, python uses sys.stdin.encoding to automatically encode the string. In your case, this means the string gets encoded as "utf-8".

To convert prda to a unicode object, you need to decode it using the appropriate encoding:

>>> print prda.decode('utf-8')
šđćč

Note that, in a script or module, you cannot rely on python to automatically guess the encoding - you would need to explicitly declare the encoding at the top of the file, like this:

# -*- coding: utf-8 -*-
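For illustration only, here is what a small script using that declaration might look like. It assumes the file itself is saved as UTF-8, and it uses a u-prefixed literal so that prda is a unicode object from the start.

```python
# -*- coding: utf-8 -*-
# The declaration above must sit on the first or second line of the file.

prda = u"šđćč"                    # u prefix: a unicode literal, not a byte string
encoded = prda.encode("utf-8")    # explicit encode, e.g. before sending it anywhere

print(len(prda))       # 4 code points
print(len(encoded))    # 8 bytes, two per character in UTF-8
```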

Whenever you encounter unicode errors in Python 2, it is very often because your code is mixing byte strings with unicode strings. So you should always check what kind of string is causing the error, by using type(string).

If the string object is <type 'str'>, but you need unicode, decode it using the appropriate encoding. If the string object is <type 'unicode'>, but you need bytes, encode it using the appropriate encoding.
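One way to apply this rule to a variable whose type you don't control is a pair of small conversion helpers. The sketch below is only an assumption about how that might look: to_unicode and to_bytes are hypothetical names, UTF-8 is assumed as the default encoding, and the code is written to run on both Python 2 and Python 3.

```python
# text_type is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u'')


def to_unicode(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings, pass text through unchanged."""
    if isinstance(value, bytes):      # `bytes` is an alias for `str` on Python 2
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Hypothetical helper: encode text, pass byte strings through unchanged."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


prda = b'\xc5\xa1\xc4\x91\xc4\x87\xc4\x8d'   # the byte string from the question
as_text = to_unicode(prda)
print(type(as_text) is text_type)             # True
print(to_bytes(as_text) == prda)              # True
```

Because each helper checks the type first, calling it on a value that is already in the target form is harmless, which avoids the "decoding Unicode is not supported" error from the question.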

Giacomo Lacava · Accepted Answer · 2012-01-08 01:23:16Z

0

You don't need to wrap your characters in unicode() calls, because they're already encoded :) If anything, you need to decode them to get a unicode object:

>>> s = '\xc5\xa1\xc4\x91\xc4\x87\xc4\x8d'   # your string
>>> s.decode('utf-8')
u'\u0161\u0111\u0107\u010d'
>>> type(s.decode('utf-8'))
<type 'unicode'>

I don't know mechanize, so I'm afraid I can't say whether it handles this correctly.

With a regular urllib2 POST call, I'd use urlencode:

>>> import urllib2
>>> from urllib import urlencode
>>> postData = urlencode({'test': s})   # note I'm NOT decoding it
>>> postData
'test=%C5%A1%C4%91%C4%87%C4%8D'
>>> urllib2.urlopen(url, postData)   # url is your target address; etc etc etc
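If the value might arrive as a unicode object rather than a byte string, one possible approach is to encode it before calling urlencode. The sketch below assumes UTF-8 is what the server expects; build_post_body is a hypothetical helper, and the import fallback is only there so the snippet also runs on Python 3. It builds the request body without sending anything.

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3


def build_post_body(fields, encoding='utf-8'):
    """Hypothetical helper: percent-encode a dict of form fields.

    Unicode values are encoded to bytes first, so urlencode never has to
    fall back on the ascii codec.
    """
    encoded = {}
    for key, value in fields.items():
        if isinstance(value, type(u'')):
            value = value.encode(encoding)
        encoded[key] = value
    return urlencode(encoded)


body = build_post_body({'test': u'\u0161\u0111\u0107\u010d'})
print(body)   # test=%C5%A1%C4%91%C4%87%C4%8D
```

The result contains only ASCII characters, so it can be passed as the data argument in the urlopen call shown above.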
