2

I am trying to convert a byte string (type 'str') to Unicode in Python. It needs to work for any non-English string, for example Japanese, Chinese, or Spanish.

For example, japanese_var holds some Japanese characters [ドキュメントを翻訳します].

Printing it gives:

'\x83h\x83L\x83\x85\x83\x81\x83\x93\x83g\x82\xf0\x96|\x96\xf3\x82\xb5\x82\xdc\x82\xb7'

Checking its type:

type(japanese_var)
<type 'str'>

How can I convert it to type 'unicode'?

Should I use japanese_var.decode('mbcs')? What are the consequences of using this code, given that I will be running it on different OS platforms and under different foreign locales?

I am using Python 2.5.4.

The string is a parameter read from a file's properties, and it can be any non-English string.

Comments

  • You need to know the encoding of the string. There isn't really a simple solution that will work for any string. Commented Dec 9, 2013 at 9:41
  • Which Python? Python 2 or 3? Commented Dec 9, 2013 at 9:41
  • Where is this string coming from? (If it's a literal, stick a u directly in front of it, though you may need to be careful about source code encoding.) Commented Dec 9, 2013 at 9:44

2 Answers

4

You need to know the encoding of the input string. There is no reliable universal solution.

The encoding should be available from the source of the input string. For instance, if you're taking text from a web page, the encoding should be indicated as part of the HTTP Content-Type, either as an HTTP response header from the server or as a <meta> tag in the page source.
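
As a rough illustration of the web page case, the charset can be pulled out of a Content-Type value with the standard library's email.message.Message. This is only a sketch: the helper name charset_from_content_type and the sample header values are made up for the example, and a real page may declare no charset at all.

```python
from email.message import Message


def charset_from_content_type(value, default=None):
    """Return the charset named in a Content-Type value, or default.

    Hypothetical helper for illustration; not part of any library.
    """
    msg = Message()
    msg['Content-Type'] = value
    # get_content_charset() returns the charset parameter in lower case,
    # or the fallback object when no charset parameter is present.
    return msg.get_content_charset(default)


print(charset_from_content_type('text/html; charset=Shift_JIS'))
print(charset_from_content_type('text/html'))
```

When the second call returns the default, the encoding has to come from somewhere else, such as the <meta> tag or documentation for the data source.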

Once you know the encoding, use the decode method.

This string appears to be Shift-JIS:

>>> x = '\x83h\x83L\x83\x85\x83\x81\x83\x93\x83g\x82\xf0\x96|\x96\xf3\x82\xb5\x82\xdc\x82\xb7'
>>> print x.decode( "shift-jis" )
ドキュメントを翻訳します

0

Passing "mbcs" to decode worked for me for any locale.

Thanks guys for your help.
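
One caveat for the cross-platform part of the question: the mbcs codec is only available on Windows, where it stands for the machine's current ANSI code page. The same bytes can therefore decode differently on Windows machines with different code pages, and on other platforms the codec lookup fails. A small sketch for checking availability before relying on it (has_codec is a made-up helper name):

```python
import codecs


def has_codec(name):
    """Return True if this Python installation provides the named codec."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


print(has_codec('shift_jis'))  # a standard codec shipped with Python
print(has_codec('mbcs'))       # depends on the platform
```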
