I am trying to convert a string type to Unicode in Python. I want it to work for any non-english string, for example Japanese, Chinese or Spanish.
For example, japanese_var has some japanese characters [ドキュメントを翻訳します].
Printing it would give,
'\x83h\x83L\x83\x85\x83\x81\x83\x93\x83g\x82\xf0\x96|\x96\xf3\x82\xb5\x82\xdc\x82\xb7'
Checking its type,
type(japanese_var)
<type 'str'>
How can I convert it to type 'unicode'?
Should i use japanese_var.decode('mbcs')? What could be the consequences of using this code as i will be using it on different OS platforms & different foreign Locale?
I am using python 2.5.4
I am reading the parameter which can be any non-english string of a file from its properties.
udirectly in front of it, though you may need to be careful about source code encoding.)