
I'm using OpenGL and need to pass an array of bytes to a function:

glCallLists(len('text'), GL_UNSIGNED_BYTE, 'text');

This works fine, but I need to pass Unicode text. I think it should work like this:

text = u'unicode text'
glCallLists(len(text), GL_UNSIGNED_SHORT, convert_to_array_of_words(text));

Here I use GL_UNSIGNED_SHORT, which says that I'm passing an array in which each element takes 2 bytes, and I somehow convert the Unicode text to an array of 16-bit words.

So, how can I convert a Unicode string to a "raw" array of character codes?

  • I don't think any conversion will be necessary. Unicode text should already be an array of unsigned shorts. Commented Feb 12, 2010 at 5:17
  • @John: Depends on whether the library is built to use UCS-2 or UCS-4. Commented Feb 12, 2010 at 5:28
  • Yes, probably, but I get this error when trying to pass unicode string: ctypes.ArgumentError: argument 3: <type 'exceptions.TypeError'>: No array-type handler for type <type 'unicode'> (value: u'\u0439') registered Commented Feb 12, 2010 at 5:29
  • @Ignacio: How is a string literal in his code a library issue? Do you mean the OpenGL library? Commented Feb 12, 2010 at 5:32
  • @John: No, I mean the Python library. Commented Feb 12, 2010 at 5:32

1 Answer


The UTF encoding that takes up 2 bytes per character is UTF-16:

# -*- coding: utf-8 -*-
# Python 2: big-endian, then little-endian UTF-16, both without a BOM
print repr(u'あいうえお'.encode('utf-16be'))
print repr(u'あいうえお'.encode('utf-16le'))
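
To get from those bytes to something you can hand to a C-style API, one option is a ctypes array of unsigned shorts. This is a minimal sketch, assuming the consumer expects native byte order. `to_utf16_code_units` is a hypothetical helper name, not part of any library, and I haven't confirmed that every PyOpenGL version accepts a ctypes array as the third argument of glCallLists:

```python
import ctypes
import sys


def to_utf16_code_units(text):
    """Return the UTF-16 code units of `text` as a ctypes array of unsigned shorts.

    Hypothetical helper for illustration; not part of PyOpenGL or the stdlib.
    """
    # Match the machine's byte order so each 2-byte unit reads back
    # as the intended number.
    codec = 'utf-16-le' if sys.byteorder == 'little' else 'utf-16-be'
    raw = text.encode(codec)
    count = len(raw) // 2  # one unsigned short per 2 bytes
    return (ctypes.c_ushort * count).from_buffer_copy(raw)


units = to_utf16_code_units(u'\u0439\u3042')
print([hex(u) for u in units])  # ['0x439', '0x3042']
```

If you pass this to glCallLists, the count should probably be len(units) rather than len(text), since the two can differ.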

2 Comments

Yes they can. However, it was inaccurate for me to say that it uses 2 bytes per character. Some will take up 4 bytes, being composed of a "surrogate pair".
@Mike: I think you mean to say not all code points can be represented in UCS-2.
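
The surrogate-pair point is easy to check. This is a small sketch; `utf16_byte_length` is a hypothetical name used only for illustration:

```python
def utf16_byte_length(text):
    """Number of bytes `text` occupies when encoded as UTF-16 without a BOM."""
    return len(text.encode('utf-16-le'))


# A character inside the Basic Multilingual Plane takes one 2-byte unit.
print(utf16_byte_length(u'\u3042'))      # 2
# A character outside it needs a surrogate pair, i.e. two 2-byte units.
print(utf16_byte_length(u'\U0001F600'))  # 4
```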
