I'm sure this is terribly wrong, and I'm having a couple of problems. I've written out an array of WIN32_FIND_DATAW structures to disk, one after another, and I'd like to consume and parse them in my Python script.
The code I'm currently using is:
>>> fp = open('findData', 'r').read()
>>> data = ctypes.cast(fp, ctypes.POINTER(wintypes.WIN32_FIND_DATAW))
>>> print str(data[0].cFileName)
The first problem is that the third line doesn't print a nice string like I would expect. Instead of printing $Recycle.Bin, it raises UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)
This is the result of just printing the data stored there:
>>> data[0].cFileName
u'\U00520024\U00630065\U00630079\U0065006c\U0042002e\U006e0069'
This looks relatively reasonable. $ is ASCII 0x24, R is ASCII 0x52 and so on.
So why can't I print it like a string?
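To make the observation about the repr concrete: each 32-bit element appears to hold two 16-bit UTF-16 code units, low half first. The sketch below splits them by hand. It assumes Python 3, uses the six values copied from the output above, and ignores surrogate pairs; `unpack_utf16_pairs` is a hypothetical helper name, not part of ctypes.

```python
# The six code points from the repr of data[0].cFileName above.
packed = [0x00520024, 0x00630065, 0x00630079,
          0x0065006C, 0x0042002E, 0x006E0069]


def unpack_utf16_pairs(values):
    """Split each 32-bit value into its low and high 16-bit halves."""
    chars = []
    for value in values:
        chars.append(chr(value & 0xFFFF))  # low half comes first in the file
        chars.append(chr(value >> 16))     # high half is the next character
    return "".join(chars)


print(unpack_utf16_pairs(packed))
```

This suggests the bytes on disk are fine and that the character width used when reading them is what differs.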
My second question is that doing:
>>> data[1].cFileName
Gives me ridiculous data. I'm fairly sure I'm not using ctypes.cast correctly. How should I use it to access these structures? To clarify, in C, I'd just point a PWIN32_FIND_DATAW pointer to the beginning of the buffer and access the individual structs in the array using similar code, and I'm trying to do the same in Python.
Update
Doing:
>>> data[0].cFileName.encode('windows-1252')
Yields this error:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-5: character maps to <undefined>
Update
The beginning of the first entry (data[0] up to the first part of cFileName) looks like the following:
user@ubuntu:~/data$ hexdump -C findData | head -n 6
00000000 16 00 00 00 dc 5a 9f d2 31 04 ca 01 ba 81 89 1a |.....Z..1.......|
00000010 81 e2 cd 01 ba 81 89 1a 81 e2 cd 01 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 24 00 52 00 |............$.R.|
00000030 65 00 63 00 79 00 63 00 6c 00 65 00 2e 00 42 00 |e.c.y.c.l.e...B.|
00000040 69 00 6e 00 00 00 00 00 00 00 00 00 00 00 00 00 |i.n.............|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
I can post more data if needed.
[...] rb)? ctypes.wintypes on Linux? Did you create a new wintypes module by copying from the original? A c_wchar is 2 bytes on Windows, but 4 bytes on other platforms. Please show what you're using for WIN32_FIND_DATAW on Linux.
c_char arrays can be annoying because they try to create Python strings instead of just returning the array. So it's stopping at the first null. You'd need to use c_ubyte instead. Then it's bytearray(data[0].cFileName).decode('utf-16le').
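Putting the comments' suggestions together, one possible approach is to describe the record with fixed-width types only, read the file in binary mode, and decode the name bytes as UTF-16LE. The sketch below assumes Python 3, a little-endian host, and a file holding back-to-back 592-byte records as written on Windows. The names `FileTime`, `FindDataW`, `decode_wide`, `parse_records` and `read_find_data` are hypothetical, chosen for illustration.

```python
import ctypes

MAX_PATH = 260  # number of WCHARs in cFileName on Windows


class FileTime(ctypes.Structure):
    """FILETIME: two 32-bit halves of a 64-bit timestamp."""
    _fields_ = [
        ("dwLowDateTime", ctypes.c_uint32),
        ("dwHighDateTime", ctypes.c_uint32),
    ]


class FindDataW(ctypes.Structure):
    """WIN32_FIND_DATAW described with fixed-width types only."""
    _fields_ = [
        ("dwFileAttributes", ctypes.c_uint32),
        ("ftCreationTime", FileTime),
        ("ftLastAccessTime", FileTime),
        ("ftLastWriteTime", FileTime),
        ("nFileSizeHigh", ctypes.c_uint32),
        ("nFileSizeLow", ctypes.c_uint32),
        ("dwReserved0", ctypes.c_uint32),
        ("dwReserved1", ctypes.c_uint32),
        # Raw bytes instead of c_wchar, so each character stays 2 bytes
        # wide no matter how wide wchar_t is on the host platform.
        ("cFileName", ctypes.c_ubyte * (MAX_PATH * 2)),
        ("cAlternateFileName", ctypes.c_ubyte * (14 * 2)),
    ]


def decode_wide(buf):
    """Decode a NUL-terminated UTF-16LE byte buffer into a string."""
    raw = bytes(bytearray(buf))
    # Scan 2-byte code units so that bytes after the terminator,
    # which may be garbage, are never passed to the decoder.
    for i in range(0, len(raw) - 1, 2):
        if raw[i:i + 2] == b"\x00\x00":
            raw = raw[:i]
            break
    return raw.decode("utf-16-le")


def parse_records(blob):
    """Copy each complete record out of a bytes object."""
    size = ctypes.sizeof(FindDataW)
    return [FindDataW.from_buffer_copy(blob, i * size)
            for i in range(len(blob) // size)]


def read_find_data(path):
    """Read every record from a file opened in binary mode."""
    with open(path, "rb") as handle:
        return parse_records(handle.read())
```

With this in place, something like `for rec in read_find_data('findData'): print(decode_wide(rec.cFileName))` would walk the array. Checking that `ctypes.sizeof(FindDataW)` is 592 and that `FindDataW.cFileName.offset` is 44 (0x2c, where the name starts in the hexdump) is a quick way to confirm the layout matches the file.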