I am trying to read a string of bytes from a file using NumPy fromfile in Python 3. My goal is to convert the bytes to a normal Python 3 string. For example:

$ echo "1234" > t.txt

Now the file t.txt contains the 4 characters 1234 followed by a newline (5 bytes in total). Then:

import numpy as np

values=np.fromfile('t.txt',dtype='|S1',count=4)
print ("values={}".format(values))
values=np.fromfile('t.txt',dtype='|U1',count=4)
print ("values={}".format(values))

gives:

values=[b'1' b'2' b'3' b'4']
Traceback (most recent call last):
  File "./t.py", line 12, in <module>
    print ("values={}".format(values))
  File "/home/hakon/.pyenv/versions/3.4.2/lib/python3.4/site-packages/numpy/core/numeric.py", line 1715, in array_str
    return array2string(a, max_line_width, precision, suppress_small, ' ', "", str)
  File "/home/hakon/.pyenv/versions/3.4.2/lib/python3.4/site-packages/numpy/core/arrayprint.py", line 454, in array2string
    separator, prefix, formatter=formatter)
  File "/home/hakon/.pyenv/versions/3.4.2/lib/python3.4/site-packages/numpy/core/arrayprint.py", line 328, in _array2string
    _summaryEdgeItems, summary_insert)[:-1]
  File "/home/hakon/.pyenv/versions/3.4.2/lib/python3.4/site-packages/numpy/core/arrayprint.py", line 500, in _formatArray
    word = format_function(a[-1])
UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 0-3: codepoint not in range(0x110000)

I would like to obtain a normal Python 3 string like values='1234'. How can this be done?

  • What if you use dtype='|S4'? Commented Oct 15, 2014 at 14:04

2 Answers

You could use astype to convert the bytes to str:

import numpy as np

values = np.fromfile('t.txt',dtype='|S1',count=4).astype('|U1')
print(values)
# ['1' '2' '3' '4']

print(values.view('|U4'))
# ['1234']

print(values.dtype)
# <U1
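
The result above is still a NumPy array rather than a plain Python str. A minimal sketch of two ways to get the string '1234' the question asks for, assuming the file holds ASCII text; the temporary file path and the variable names are illustrative, not part of the original answer. It also prints the item sizes of the two dtypes: 'S1' is one byte per element while 'U1' is four bytes (UTF-32), which is why reading the raw file directly as 'U1' fails.

```python
import os
import tempfile

import numpy as np

# Recreate the question's file: the text "1234" followed by a newline.
# (The temporary path is an assumption made for this sketch.)
path = os.path.join(tempfile.mkdtemp(), "t.txt")
with open(path, "wb") as f:
    f.write(b"1234\n")

# 'S1' elements are 1 byte each; 'U1' elements are 4 bytes each (UTF-32).
s1_size = np.dtype("S1").itemsize
u1_size = np.dtype("U1").itemsize
print(s1_size, u1_size)

# Read the first 4 bytes as single-byte strings.
raw = np.fromfile(path, dtype="S1", count=4)

# Option 1: take the raw buffer as bytes, then decode it.
text_from_bytes = raw.tobytes().decode("ascii")

# Option 2: convert to unicode elements and join them into one str.
text_from_join = "".join(raw.astype("U1"))

print(text_from_bytes, text_from_join)
```

Both options produce an ordinary str; the first avoids creating an intermediate unicode array.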

1 Comment

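Here is an alternative: np.fromfile('t.txt', dtype='int8', count=4).tobytes().decode() (tobytes() is the current name for the older tostring(), which is deprecated in recent NumPy releases.)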

I know the question explicitly asks for np.fromfile, but why not simply use the built-in file interface directly?

# The with block closes the file automatically, even if reading fails.
with open('t.txt', 'r') as f:
    values = f.read().rstrip('\n')

Note: Python 3 strings are Unicode by default.
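
If only the first 4 bytes are wanted, mirroring count=4 in the question, one variant is to open the file in binary mode and decode explicitly. This is a sketch under the assumption that the content is ASCII; the temporary path is made up for the example.

```python
import os
import tempfile

# Recreate the question's file (hypothetical temporary location).
path = os.path.join(tempfile.mkdtemp(), "t.txt")
with open(path, "wb") as f:
    f.write(b"1234\n")

# Read exactly 4 raw bytes, then decode them into a str.
with open(path, "rb") as f:
    data = f.read(4)

text = data.decode("ascii")
print(text)
```

Reading in binary mode keeps the byte count exact, since text mode counts characters after decoding rather than bytes.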
