Convert a numpy.ndarray to a string (or bytes) and convert it back to a numpy.ndarray

Question

I'm having a little trouble converting a numpy.ndarray to a string. I've already done the conversion like this:

randomArray.tostring()

It works, but I'm wondering if I can transform it back to a numpy.ndarray.

What's the best way to do this?

I'm using numpy 1.8.1

Context: The objective is to send the numpy.ndarray as a message in rabbitmq (pika library)

You might find this answer useful: stackoverflow.com/questions/5387208/…
– Singularity, Commented May 11, 2015 at 12:30
Sadly the tostring() method returns bytes and I don't know how to convert it even with this solution.
– Ampo, Commented May 11, 2015 at 12:33
Note that .tostring() is deprecated in NumPy 1.19, with the preferred spelling being .tobytes(). The two otherwise have identical behavior.
– Eric, Commented Apr 1, 2020 at 9:33

Augustin · Accepted Answer · 2020-07-10 15:47:53Z

55

You can use the fromstring() method for this:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
ts = arr.tostring()
print(np.fromstring(ts, dtype=int))
# prints: [1 2 3 4 5 6]

Sorry for the short answer, not enough points for commenting. Remember to state the data types or you'll end up in a world of pain.

Note on fromstring from numpy 1.14 onwards:

sep : str, optional

The string separating numbers in the data; extra whitespace between elements is also ignored.

Deprecated since version 1.14: Passing sep='', the default, is deprecated since it will trigger the deprecated binary mode of this function. This mode interprets string as binary bytes, rather than ASCII text with decimal numbers, an operation which is better spelt frombuffer(string, dtype, count). If string contains unicode text, the binary mode of fromstring will first encode it into bytes using either utf-8 (python 3) or the default encoding (python 2), neither of which produce sane results.
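A minimal sketch of the replacement that the deprecation note points to, assuming a reasonably recent NumPy: tobytes() produces the raw buffer and np.frombuffer() reads it back. The dtype and shape are passed explicitly here (an assumption about what the receiver knows), because the raw bytes carry neither.

```python
import numpy as np

original = np.arange(6, dtype=np.int32).reshape(2, 3)
raw = original.tobytes()  # raw C-order bytes; no dtype or shape is stored

# The receiver must already know the dtype and shape (assumed here).
restored = np.frombuffer(raw, dtype=np.int32).reshape(2, 3)

print(len(raw))                            # 24 (6 elements * 4 bytes each)
print(np.array_equal(original, restored))  # True
```

Reading the bytes with the wrong dtype does not necessarily raise an error; it can silently yield different numbers, which is why stating the dtype matters.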

edited Jul 10, 2020 at 15:47

Augustin

answered May 11, 2015 at 12:51

ajsp

9 Comments

Julien Spronck Over a year ago

I did not know about fromstring, nice! However, it does not seem to work with multi-dimensional arrays (it returns a flat version of the multi-dimensional array). I guess you can reshape the array afterwards if you know the dimensions.

Ampo Over a year ago

This may work. The weird thing is that my tostring() method returns weird things (bytes?), and fromstring() isn't working perfectly.

ajsp Over a year ago

@Ampo you can use repr(ts) to view the binary, but you will have to convert it using np.fromstring(ts,dtype=int), remember to use the correct data type. Are you using floats or integers? Post the type of array you are trying to send.

ajsp Over a year ago

Frankly I would not serialize with numpy, my advice is to dump the lot into JSON and parse it at the other end...no headaches.

Scott Over a year ago

np.fromstring() is deprecated, use np.frombuffer() instead

simleo · Accepted Answer · 2015-05-11 13:49:08Z

28

If you use tostring you lose information on both shape and data type:

>>> import numpy as np
>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> s = a.tostring()
>>> aa = np.fromstring(s)
>>> aa
array([  0.00000000e+000,   4.94065646e-324,   9.88131292e-324,
         1.48219694e-323,   1.97626258e-323,   2.47032823e-323,
         2.96439388e-323,   3.45845952e-323,   3.95252517e-323,
         4.44659081e-323,   4.94065646e-323,   5.43472210e-323])
>>> aa = np.fromstring(s, dtype=int)
>>> aa
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>> aa = np.fromstring(s, dtype=int).reshape(3, 4)
>>> aa
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

This means you have to send the metadata (shape and data type) along with the data to the recipient. To exchange self-describing objects, try cPickle (on Python 3, use the pickle module instead):

>>> import cPickle
>>> s = cPickle.dumps(a)
>>> cPickle.loads(s)
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
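As an alternative that also keeps shape and dtype together with the data, here is a sketch using NumPy's own .npy format written to an in-memory buffer. The helper names array_to_npy_bytes and npy_bytes_to_array are hypothetical, chosen only for this illustration.

```python
import io
import numpy as np

def array_to_npy_bytes(arr):
    """Serialize an array to .npy bytes in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    np.save(buffer, arr, allow_pickle=False)
    return buffer.getvalue()

def npy_bytes_to_array(data):
    """Rebuild the array; dtype and shape are read from the .npy header."""
    return np.load(io.BytesIO(data), allow_pickle=False)

a = np.arange(12).reshape(3, 4)
payload = array_to_npy_bytes(a)   # bytes, suitable as a message body
b = npy_bytes_to_array(payload)
print(b.shape, b.dtype == a.dtype, np.array_equal(a, b))
```

Unpickling data from an untrusted sender can execute arbitrary code, so passing allow_pickle=False is a reasonable default when the payload crosses a network boundary.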

answered May 11, 2015 at 13:49

simleo

1 Comment

mertyildiran Over a year ago

dtype important: np.uint8 / np.uint16

Julien Spronck · Accepted Answer · 2015-05-11 13:31:19Z

13

Imagine you have a numpy array of integers (it works with other types but you need some slight modification). You can do this:

a = np.array([0, 3, 5])
a_str = ','.join(str(x) for x in a) # '0,3,5'
a2 = np.array([int(x) for x in a_str.split(',')]) # np.array([0, 3, 5])

If you have an array of float, be sure to replace int by float in the last line.

You can also use the __repr__() method, which has the advantage of working for multi-dimensional arrays:

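import sys
import numpy
from numpy import array
# Print every element instead of "..."; newer NumPy rejects threshold=numpy.nan
numpy.set_printoptions(threshold=sys.maxsize)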
a = array([[0,3,5],[2,3,4]])
a_str = a.__repr__() # 'array([[0, 3, 5],\n       [2, 3, 4]])'
a2 = eval(a_str) # array([[0, 3, 5],
                 #        [2, 3, 4]])
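Calling eval() on a string received from elsewhere runs whatever code that string contains. As a text-based sketch that avoids eval(), the array can be sent as JSON together with its dtype and shape. The function names array_to_json and json_to_array are hypothetical, and this assumes a plain numeric dtype.

```python
import json
import numpy as np

def array_to_json(arr):
    """Encode dtype, shape and flattened values as a JSON string."""
    return json.dumps({
        "dtype": str(arr.dtype),
        "shape": list(arr.shape),
        "data": arr.ravel().tolist(),
    })

def json_to_array(text):
    """Decode the JSON string back into an array of the original shape."""
    obj = json.loads(text)
    return np.array(obj["data"], dtype=obj["dtype"]).reshape(obj["shape"])

a = np.array([[0, 3, 5], [2, 3, 4]])
a_str = array_to_json(a)
a2 = json_to_array(a_str)
print(np.array_equal(a, a2), a2.shape)
```

The text form is much larger than the raw bytes for big arrays, so this fits small payloads better than images.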

edited May 11, 2015 at 13:31

answered May 11, 2015 at 12:40

Julien Spronck

3 Comments

Ampo Over a year ago

Since I use a 3D-Array (image) the __repr__() method should work but it doesn't. The array is really big (1000000+ values in it) I end up with 1000 values after converting it with __repr__() and eval() crashes(?)

Julien Spronck Over a year ago

@Ampo yes, __repr__() crashes with larger arrays because of the representation of large numpy arrays (large arrays have ... instead of full arrays). You can change that behaviour (with set_printoptions) ... I just edited my answer, see if that works better.

Anurag A S Over a year ago

It may be helpful to add import numpy to your second code, since it gives an error (numpy.set_printoptions) for those who don't know.

shantanu pathak · Accepted Answer · 2021-05-11 19:15:00Z

4

I know I am late, but here is the correct way of doing it: using base64. This technique converts the array to a string.

import base64
import numpy as np
random_array = np.random.randn(32,32)
string_repr = base64.binascii.b2a_base64(random_array).decode("ascii")
array = np.frombuffer(base64.binascii.a2b_base64(string_repr.encode("ascii"))) 
array = array.reshape(32,32)

For array to string

Convert the binary data to a line of ASCII characters in base64 encoding, then decode it as ASCII to get the string representation.

For string to array

First encode the string as ASCII, then convert the block of base64 data back to binary and pass the binary data to np.frombuffer.
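The snippet above relies on np.frombuffer defaulting to float64, which happens to match np.random.randn. A slightly more explicit sketch of the same base64 idea, using base64.b64encode and base64.b64decode and passing the dtype and shape by hand (the receiver is assumed to know them), could look like this:

```python
import base64
import numpy as np

random_array = np.random.randn(32, 32)

# Array -> ASCII string
encoded = base64.b64encode(random_array.tobytes()).decode("ascii")

# String -> array; dtype and shape are supplied explicitly
decoded = np.frombuffer(
    base64.b64decode(encoded), dtype=random_array.dtype
).reshape(random_array.shape)

print(type(encoded).__name__, np.array_equal(random_array, decoded))
```

Base64 output is roughly a third larger than the raw bytes, which is the price of getting a plain ASCII string.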

edited May 11, 2021 at 19:15

shantanu pathak

answered Apr 1, 2020 at 9:23

aman5319

1 Comment

shantanu pathak Over a year ago

This works. Only thing i needed to add was to do reshape(32,32) at the end

Jadiel de Armas · Accepted Answer · 2020-02-28 19:11:29Z

2

This is a fast way to encode the array, the array shape and the array dtype:

import numpy as np

def numpy_to_bytes(arr: np.ndarray) -> bytearray:
    arr_dtype = bytearray(str(arr.dtype), 'utf-8')
    arr_shape = bytearray(','.join([str(a) for a in arr.shape]), 'utf-8')
    sep = bytearray('|', 'utf-8')
    arr_bytes = arr.ravel().tobytes()
    to_return = arr_dtype + sep + arr_shape + sep + arr_bytes
    return to_return

def bytes_to_numpy(serialized_arr: bytearray) -> np.ndarray:
    sep = '|'.encode('utf-8')
    i_0 = serialized_arr.find(sep)
    i_1 = serialized_arr.find(sep, i_0 + 1)
    arr_dtype = serialized_arr[:i_0].decode('utf-8')
    arr_shape = tuple([int(a) for a in serialized_arr[i_0 + 1:i_1].decode('utf-8').split(',')])
    arr_str = serialized_arr[i_1 + 1:]
    arr = np.frombuffer(arr_str, dtype = arr_dtype).reshape(arr_shape)
    return arr

To use the functions:

a = np.ones((23, 23), dtype = 'int')
a_b = numpy_to_bytes(a)
a1 = bytes_to_numpy(a_b)
np.array_equal(a, a1) and a.shape == a1.shape and a.dtype == a1.dtype
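A variant of the same idea, sketched with a length-prefixed header so that parsing does not depend on the '|' separator never appearing in the dtype text. The names pack_array and unpack_array, and the header layout itself, are assumptions made for this illustration rather than any standard format.

```python
import json
import struct
import numpy as np

def pack_array(arr):
    """4-byte header length + JSON header (dtype, shape) + raw data."""
    header = json.dumps({"dtype": arr.dtype.str, "shape": arr.shape}).encode("utf-8")
    return struct.pack("<I", len(header)) + header + arr.tobytes()

def unpack_array(blob):
    """Read the header length, parse the header, then view the remaining bytes."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    data = np.frombuffer(blob, dtype=header["dtype"], offset=4 + header_len)
    return data.reshape(header["shape"])

a = np.ones((23, 23), dtype="int")
a1 = unpack_array(pack_array(a))
print(np.array_equal(a, a1) and a.shape == a1.shape and a.dtype == a1.dtype)
```

Using arr.dtype.str also records the byte order, which matters if sender and receiver run on different architectures.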

edited Feb 28, 2020 at 19:11

answered Feb 28, 2020 at 19:06

Jadiel de Armas

1 Comment

Aref Over a year ago

Thanks for the solution. Just a small fix: In numpy_to_bytes the output type should be "bytearray" and the bytes_to_numpy input type should be "bytearray" as well.

SachaDee · Accepted Answer · 2022-04-30 01:46:59Z

2

The right answer for numpy version >1.9

arr = np.array([1, 2, 3, 4, 5, 6])
ts = arr.tobytes()
#Reverse to array
arr = np.frombuffer(ts, dtype=arr.dtype)
print(arr)

tostring() is deprecated

You don't need any external library (except numpy), and there is no faster method to retrieve the values!
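One detail worth knowing when using this approach: np.frombuffer does not copy, so an array built from an immutable bytes object is read-only. A short sketch showing how to check that and how to get a writable copy:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
ts = arr.tobytes()

view = np.frombuffer(ts, dtype=arr.dtype)
print(view.flags.writeable)   # False: the array is a view onto the bytes object

editable = view.copy()        # independent, writable copy
editable[0] = 100
print(editable[0], view[0])   # 100 1
```

Skipping the copy is what makes the round trip fast; copy only if the received array has to be modified in place.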

edited Apr 30, 2022 at 1:46

answered Apr 30, 2022 at 1:38

SachaDee

Sudheer Raja · Accepted Answer · 2019-07-26 01:40:51Z

This is a slightly improved version of ajsp's answer, using XML-RPC.

On the server side, convert the numpy data to a string using the '.tostring()' method. This encodes the numpy ndarray as a bytes string. On the client side, decode the received data using the '.fromstring()' method. I wrote two simple functions for this. Hope this is helpful.

ndarray2str -- Converts numpy ndarray to bytes string.
str2ndarray -- Converts binary str back to numpy ndarray.

    def ndarray2str(a):
        # Convert the numpy array to string 
        a = a.tostring()

        return a

On the receiver side, the data is received as a 'xmlrpc.client.Binary' object. You need to access the data using '.data'.

    def str2ndarray(a, new_shape):
        # Specify your data type; mine is numpy float64, so I am specifying np.float64.
        # new_shape is the shape of the original array, which the receiver must know.
        a = np.fromstring(a.data, dtype=np.float64)
        a = np.reshape(a, new_shape)

        return a

Note: The only problem with this approach is that XML-RPC is very slow when sending large numpy arrays. It took me around 4 seconds to send and receive a numpy array of shape (10, 500, 500, 3).

I am using python 3.7.4.

heilala · Accepted Answer · 2023-04-03 16:25:58Z

0

I needed it to save the ndarray in an SQLite table.

My solution was to dump the array and convert it into hexadecimal:

array_5_str = array_5.dumps().hex()  # to save it in the table

To convert it into an ndarray again (this requires import pickle):

array_5_from_str = pickle.loads(bytes.fromhex(array_5_str))

You can compare the two ndarray with:

comparison = array_5 == array_5_from_str
equal_arrays = comparison.all()
print(equal_arrays)
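For the SQLite use case, a sketch of an alternative that stores the array as a BLOB in NumPy's .npy format instead of a hex-encoded pickle. The table name arrays and its columns are hypothetical, and an in-memory database is used only to keep the example self-contained.

```python
import io
import sqlite3
import numpy as np

array_5 = np.arange(10, dtype=np.float64).reshape(2, 5)

# Serialize to .npy bytes; the header records dtype and shape
buffer = io.BytesIO()
np.save(buffer, array_5, allow_pickle=False)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE arrays (name TEXT PRIMARY KEY, data BLOB)")
conn.execute("INSERT INTO arrays VALUES (?, ?)", ("array_5", buffer.getvalue()))

# Read the BLOB back and rebuild the array
(blob,) = conn.execute(
    "SELECT data FROM arrays WHERE name = ?", ("array_5",)
).fetchone()
array_5_from_blob = np.load(io.BytesIO(blob), allow_pickle=False)
conn.close()

print(np.array_equal(array_5, array_5_from_blob))
```

A BLOB column takes about half the space of the hex string, and loading it does not involve unpickling.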

edited Apr 3, 2023 at 16:25

heilala

answered Feb 27, 2023 at 6:01

Leo Mag

1 Comment

xlmaster Over a year ago

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.

Max Kleiner · Accepted Answer · 2018-07-12 15:34:54Z

-4

Imagine you have a numpy array of text like in a messenger

 >>> stex[40]
 array(['Know the famous thing ...

and you want to get statistics from the corpus (text col=11) you first must get the values from dataframe (df5) and then join all records together in one single corpus:

 >>> stex = (df5.ix[0:,[11]]).values
 >>> a_str = ','.join(str(x) for x in stex)
 >>> a_str = a_str.split()
 >>> fd2 = nltk.FreqDist(a_str)
 >>> fd2.most_common(50)

answered Jul 12, 2018 at 15:34

Max Kleiner

2 Comments

user2357112 Over a year ago

This does not answer the question that was asked.

Max Kleiner Over a year ago

think not cause stex is an numpy array type(stex) <class 'numpy.ndarray'> then I convert it to a_str and after fd2 back to save the freqdist() in an array
