26

How can I read a Numpy array from a string? Take a string like:

"[[ 0.5544  0.4456], [ 0.8811  0.1189]]"

and convert it to an array:

a = from_string("[[ 0.5544  0.4456], [ 0.8811  0.1189]]")

where a becomes the object: np.array([[0.5544, 0.4456], [0.8811, 0.1189]]).

I'm looking for a very simple interface. A way to convert 2D arrays (of floats) to a string and then a way to read them back to reconstruct the array:

arr_to_string(array([[0.5544, 0.4456], [0.8811, 0.1189]])) should return "[[ 0.5544 0.4456], [ 0.8811 0.1189]]".

string_to_arr("[[ 0.5544 0.4456], [ 0.8811 0.1189]]") should return the object array([[0.5544, 0.4456], [0.8811, 0.1189]]).

Ideally arr_to_string would have a precision parameter that controlled the precision of floating points converted to strings, so that you wouldn't get entries like 0.4444444999999999999999999.

There's nothing I can find in the NumPy docs that does this both ways. np.save lets you make a string but then there's no way to load it back in (np.load only works for files).

5
  • json.loads and json.dumps might be of use Commented Feb 24, 2016 at 21:34
  • I take that back, I didn't see the missing commas in the arrays... Commented Feb 24, 2016 at 21:38
  • i'm basically looking for the inverse of np.array_str (docs.scipy.org/doc/numpy-1.10.1/reference/generated/…) but i can't find it Commented Feb 24, 2016 at 21:40
  • Is it possible for you to save the shape and just save the flattened array? Because if you can do that, you can easily use the existing methods. Just reshape it when you are ready to reconstitute. Also, are you sending the array to string in order to serialize? Does it have to be human readable? Commented Feb 24, 2016 at 22:46
  • Have you tried pickle? Commented Feb 24, 2016 at 23:31

5 Answers

26

The challenge is to save not only the data buffer, but also the shape and dtype. np.fromstring reads the data buffer, but only as a 1d array; you have to get the dtype and shape from elsewhere.

In [184]: a=np.arange(12).reshape(3,4)

In [185]: np.fromstring(a.tostring(),int)
Out[185]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [186]: np.fromstring(a.tostring(),a.dtype).reshape(a.shape)
Out[186]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
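
In recent NumPy releases, tostring and the binary mode of np.fromstring are deprecated in favor of tobytes and np.frombuffer. A minimal sketch of the same round trip with those calls, assuming the dtype and shape are carried alongside the bytes (the variable names here are illustrative only):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Serialize: raw bytes plus the metadata needed to rebuild the array.
payload = a.tobytes()
dtype, shape = a.dtype, a.shape

# Deserialize: frombuffer gives a read-only 1-D view of the bytes,
# so reshape it and copy to get an independent, writable array.
b = np.frombuffer(payload, dtype=dtype).reshape(shape).copy()
```

The copy is optional; without it the result shares memory with the bytes object and cannot be modified in place.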

A time-honored mechanism for saving Python objects is pickle, and numpy arrays can be pickled:

In [169]: import pickle

In [170]: a=np.arange(12).reshape(3,4)

In [171]: s=pickle.dumps(a*2)

In [172]: s
Out[172]: "cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I3\nI4\ntp6\ncnumpy\ndtype\np7\n(S'i4'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x06\\x00\\x00\\x00\\x08\\x00\\x00\\x00\\n\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x0e\\x00\\x00\\x00\\x10\\x00\\x00\\x00\\x12\\x00\\x00\\x00\\x14\\x00\\x00\\x00\\x16\\x00\\x00\\x00'\np13\ntp14\nb."

In [173]: pickle.loads(s)
Out[173]: 
array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

Older numpy versions also have a function that can read the pickle string (np.loads has since been deprecated in favor of pickle.loads):

In [181]: np.loads(s)
Out[181]: 
array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])
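
On Python 3, pickle.dumps returns bytes rather than a str. If the result has to travel as text, one option is to wrap it in base64. This is only a sketch of that idea, not part of the original answer, and unpickling should only ever be done on data from a trusted source:

```python
import base64
import pickle

import numpy as np

a = np.arange(12).reshape(3, 4)

# pickle.dumps returns bytes on Python 3; base64 turns them into ASCII text.
text = base64.b64encode(pickle.dumps(a * 2)).decode("ascii")

# Reverse the two steps to get the array back.
# Only unpickle data that comes from a source you trust.
restored = pickle.loads(base64.b64decode(text))
```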

You mentioned that np.save can produce a string but that np.load can't read it back. A way around that is to step further into the code and use np.lib.npyio.format, which reads and writes the same format on any file-like object.

In [174]: import StringIO

In [175]: S=StringIO.StringIO()  # a file like string buffer

In [176]: np.lib.npyio.format.write_array(S,a*3.3)

In [177]: S.seek(0)   # rewind the string

In [178]: np.lib.npyio.format.read_array(S)
Out[178]: 
array([[  0. ,   3.3,   6.6,   9.9],
       [ 13.2,  16.5,  19.8,  23.1],
       [ 26.4,  29.7,  33. ,  36.3]])

The save string has a header with dtype and shape info:

In [179]: S.seek(0)

In [180]: S.readlines()
Out[180]: 
["\x93NUMPY\x01\x00F\x00{'descr': '<f8', 'fortran_order': False, 'shape': (3, 4), }          \n",
 '\x00\x00\x00\x00\x00\x00\x00\x00ffffff\n',
 '@ffffff\x1a@\xcc\xcc\xcc\xcc\xcc\xcc#@ffffff*@\x00\x00\x00\x00\x00\x800@\xcc\xcc\xcc\xcc\xcc\xcc3@\x99\x99\x99\x99\x99\x197@ffffff:@33333\xb3=@\x00\x00\x00\x00\x00\x80@@fffff&B@']
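
The session above is from Python 2. On Python 3, a similar result can be had with an in-memory io.BytesIO buffer, since np.save and np.load both accept file-like objects. A minimal sketch, with illustrative variable names:

```python
import io

import numpy as np

a = np.arange(12).reshape(3, 4)

# Write the array, header included, into an in-memory binary buffer.
buf = io.BytesIO()
np.save(buf, a * 3.3)
data = buf.getvalue()  # bytes: header (dtype, shape) followed by the data

# Load it back from a fresh buffer wrapping the same bytes.
restored = np.load(io.BytesIO(data))
```

Because the header carries dtype and shape, nothing else needs to be stored alongside the bytes.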

If you want a human readable string, you might try json.

In [196]: import json

In [197]: js=json.dumps(a.tolist())

In [198]: js
Out[198]: '[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]'

In [199]: np.array(json.loads(js))
Out[199]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Going to/from the list representation of the array is the most obvious use of json. Someone may have written a more elaborate json representation of arrays.
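
As one possible version of a more elaborate representation, the dtype and shape could be stored next to the flattened data. The function names array_to_json and json_to_array below are hypothetical, and the sketch assumes a plain numeric dtype:

```python
import json

import numpy as np


def array_to_json(arr):
    """Encode dtype, shape and flattened data as a JSON object (hypothetical helper)."""
    return json.dumps({
        "dtype": str(arr.dtype),
        "shape": arr.shape,
        "data": arr.ravel().tolist(),
    })


def json_to_array(text):
    """Rebuild the array from the JSON produced by array_to_json."""
    obj = json.loads(text)
    return np.array(obj["data"], dtype=obj["dtype"]).reshape(obj["shape"])


a = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
restored = json_to_array(array_to_json(a))
```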

You could also go the csv format route - there have been lots of questions about reading/writing csv arrays.
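
A minimal sketch of the csv route, using np.savetxt and np.loadtxt on in-memory text buffers. The helper names arr_to_csv and csv_to_arr are made up for illustration; the fmt argument gives the precision control asked for in the question:

```python
import io

import numpy as np


def arr_to_csv(arr, precision=4):
    """Write a 2D float array as comma-separated text (hypothetical helper)."""
    buf = io.StringIO()
    np.savetxt(buf, arr, fmt=f"%.{precision}f", delimiter=",")
    return buf.getvalue()


def csv_to_arr(text):
    """Read the text back; ndmin=2 keeps a single row two-dimensional."""
    return np.loadtxt(io.StringIO(text), delimiter=",", ndmin=2)


a = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
text = arr_to_csv(a)
restored = csv_to_arr(text)
```

The dtype is not stored in this format, so the values come back as floats.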


'[[ 0.5544  0.4456], [ 0.8811  0.1189]]'

is a poor string representation for this purpose. It looks a lot like the str() of an array, but with , instead of \n between the rows. There isn't a clean way of parsing the nested [], and the missing delimiter between the numbers is a pain. If it used , consistently, json could convert it to a list.

np.matrix accepts a MATLAB-like string:

In [207]: np.matrix(' 0.5544,  0.4456;0.8811,  0.1189')
Out[207]: 
matrix([[ 0.5544,  0.4456],
        [ 0.8811,  0.1189]])

In [208]: str(np.matrix(' 0.5544,  0.4456;0.8811,  0.1189'))
Out[208]: '[[ 0.5544  0.4456]\n [ 0.8811  0.1189]]'

1 Comment

What a sophisticated and complete answer! Pickle is the best choice here. I also needed to transfer quite large 2dim float arrays via AMQP and pickle did the job (even without json). Thanks a lot!
11

I'm not sure there's an easy way to do this if you don't have commas between the numbers in your inner lists, but if you do, then you can use ast.literal_eval:

import ast
import numpy as np
s = '[[ 0.5544,  0.4456], [ 0.8811,  0.1189]]'
np.array(ast.literal_eval(s))

array([[ 0.5544,  0.4456],
       [ 0.8811,  0.1189]])

EDIT: I haven't tested it very much, but you could use re to insert commas where you need them:

import re
s1 = '[[ 0.5544  0.4456], [ 0.8811 -0.1189]]'
# Replace spaces between numbers with commas:
s2 = re.sub(r'(\d) +(-|\d)', r'\1,\2', s1)
s2
'[[ 0.5544,0.4456], [ 0.8811,-0.1189]]'

and then hand on to ast.literal_eval:

np.array(ast.literal_eval(s2))
array([[ 0.5544,  0.4456],
       [ 0.8811, -0.1189]])

(you need to be careful to match spaces between digits but also spaces between a digit and a minus sign).
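
The pattern above needs a digit right before the space, so entries that NumPy prints with a trailing dot (such as 1.) would not get a comma. A slightly more tolerant sketch uses lookarounds instead; spaced_to_array is a hypothetical name, and the rows themselves are still assumed to be separated by commas:

```python
import ast
import re

import numpy as np


def spaced_to_array(s):
    """Parse a nested-list string whose numbers are separated by spaces (hypothetical helper)."""
    # Insert a comma wherever whitespace sits between two numbers.
    # The lookbehind accepts a digit or a trailing dot ("1."); the
    # lookahead accepts a sign, a digit, or a leading dot (".5").
    with_commas = re.sub(r"(?<=[\d.])\s+(?=[-+\d.])", ", ", s)
    return np.array(ast.literal_eval(with_commas))


a = spaced_to_array("[[ 0.5544  0.4456], [ 0.8811 -0.1189]]")
b = spaced_to_array("[[ 1.  2.], [ 3. -4.]]")
```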

2 Comments

i don't have commas between the numbers, only spaces
@mvd You might try my edit, but I haven't tested it thoroughly.
10

Forward to string:

import numpy as np
def array2str(arr, precision=None):
    s=np.array_str(arr, precision=precision)
    return s.replace('\n', ',')

Backward to array:

import re
import ast
import numpy as np
def str2array(s):
    # Remove space after [
    s=re.sub(r'\[ +', '[', s.strip())
    # Replace commas and spaces
    s=re.sub(r'[,\s]+', ', ', s)
    return np.array(ast.literal_eval(s))

If you use repr() to convert array to string, the conversion will be trivial.
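
A short round trip through the two functions above, repeated here so the example runs on its own. The precision argument rounds the printed values, so what comes back is the rounded array. Note that np.array_str abbreviates large arrays with "...", which this parser does not handle:

```python
import ast
import re

import numpy as np


def array2str(arr, precision=None):
    # Same as the forward function above: join the printed rows with commas.
    s = np.array_str(arr, precision=precision)
    return s.replace("\n", ",")


def str2array(s):
    # Same as the backward function above, using raw-string patterns.
    s = re.sub(r"\[ +", "[", s.strip())
    s = re.sub(r"[,\s]+", ", ", s)
    return np.array(ast.literal_eval(s))


a = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
text = array2str(a, precision=2)  # single-line string, values rounded to 2 places
back = str2array(text)
```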

2 Comments

This answer works nicely, as it can be used with configparser, which means comments can be included in the text file. See: stackoverflow.com/questions/30691797/…
The \s in [,\s]+ in str2array matches any whitespace, not only spaces, so if your array is [1 2]\n[3 4], it works for such cases as well.
2

In my case I found the following command helpful for dumping:

string = str(array.tolist())

And for reloading:

array = np.array( eval(string) )

This should work for any dimensionality of numpy array.
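
Since eval executes whatever the string contains, a more cautious variant of the same idea is to parse with ast.literal_eval, which only accepts Python literals. A minimal sketch; values such as nan or inf do not survive this round trip:

```python
import ast

import numpy as np

a = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])

# Dump: nested Python lists printed as text.
text = str(a.tolist())

# Reload: literal_eval parses lists and numbers but will not run code.
restored = np.array(ast.literal_eval(text))
```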


1

numpy.fromstring() allows you to easily create 1D arrays from a string. Here's a simple function to create a 2D numpy array from a string:

import numpy as np

def str2np(strArray):

    lItems = []
    width = None
    for line in strArray.split("\n"):
        lParts = line.split()
        n = len(lParts)
        if n==0:
            continue
        if width is None:
            width = n
        else:
            assert n == width, "invalid array spec"
        lItems.append([float(part) for part in lParts])
    return np.array(lItems)

Usage:

X = str2np("""
    -2  2
    -1  3
     0  1
     1  1
     2 -1
     """)
print(f"X = {X}")

Output:

X = [[-2.  2.]
 [-1.  3.]
 [ 0.  1.]
 [ 1.  1.]
 [ 2. -1.]]
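
For whitespace-separated rows like these, np.loadtxt may be enough on its own, since it accepts a list of lines and raises an error when the rows have different lengths. A sketch under that assumption:

```python
import numpy as np

text = """
    -2  2
    -1  3
     0  1
     1  1
     2 -1
     """

# Strip the surrounding blank lines, then hand the rows to loadtxt.
# ndmin=2 keeps the result two-dimensional even for a single row.
X = np.loadtxt(text.strip().splitlines(), ndmin=2)
```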
