I want to use numpy.savez in a loop to save several NumPy arrays multiple times. Here is an example:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

for i in range(3):
    np.savez("file_info", info1 = a, info2 = b)
    print('a => ', a)
    print('b => ', b)
    a = a * 3
    b = b * 2

The output:

a =>  [1 2 3]
b =>  [ 5  6 12]
a =>  [3 6 9]
b =>  [10 12 24]
a =>  [ 9 18 27]
b =>  [20 24 48]

But when I read the saved file:

npzfile = np.load("file_info.npz")
npzfile['info1']

I get only the last array (because the file is overwritten on each iteration of the loop):

array([ 9, 18, 27])

So my question is: how can I save all the NumPy arrays in the same file?

By using different output files. Or, if you want everything in one file, make one big array with one extra outer dimension. Commented Mar 12, 2017 at 15:54

1 Answer

When you save a new file under the same name, it overwrites the old file. Why not move the save out of your for loop and collect the values in one array first?

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

save_info = np.zeros([3, 2, 3]) #array to store all the info
#one dimension for each entry in a, one for as many arrays as you have
#generating info, and one for the number of times you'll loop over it

for i in range(3): #therefore i will be [0, 1, 2]
    save_info[:, 0, i] = a #save from the a array
    save_info[:, 1, i] = b #save from the b array
    a = a * 3
    b = b * 2

np.savez("file_info", info1=save_info) #save outside for loop doesn't overwrite

I can then read information from the file:

>>> import numpy as np
>>> data = np.load("file_info.npz") #load file to data object
>>> data["info1"]
array([[[  1.,   3.,   9.],
        [  5.,  10.,  20.]],

       [[  2.,   6.,  18.],
        [  6.,  12.,  24.]],

       [[  3.,   9.,  27.],
        [ 12.,  24.,  48.]]])
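
To get one particular run back out of that stacked array, you index along the same axes that were used when filling it. A minimal sketch, assuming the layout above (axis 0 = element, axis 1 = which array, axis 2 = loop iteration); the temporary directory and the names stacked, a_second and b_last are only for illustration. Passing dtype=a.dtype to np.zeros is an optional tweak that keeps the values as integers rather than the floats shown above:

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

# Same layout as in the answer: axis 0 = element, axis 1 = which array
# (0 for a, 1 for b), axis 2 = loop iteration.
# dtype=a.dtype keeps integers instead of the default float64.
save_info = np.zeros([3, 2, 3], dtype=a.dtype)

for i in range(3):
    save_info[:, 0, i] = a
    save_info[:, 1, i] = b
    a = a * 3
    b = b * 2

# Hypothetical location: a temporary directory, so nothing existing is overwritten.
path = os.path.join(tempfile.mkdtemp(), "file_info.npz")
np.savez(path, info1=save_info)

with np.load(path) as data:
    stacked = data["info1"]

a_second = stacked[:, 0, 1]  # a as it was on the second pass (i == 1)
b_last = stacked[:, 1, 2]    # b as it was on the third pass (i == 2)

print(a_second)  # [3 6 9]
print(b_last)    # [20 24 48]
```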

Edit: Or, if you want to avoid creating one big array, you could save to a differently named file on each pass through the loop:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

for i in range(3): #therefore i will be [0, 1, 2]
    np.savez("file_info_"+str(i), info1=a, info2=b)
    #will save to "file_info_0.npz" on first run
    #will save to "file_info_1.npz" on second run
    #will save to "file_info_2.npz" on third run

    a = a * 3
    b = b * 2

Edit: You might prefer to make two smaller arrays, one for a and one for b:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

save_a = np.zeros([3, 3]) #array to store all the a runs
save_b = np.zeros([3, 3]) #array to store all the b runs

for i in range(3): #therefore i will be [0, 1, 2]
    save_a[:, i] = a #save from the a array
    save_b[:, i] = b #save from the b array
    a = a * 3
    b = b * 2

np.savez("file_info", info1=save_a, info2=save_b) #save outside for loop doesn't overwrite
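
Another variation on saving outside the loop, if the arrays from different runs don't all have the same shape: np.savez accepts arbitrary keyword names, so each run can get its own key in a single .npz file. This is only a sketch; the key pattern info1_0, info1_1, ... and the temporary path are made up for the example, and every array is still held in memory until the final save:

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

arrays = {}  # maps a key name to the array saved under it
for i in range(3):
    # Hypothetical naming scheme: one key per array per iteration.
    arrays[f"info1_{i}"] = a
    arrays[f"info2_{i}"] = b
    a = a * 3  # creates a new array, so the one stored above is unchanged
    b = b * 2

# Hypothetical location: a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "file_info.npz")
np.savez(path, **arrays)  # one file, six named arrays

with np.load(path) as npzfile:
    keys = sorted(npzfile.files)
    third_a = npzfile["info1_2"]

print(keys)
print(third_a)  # [ 9 18 27]
```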

4 Comments

I was trying to avoid creating an array to store all the info because I'm dealing with HUGE data.
You could save it to different files then, e.g. np.savez("file_info"+str(i), info1=a, info2=b)?
Well, if I do that, I will get... I don't know, maybe 5000 files. I'll see if there is a way around it. Thank you.
NumPy is good at big arrays (I use it to process 5000x3000 colour images all the time), so I wouldn't worry too much unless your data is really, really huge. Good luck with the project.
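
For the huge-data case raised in these comments, one possible way around both the big in-memory array and the thousands of files is to call np.save repeatedly on a single open file handle, then call np.load the same number of times when reading. This is a sketch rather than a tested solution for real data sizes; the file name, the temporary directory and the fixed count of 3 iterations are assumptions for the example, and the reader has to know how many arrays were written and in what order:

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 12])

# Hypothetical location: a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "file_info.npy")

# Write the arrays one after another; only the current a and b are in memory.
with open(path, "wb") as f:
    for i in range(3):
        np.save(f, a)
        np.save(f, b)
        a = a * 3
        b = b * 2

# Read them back in the same order they were written.
loaded = []
with open(path, "rb") as f:
    for i in range(3):
        a_i = np.load(f)  # each call reads the next array in the file
        b_i = np.load(f)
        loaded.append((a_i, b_i))

print(loaded[2][0])  # [ 9 18 27]
print(loaded[2][1])  # [20 24 48]
```

Unlike an .npz file, there are no names here: arrays are identified only by their position in the file.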
