zero padding numpy array

Question

Suppose I have a list that contains lists of unequal length.

a = [ [ 1, 2, 3], [2], [2, 4] ]

What is the best way to obtain a zero-padded numpy array with a regular (rectangular) shape?

zero_a = [ [1, 2, 3], [2, 0, 0], [2, 4, 0] ]

I know I can use list operations like

n = max(map(len, a))
for x in a:
    x.extend([0] * (n - len(x)))  # pads each inner list in place
zero_a = np.array(a)

but I was wondering whether there is an easy numpy way to do this.

@megawac I updated my question. I am trying to find an alternative numpy method.
– Xingzhong, Commented Nov 9, 2013 at 16:38
+1 to the question because I've wanted something like this before myself, and couldn't think of anything clean enough. (I sometimes use pd.DataFrame(a).fillna(0).values, but I've been on a pandas kick for a while. There should really be something numpy-native.)
– DSM, Commented Nov 9, 2013 at 17:01
@alko: true, but the first thing it does is call narray = np.array(array) on the argument, which if it's a list of lists with varying lengths will give us an array with dtype=object and lists as elements. It's good for padding existing ndarrays, but I can't see how to get it to help here.
– DSM, Commented Nov 9, 2013 at 17:15

alko · Accepted Answer · 2013-11-09 18:02:26Z

Because numpy has to know the size of an array before it is initialized, the best solution would be a numpy-based constructor for this case. Sadly, as far as I know, there is none.

Probably not ideal, but a slightly faster solution is to create a numpy array of zeros and fill it with the list values.

import numpy as np
def pad_list(lst):
    # Note: extends the inner lists in place, so the caller's list is modified.
    inner_max_len = max(map(len, lst))
    for x in lst:
        x.extend([0] * (inner_max_len - len(x)))
    return np.array(lst)

def apply_to_zeros(lst, dtype=np.int64):
    inner_max_len = max(map(len, lst))
    result = np.zeros([len(lst), inner_max_len], dtype)
    for i, row in enumerate(lst):
        for j, val in enumerate(row):
            result[i][j] = val
    return result

Test case:

>>> pad_list([[ 1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])

>>> apply_to_zeros([[ 1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])

Performance:

>>> timeit.timeit('from __main__ import pad_list as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.3937079906463623
>>> timeit.timeit('from __main__ import apply_to_zeros as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.1344289779663086
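
The zero-fill idea can also be written without the inner Python loop by using a boolean mask. This is only a sketch under a few assumptions: `pad_with_mask` is a hypothetical helper name (not a numpy function), the outer list is non-empty, and all values fit the chosen dtype. It has not been timed against the functions above.

```python
import numpy as np

def pad_with_mask(lst, fill=0, dtype=np.int64):
    # Hypothetical helper for illustration; not part of numpy.
    lengths = np.array([len(row) for row in lst])
    # mask[i, j] is True where row i actually has a j-th element.
    mask = np.arange(lengths.max()) < lengths[:, None]
    result = np.full(mask.shape, fill, dtype=dtype)
    # Boolean assignment fills the True cells in row-major order,
    # which matches the order of the concatenated rows.
    result[mask] = np.concatenate(lst)
    return result

padded = pad_with_mask([[1, 2, 3], [2], [2, 4]])
```

Unlike pad_list, this variant leaves the input lists unmodified, at the cost of building a temporary mask and a concatenated copy of the values.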

ilmarinen · Accepted Answer · 2013-11-09 18:07:40Z

Not strictly a numpy function, but you could do something like this:

from itertools import izip, izip_longest
import numpy
a=[[1,2,3], [4], [5,6]]
res1 = numpy.array(list(izip(*izip_longest(*a, fillvalue=0))))

or, alternatively:

res2=numpy.array(list(izip_longest(*a, fillvalue=0))).transpose()

If you use Python 3, use the built-in zip and itertools.zip_longest instead.
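
For reference, a minimal Python 3 sketch of the second variant might look like the following. The variable names are illustrative, and `.T` is used as shorthand for `.transpose()`.

```python
from itertools import zip_longest
import numpy as np

a = [[1, 2, 3], [4], [5, 6]]
# zip_longest yields the columns, padded with 0 where a row is too short;
# transposing turns those columns back into rows.
res = np.array(list(zip_longest(*a, fillvalue=0))).T
```

As with the original, this builds an intermediate list of tuples before the array is created.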

Nice solution, but it ties with manual padding on my machine (as expected; the key downside is the generation of a new list).
– alko, Commented over a year ago
