Fastest way to create a list from a numpy array of lists

Question

I have a numpy array of lists. It is of type numpy.ndarray:

   array([list([2692, 2711]), list([2751, 2770]), list([3455, 3462]),
   list([4020, 4027]), list([7707, 7726]), list([7893, 7912]),
   list([8118, 8126]), list([8174, 8179]), list([8215, 8234]),
   list([9227, 9246]), list([9518, 9537]), list([9839, 9859]),
   list([10002, 10021]), list([10024, 10043]), list([10158, 10178]),
   list([11346, 11365])], dtype=object)

I want to create a list from the first element of each sublist. I'm doing it by a list comprehension:

 lst = [x[0] for x in m]

Is there a quicker way to create this list?

Ibrahim Qasim · Accepted Answer · 2019-12-22 16:11:59Z

1

>>> m[:, 0]
array([2692, 2751, 3455, 4020, 7707, 7893, 8118, 8174, 8215, 9227, 9518,
       9839, 10002, 10024, 10158, 11346], dtype=object)

answered Dec 22, 2019 at 16:11


4 Comments

afshin Over a year ago

I tried that. It works with type numpy.array but gives an error with type numpy.ndarray. IndexError: too many indices for array

Ibrahim Qasim Over a year ago

I created an ndarray m by copying exactly the list you posted. The type of m is ndarray. All numpy arrays are stored as ndarrays, so this indexing should work on any 2-dimensional ndarray. Should I edit my answer to include the construction code as well?

afshin Over a year ago

I manually created this array and it works as you mentioned. For some reason I'm getting an error when I extract this array from a larger numpy matrix converted from a pandas DataFrame.

hpaulj Over a year ago

You can't recreate the 1d object dtype array with a copy and paste.
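A minimal sketch of the difference discussed in these comments, assuming a shortened version of the sample data. Passing equal-length lists to np.array gives a regular 2-D array, where m[:, 0] works. A 1-D object array that holds Python lists has to be built by filling a pre-allocated object array, and the same indexing then raises IndexError. The names pairs, m2d and m are illustrative only.

```python
import numpy as np

pairs = [[2692, 2711], [2751, 2770], [3455, 3462]]

# np.array on equal-length lists builds a regular 2-D integer array.
m2d = np.array(pairs)
print(m2d.shape)  # (3, 2)
first_2d = m2d[:, 0]  # column slicing works on the 2-D array

# To get a 1-D array whose elements are Python lists, allocate an
# object array first and fill it one element at a time.
m = np.empty(len(pairs), dtype=object)
for i, pair in enumerate(pairs):
    m[i] = pair
print(m.shape)  # (3,)

# There is no axis 1 on the 1-D object array, so this indexing fails.
try:
    m[:, 0]
except IndexError as exc:
    print("IndexError:", exc)

# Element-wise access still works.
first_1d = [x[0] for x in m]
```

This is likely what happens with a column pulled out of a DataFrame whose cells are lists, although the exact origin of the array in the question is not shown.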

Paul Panzer · Accepted Answer · 2019-12-22 16:22:39Z

1

You can get a rather significant speedup by using m.tolist() instead of m. For an additional minor saving use zip:

[*zip(*m.tolist()).__next__()]
# [2692, 2751, 3455, 4020, 7707, 7893, 8118, 8174, 8215, 9227, 9518, 9839, 10002, 10024, 10158, 11346]
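The variants mentioned here can be compared with a rough timeit sketch. It assumes a small 1-D object array built from part of the sample data, and the function names are made up for illustration. Timings depend on the machine and on the array size, so no figures are shown.

```python
import timeit

import numpy as np

pairs = [[2692, 2711], [2751, 2770], [3455, 3462], [4020, 4027]]

# Build a 1-D object array of lists, as in the question.
m = np.empty(len(pairs), dtype=object)
for i, pair in enumerate(pairs):
    m[i] = pair


def comp_on_array():
    # List comprehension iterating over the ndarray directly.
    return [x[0] for x in m]


def comp_on_list():
    # Same comprehension, but over a plain Python list.
    return [x[0] for x in m.tolist()]


def zip_on_list():
    # Transpose with zip and take only the first tuple.
    return list(next(zip(*m.tolist())))


for func in (comp_on_array, comp_on_list, zip_on_list):
    seconds = timeit.timeit(func, number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s")
```

All three functions return the same list, so only the timing differs.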

answered Dec 22, 2019 at 16:22


2 Comments

afshin Over a year ago

how can I create a list of the second element in each list?

Paul Panzer Over a year ago

@afshin For that your list comprehension applied to m.tolist() is hard to beat.

Andy L. · Accepted Answer · 2019-12-22 19:29:48Z

0

Your sample is a 1-d array of list objects, so slicing on axis 1 with m[:, 0] fails because the array has no axis 1. If the sub-lists have different lengths, I can't think of any better solution than a list comprehension. However, if every sub-list has the same length (as in your sample, where every sub-list has length 2), you can use np.vstack to convert it to a 2-d array and slice it as follows:

n = np.vstack(m)[:,0].tolist()

Out[371]:
[2692,
 2751,
 3455,
 4020,
 7707,
 7893,
 8118,
 8174,
 8215,
 9227,
 9518,
 9839,
 10002,
 10024,
 10158,
 11346]
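A short sketch of both cases described in this answer, assuming a shortened sample. The helper object_array is hypothetical and only exists to build 1-D object arrays for the demonstration. With equal-length sub-lists np.vstack produces a 2-D array that can be sliced. With unequal lengths it raises ValueError, and a list comprehension is the fallback.

```python
import numpy as np


def object_array(rows):
    """Hypothetical helper: 1-D object array holding each row as a list."""
    out = np.empty(len(rows), dtype=object)
    for i, row in enumerate(rows):
        out[i] = row
    return out


# Equal-length sub-lists: stack into a 2-D array, then slice column 0.
same_len = object_array([[2692, 2711], [2751, 2770], [3455, 3462]])
stacked = np.vstack(same_len)
print(stacked.shape)  # (3, 2)
firsts = stacked[:, 0].tolist()

# Unequal lengths: the rows cannot be stacked into a rectangle.
ragged = object_array([[1, 2], [3, 4, 5]])
try:
    np.vstack(ragged)
except ValueError as exc:
    print("ValueError:", exc)

# The list comprehension does not care about row lengths.
ragged_firsts = [x[0] for x in ragged]
```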

edited Dec 22, 2019 at 19:29

answered Dec 22, 2019 at 19:24


3 Comments

afshin Over a year ago

I do need it back as a numpy array, so I tried n = np.vstack(m)[:,0]. It was slower than my list comprehension above.

Andy L. Over a year ago

Numpy arrays provide a simple and convenient way to slice elements along a whole axis. However, with object dtype, converting the sub-lists to a numpy array costs time. If time is crucial, I think your best solution in this case is the list comprehension, as you say.

Andy L. Over a year ago

Another solution using numpy is np.array(m.tolist())[:,1]. This is faster than np.vstack, but it is still slower than the list comprehension for your data.
