
I'm trying to create new arrays based on an index, which is the first element in each row. I feel like I'm missing something really simple here.

The array looks like this; the first number in each row is the index.

[[ 1  0  1  2  3  4]
 [ 1  5  6  7  8  9]
 [ 2 10 11 12 13 14]
 [ 2 15 16 17 18 19]
 [ 4 20 21 22 23 24]]

The outcome I would like is something like this:

Array 1:

range1 =
[[ 1  0  1  2  3  4]
 [ 1  5  6  7  8  9]]

Array 2:

range2 =
[[ 2 10 11 12 13 14]
 [ 2 15 16 17 18 19]]

Array 3:

range3 =
[[ 4 20 21 22 23 24]]

This is the code I currently have, but there can be any number (N) of possible index values, so I obviously can't write an if statement for each of them. I was planning on collecting the rows in a list and then converting that list into a NumPy array. I've also looked at zipping them before using hstack, but I couldn't get that to work either.

import numpy as np

data = np.arange(25).reshape(5,5)
indexList = np.array(([[1,1,2,2,4]]))
indexList = np.transpose(indexList)
array = np.hstack((indexList, data))

range1 = []
range2 = []
range3 = []
for row in array:
    if row[0] == 1:
        range1.append(row)
    if row[0] == 2:
        range2.append(row)
    if row[0] == 4:  # the third group has index 4, not 3
        range3.append(row)

3 Answers


You can create a numpy.array with your ranges like so:

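import numpy as np

# Rebuild the combined array from the question (index column + data).
data = np.arange(25).reshape(5, 5)
a = np.hstack((np.array([[1, 1, 2, 2, 4]]).T, data))

indices = np.unique(a[:, 0])  # distinct index values, sorted: [1, 2, 4]
size = len(indices)
ranges = np.zeros((size,), dtype=object)  # object array, one slot per group

for i in range(size):
    # The boolean mask keeps every row whose first column equals indices[i].
    ranges[i] = a[a[:, 0] == indices[i]]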

Then, if you print ranges, you get each of your desired arrays. The index value (1, 2 or 4 in your case) that corresponds to an item in ranges is found at the same position in indices: ranges[i] holds the rows whose index is indices[i].

>>> list(ranges)
    [array([[1, 0, 1, 2, 3, 4],
            [1, 5, 6, 7, 8, 9]]),
     array([[ 2, 10, 11, 12, 13, 14],
            [ 2, 15, 16, 17, 18, 19]]),
     array([[ 4, 20, 21, 22, 23, 24]])]
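
If looking groups up by their index value is more convenient than by position, the same boolean-mask idea can be sketched with a dictionary. This is only an illustration under the assumption that the array is built as in the question; the name groups is hypothetical and not part of the original answer.

```python
import numpy as np

# Rebuild the array from the question: index column followed by the data.
data = np.arange(25).reshape(5, 5)
index_column = np.array([[1, 1, 2, 2, 4]]).T
a = np.hstack((index_column, data))

# Map each distinct index value to the rows that carry it.
# "groups" is an illustrative name, not something from the answer above.
groups = {int(k): a[a[:, 0] == k] for k in np.unique(a[:, 0])}

print(groups[4])
```

Each mask scans the whole first column, so this does one pass per distinct index value. That is fine for a handful of groups, but it may become slow if there are very many of them.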

You're trying to essentially do a group-by in numpy, and there isn't a great solution to that within numpy itself (though you can take a look at some answers to similar questions).

I'd convert the array to a pandas DataFrame, since DataFrames handle groupby operations well, get each group's values, and store them under a dictionary key. You can then access them like you would any other value in a dict:

import pandas as pd
df = pd.DataFrame(array)
gb = df.groupby(0)
dict_of_arrays = {f"range{g}": gb.get_group(g).to_numpy() for g in gb.groups.keys()}

>>> print(dict_of_arrays["range1"])
[[1 0 1 2 3 4]
 [1 5 6 7 8 9]]

>>> print(dict_of_arrays["range2"])
[[ 2 10 11 12 13 14]
 [ 2 15 16 17 18 19]]

>>> print(dict_of_arrays["range4"])
[[ 4 20 21 22 23 24]]

1 Comment

Great answer, thanks. I thought there would have been some sort of built-in function, but that's good to know for the future. I've been trying to keep everything in numpy, so I didn't go with this, but I'll keep it saved for sure, thanks!

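I would suggest using a nested list for easier iteration and simply comparing each row's index with the previous row's. You can then pick out each group from the list by position. Note that this only works when rows with the same index are adjacent, as they are in your array.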

import numpy as np

data = np.arange(25).reshape(5,5)
indexList = np.array(([[1,1,2,2,4]]))
indexList = np.transpose(indexList)
array = np.hstack((indexList, data))

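# "groups" instead of "range" so the built-in range() is not shadowed.
groups = [[]]
n = array[0][0]  # index of the first row; "row" is not defined before the loop
for row in array:
    if row[0] != n:
        # The index changed, so start a new group.
        n = row[0]
        groups.append([])
    groups[-1].append(row)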
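
The same "split where the index changes" idea can also be sketched without a Python loop. The version below is a minimal sketch, not part of the original answer: it sorts the rows by index first, so it does not rely on equal indices already being adjacent, and the names order, sorted_rows, keys, first_positions and parts are illustrative only.

```python
import numpy as np

# Rebuild the array from the question: index column followed by the data.
data = np.arange(25).reshape(5, 5)
array = np.hstack((np.array([[1, 1, 2, 2, 4]]).T, data))

# A stable sort makes rows with the same index adjacent while keeping
# their original relative order.
order = np.argsort(array[:, 0], kind="stable")
sorted_rows = array[order]

# return_index=True gives the position of the first row of each group.
keys, first_positions = np.unique(sorted_rows[:, 0], return_index=True)

# Split just before the first row of every group except the first one.
parts = np.split(sorted_rows, first_positions[1:])

for key, part in zip(keys, parts):
    print(key)
    print(part)
```

Here parts[i] holds the rows whose index is keys[i], so the two sequences can be zipped together or turned into a dictionary if that is easier to work with.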
