I'm new to NumPy and I want to split a 2D array based on whether the values in one column appear in another list. I converted a pandas DataFrame to a 2D NumPy array, and I have a separate list. I want to split the array into two arrays: the first contains the rows whose second-column value is in the list, and the second contains the remaining rows. I also want the rest of my list (the values that do not appear in the array).

numpy_data = np.array([
        [1, 'p1', 2],
        [11, 'p2', 8],
        [1, 'p8', 21],
        [13, 'p10', 2] ])

list_value = ['p1', 'p3', 'p8']

The expected output:

data_in_list = [
        [1, 'p1', 2],
        [1, 'p8', 21]]
list_val_in_numpy = ['p1', 'p8'] # intersection of second column with my list

rest_data = [
        [11, 'p2', 8],
        [13, 'p10', 2]] 
rest_list_value = ['p3']

In my code, I have found how to get the first output:

first_output =  numpy_data[np.isin(numpy_data[:,1], list_value)]    

But I couldn't find how to get the rest of my array. I have also tried looping over my list, looking up each value in the second column of the array, and then deleting the matching row. In this case I don't need the first output (which I called data_in_list, because I do what I need with it inside the loop); here I need the other outputs:

for val in list_value:
    row = numpy_data[np.where(numpy_data[:,1] == val)]
    if row.size != 0:
        # My custom code
        # then remove this row from my numpy, I couldn't do it
        pass

Thanks in advance

3 Answers

Use Python's invert operator ~ on the result of np.isin to select the rows whose second-column value is not in the list:

rest = numpy_data[~np.isin(numpy_data[:,1], list_value)]    
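As a minimal sketch of how both halves of the split could come from one mask, assuming the numpy_data and list_value from the question (the name mask is only illustrative), the boolean array can be computed once and reused:

```python
import numpy as np

# Same data as in the question. Note: mixing ints and strings makes NumPy
# store every element as a string, so 1 becomes '1' in the results.
numpy_data = np.array([
    [1, 'p1', 2],
    [11, 'p2', 8],
    [1, 'p8', 21],
    [13, 'p10', 2],
])
list_value = ['p1', 'p3', 'p8']

# True where the second column holds a value that is in list_value
mask = np.isin(numpy_data[:, 1], list_value)

data_in_list = numpy_data[mask]   # rows whose second column is in the list
rest_data = numpy_data[~mask]     # all remaining rows

print(data_in_list)
print(rest_data)
```

Computing the mask once avoids calling np.isin twice on a large array.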

1 Comment

And for the list, I use set(list_value) - set(list(numpy_data[:,1]))
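A set difference like this one does not keep the original order of list_value. As a hedged sketch, assuming the same numpy_data and list_value as in the question (list_arr and found are illustrative names), np.isin can be applied in the other direction to split the list while preserving its order:

```python
import numpy as np

numpy_data = np.array([
    [1, 'p1', 2],
    [11, 'p2', 8],
    [1, 'p8', 21],
    [13, 'p10', 2],
])
list_value = ['p1', 'p3', 'p8']

list_arr = np.array(list_value)

# True where a list value appears somewhere in the second column
found = np.isin(list_arr, numpy_data[:, 1])

list_val_in_numpy = list_arr[found].tolist()   # values present in the array
rest_list_value = list_arr[~found].tolist()    # values missing from the array

print(list_val_in_numpy)
print(rest_list_value)
```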

There are multiple ways of doing this. I would prefer a compact version using list comprehensions, but for the sake of clarity here is a loop-based way of doing the same thing.

data_in_list = []
list_val_in_numpy = []
rest_data = []
for row in numpy_data:
    # Only the second column is compared against list_value
    if row[1] in list_value:
        data_in_list.append(row)
        list_val_in_numpy.append(row[1])
    else:
        # Every row that did not match goes to the rest
        rest_data.append(row)

This gives you all three lists you were looking for. Concatenate them as needed to get exactly the output you want.
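The loop above could also be condensed with list comprehensions. This is only a sketch on plain Python lists, assuming the data from the question; the helper names wanted and seen are illustrative and not part of the original answer:

```python
numpy_data = [
    [1, 'p1', 2],
    [11, 'p2', 8],
    [1, 'p8', 21],
    [13, 'p10', 2],
]
list_value = ['p1', 'p3', 'p8']

# Sets make the membership tests fast on average
wanted = set(list_value)
seen = {row[1] for row in numpy_data}

data_in_list = [row for row in numpy_data if row[1] in wanted]
rest_data = [row for row in numpy_data if row[1] not in wanted]
list_val_in_numpy = [v for v in list_value if v in seen]
rest_list_value = [v for v in list_value if v not in seen]

print(data_in_list, rest_data, list_val_in_numpy, rest_list_value)
```

For large arrays, a NumPy mask is likely to be faster than this pure-Python version.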

2 Comments

Let me know if you want to see the vectorized version. The beauty and elegance of list comprehensions are often lost in mind-splitting nesting, but the entire code above can be squeezed into a couple of lines.
I didn't want to use a list comprehension, just because I knew there is a solution using NumPy, which is the best in this case.

A list comprehension will solve it, I guess:

numpy_data = [
        [1, 'p1', 2],
        [11, 'p2', 8],
        [1, 'p8', 21],
        [13, 'p10', 2],
]

list_value = ['p1', 'p3', 'p8']

# Keep each matching row as-is so the result stays a 2D list
output_list = [item for item in numpy_data if item[1] in list_value]
print(output_list)

output:

[[1, 'p1', 2], [1, 'p8', 21]]

3 Comments

I want to use NumPy because I have a lot of data.
I wanted to stick to his expected output.
Maybe you're right, but using a list comprehension is the hard way and the classic way to find my desired output. See Gilad Green's answer, it seems perfect.
