From the following dataframe:

dim_0 dim_1                                             
0     0       40.54  23.40  6.70  1.70  1.82  0.96  1.62
      1      175.89  20.24  7.78  1.55  1.45  0.80  1.44
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
1     0       21.38  24.00  5.90  1.60  2.55  1.50  2.36
      1      130.29  18.40  8.49  1.52  1.45  0.80  1.47
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
2     0        6.30  25.70  5.60  1.70  2.16  1.16  1.87    
      1       73.45  21.49  6.88  1.61  1.61  0.94  1.63
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
3     0       16.64  25.70  5.70  1.60  2.17  1.12  1.76
      1      125.89  19.10  7.52  1.43  1.44  0.78  1.40
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
4     0       41.38  24.70  5.60  1.50  2.08  1.16  1.85
      1        0.00   0.00  0.00  0.00  0.00  0.00  0.00
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
5     0      180.59  16.40  3.80  1.10  4.63  3.86  5.71
      1        0.00   0.00  0.00  0.00  0.00  0.00  0.00
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
6     0       13.59  24.40  6.10  1.70  2.62  1.51  2.36
      1      103.19  19.02  8.70  1.53  1.48  0.76  1.38
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
7     0        3.15  24.70  5.60  1.50  2.14  1.22  2.00
      1       55.90  23.10  6.07  1.50  1.86  1.12  1.87
      2      208.04  20.39  6.82  1.35  1.47  0.95  1.67

How can I get, for each dim_0 value, only the row whose dim_1 matches the corresponding entry of the array [1 0 0 1 2 0 1 2]?

Desired result is:

 0      175.89  20.24  7.78  1.55  1.45  0.80  1.44
 1       21.38  24.00  5.90  1.60  2.55  1.50  2.36
 2        6.30  25.70  5.60  1.70  2.16  1.16  1.87
 3      125.89  19.10  7.52  1.43  1.44  0.78  1.40
 4        0.00   0.00  0.00  0.00  0.00  0.00  0.00
 5      180.59  16.40  3.80  1.10  4.63  3.86  5.71
 6      103.19  19.02  8.70  1.53  1.48  0.76  1.38
 7      208.04  20.39  6.82  1.35  1.47  0.95  1.67

I've tried slicing, cross-sections, etc., but with no success.

Thanks in advance for the help.

3 Answers

Use MultiIndex.from_arrays and select by DataFrame.loc:

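import numpy as np
import pandas as pd

arr = np.array([1, 0, 0, 1, 2, 0, 1, 2])

# Pair each dim_0 label with the wanted dim_1 label and select those (dim_0, dim_1) keys
out = df.loc[pd.MultiIndex.from_arrays([df.index.levels[0], arr])]
print(out)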
          2      3     4     5     6     7     8
0                                               
0 1  175.89  20.24  7.78  1.55  1.45  0.80  1.44
1 0   21.38  24.00  5.90  1.60  2.55  1.50  2.36
2 0    6.30  25.70  5.60  1.70  2.16  1.16  1.87
3 1  125.89  19.10  7.52  1.43  1.44  0.78  1.40
4 2    0.00   0.00  0.00  0.00  0.00  0.00  0.00
5 0  180.59  16.40  3.80  1.10  4.63  3.86  5.71
6 1  103.19  19.02  8.70  1.53  1.48  0.76  1.38
7 2  208.04  20.39  6.82  1.35  1.47  0.95  1.67

To also drop the dim_1 level from the result, chain droplevel(1):

out = df.loc[pd.MultiIndex.from_arrays([df.index.levels[0], arr])].droplevel(1)
print(out)
        2      3     4     5     6     7     8
0                                             
0  175.89  20.24  7.78  1.55  1.45  0.80  1.44
1   21.38  24.00  5.90  1.60  2.55  1.50  2.36
2    6.30  25.70  5.60  1.70  2.16  1.16  1.87
3  125.89  19.10  7.52  1.43  1.44  0.78  1.40
4    0.00   0.00  0.00  0.00  0.00  0.00  0.00
5  180.59  16.40  3.80  1.10  4.63  3.86  5.71
6  103.19  19.02  8.70  1.53  1.48  0.76  1.38
7  208.04  20.39  6.82  1.35  1.47  0.95  1.67
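
For reference, here is a minimal self-contained sketch of the same selection. It rebuilds only the first value column of the question's frame; the column name `first` and the variable names are assumptions made for illustration. It takes the outer labels from `get_level_values(...).unique()` rather than `levels[0]`, since `levels` can retain labels that are no longer present after a frame has been filtered.

```python
import numpy as np
import pandas as pd

# First value column of the question's frame: 8 dim_0 groups x 3 dim_1 rows.
values = [
    40.54, 175.89, 0.00,
    21.38, 130.29, 0.00,
    6.30, 73.45, 0.00,
    16.64, 125.89, 0.00,
    41.38, 0.00, 0.00,
    180.59, 0.00, 0.00,
    13.59, 103.19, 0.00,
    3.15, 55.90, 208.04,
]
index = pd.MultiIndex.from_product(
    [range(8), range(3)], names=["dim_0", "dim_1"]
)
df = pd.DataFrame({"first": values}, index=index)  # "first" is a made-up column name

arr = np.array([1, 0, 0, 1, 2, 0, 1, 2])

# Outer labels actually present in the frame, in order of appearance.
outer = df.index.get_level_values("dim_0").unique()

# One (dim_0, dim_1) key per group; .loc accepts the MultiIndex as a list of keys.
keys = pd.MultiIndex.from_arrays([outer, arr], names=df.index.names)
selected = df.loc[keys].droplevel("dim_1")
print(selected)
```

`arr` must have exactly one entry per distinct dim_0 label, otherwise `from_arrays` raises because the two arrays differ in length.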

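I'd go with advanced indexing using NumPy:

l = [1, 0, 0, 1, 2, 0, 1, 2]

# Flat row position = group number * rows per group + wanted dim_1.
# Assumes both levels are 0-based integers and every group has the same number of rows.
i, j = df.index.levels
ix = np.array(l) + np.arange(i.max() + 1) * (j.max() + 1)
pd.DataFrame(df.to_numpy()[ix])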

       0      1     2     3     4     5     6
0  175.89  20.24  7.78  1.55  1.45  0.80  1.44
1   21.38  24.00  5.90  1.60  2.55  1.50  2.36
2    6.30  25.70  5.60  1.70  2.16  1.16  1.87
3  125.89  19.10  7.52  1.43  1.44  0.78  1.40
4    0.00   0.00  0.00  0.00  0.00  0.00  0.00
5  180.59  16.40  3.80  1.10  4.63  3.86  5.71
6  103.19  19.02  8.70  1.53  1.48  0.76  1.38
7  208.04  20.39  6.82  1.35  1.47  0.95  1.67
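
The flat-position arithmetic can be cross-checked by reshaping the values into a (groups, rows per group, columns) block and indexing it with a pair of integer arrays. The sketch below uses a small hypothetical frame (4 groups of 3 rows, columns `a` and `b`), not the question's data, and assumes every group has the same number of rows stored in sorted index order.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 4 dim_0 groups, 3 dim_1 rows each, 2 value columns.
index = pd.MultiIndex.from_product(
    [range(4), range(3)], names=["dim_0", "dim_1"]
)
df = pd.DataFrame(np.arange(24).reshape(12, 2), index=index, columns=["a", "b"])

arr = np.array([1, 0, 2, 1])  # wanted dim_1 position for each dim_0 group

n_outer = len(df.index.levels[0])
n_inner = len(df.index.levels[1])

# Variant 1: flat row positions, as in the answer above.
flat = arr + np.arange(n_outer) * n_inner
picked_flat = df.to_numpy()[flat]

# Variant 2: view the rows as a 3-D block and pick (group, row) pairs.
cube = df.to_numpy().reshape(n_outer, n_inner, -1)
picked = cube[np.arange(n_outer), arr]

result = pd.DataFrame(picked, index=df.index.levels[0], columns=df.columns)
print(result)
```

Both variants work on positions rather than labels, so they break silently if a group has missing rows or the index is unsorted; in that case a label-based lookup with `.loc` is safer.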


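Try the following code:

mask_array = [1, 0, 0, 1, 2, 0, 1, 2]

df_first = pd.DataFrame()  # replace with your original DataFrame

# dim_1 is an index level here, so read it with get_level_values.
# Note: isin keeps every row whose dim_1 occurs anywhere in mask_array,
# so on its own it does not pick one row per dim_0 group.
new_array = df_first[df_first.index.get_level_values('dim_1').isin(mask_array)]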
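
Because every dim_1 value in the question (0, 1 and 2) occurs somewhere in the array, a membership test cannot express "one row per dim_0 group". A minimal sketch of a per-row comparison follows, using a small hypothetical frame with a made-up `value` column; it assumes the dim_0 labels are the integers 0..n-1 so that they can serve as positions in the array.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 4 dim_0 groups, 3 dim_1 rows each.
index = pd.MultiIndex.from_product(
    [range(4), range(3)], names=["dim_0", "dim_1"]
)
df = pd.DataFrame({"value": np.arange(12)}, index=index)

mask_array = np.array([1, 0, 2, 1])  # wanted dim_1 for dim_0 = 0, 1, 2, 3

dim_0 = df.index.get_level_values("dim_0").to_numpy()
dim_1 = df.index.get_level_values("dim_1").to_numpy()

# Membership test: every dim_1 value appears in mask_array, so nothing is filtered out.
all_rows = df[df.index.get_level_values("dim_1").isin(mask_array)]

# Per-row test: keep a row only if its dim_1 equals the entry for its own dim_0.
keep = dim_1 == mask_array[dim_0]
result = df[keep]
print(result)
```

Unlike the positional approaches, this mask does not require every group to have the same number of rows; a group that lacks the requested dim_1 simply contributes no row.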
