Converting pandas.MultiIndex to numpy.ndarray with dtype float

Question

When converting a pandas.MultiIndex to a numpy.ndarray, the output is a one-dimensional ndarray with dtype=object, as seen in the following example:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40, 50, 60],
    'B': [0, 1, 2, 3, 4, 5],
    'C': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5']
}).set_index(['A', 'B'])

The df will be:

A	B	C
10	0	K0
20	1	K1
30	2	K2
40	3	K3
50	4	K4
60	5	K5

The output of df.index.to_numpy() is a one-dimensional ndarray of tuples with dtype=object:

array([(10, 0), (20, 1), (30, 2), (40, 3), (50, 4), (60, 5)], dtype=object)

but I want:

array([[10,  0],
       [20,  1],
       [30,  2],
       [40,  3],
       [50,  4],
       [60,  5]])

In the question "How to convert a Numpy 2D array with object dtype to a regular 2D array of floats", I found the following solution:

np.vstack(df.index)

Is there any more direct or better solution?

What do you mean by better? Isn't np.vstack(df.index) precisely the desired output?
– fsl, Commented Mar 3, 2021 at 1:27
Yes, the current solution seems fine, but I was wondering whether there is any case where it won't work, or whether pandas can give me the correct output without np.vstack.
– Ali_MM, Commented Mar 4, 2021 at 19:08
I was also thinking there could be a downside to my method compared to, say, @delimiter's solution below (in terms of type conversion or the like), so I wanted other people to double-check it.
– Ali_MM, Commented Mar 4, 2021 at 19:16

delimiter · Accepted Answer · 2021-03-03 02:09:54Z

2

I am pretty sure you will get what you want by flattening the MultiIndex into a list of tuples and building a NumPy array from the result, e.g. with the following syntax:

np.array(list(df.index))
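
To check that this gives the same result as the np.vstack approach from the question, here is a minimal sketch using the question's example DataFrame. The variable names from_list and from_vstack are only for illustration.

```python
import numpy as np
import pandas as pd

# Example DataFrame from the question
df = pd.DataFrame({
    'A': [10, 20, 30, 40, 50, 60],
    'B': [0, 1, 2, 3, 4, 5],
    'C': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
}).set_index(['A', 'B'])

# Iterating a MultiIndex yields one tuple per row
from_list = np.array(list(df.index))   # list of tuples -> 2D array
from_vstack = np.vstack(df.index)      # each tuple becomes one row

print(from_list.shape, from_list.dtype)
print(np.array_equal(from_list, from_vstack))
```

For an index whose levels are all integers, both calls should produce the same 2D integer array, so the choice between them is mostly a matter of taste.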

Ali_MM Over a year ago

This works, too. I was wondering whether this is better (faster, more applicable to all situations, etc.) than the one I found.

delimiter Over a year ago

That is something you can measure yourself; there are ways of doing it, but the difference is unlikely to be noticeable unless your dataset is sizeable. In the meantime, don't hesitate to accept whichever answer you prefer.
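
One possible way to measure this is with the standard-library timeit module. The sketch below is only illustrative: the index size n and the names big, candidates and timings are assumptions made for the example, and the actual numbers will depend on the machine and on the pandas and NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger index for timing; the size is chosen arbitrarily
n = 10_000
idx = pd.MultiIndex.from_arrays(
    [np.arange(n) * 10, np.arange(n)], names=['A', 'B']
)
big = pd.DataFrame({'C': np.arange(n)}, index=idx)

# The three approaches mentioned in this thread
candidates = {
    'np.vstack(index)': lambda: np.vstack(big.index),
    'np.array(list(index))': lambda: np.array(list(big.index)),
    'reset_index + values': lambda: big.reset_index()[['A', 'B']].values,
}

# Best of 3 repeats, 5 calls per repeat
timings = {
    name: min(timeit.repeat(fn, number=5, repeat=3))
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s for 5 calls')
```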

Ferris · Accepted Answer · 2021-03-03 02:23:03Z

2

Turn the index levels into columns, then take the values of those columns:

df.reset_index()[['A', 'B']].values
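
A related variant, not part of the original answer, is to convert only the index to a DataFrame with MultiIndex.to_frame. That avoids copying the data columns and naming the levels by hand. A minimal sketch on the question's example:

```python
import pandas as pd

# Example DataFrame from the question
df = pd.DataFrame({
    'A': [10, 20, 30, 40, 50, 60],
    'B': [0, 1, 2, 3, 4, 5],
    'C': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
}).set_index(['A', 'B'])

# index=False gives the new frame a plain RangeIndex instead of reusing the MultiIndex
arr = df.index.to_frame(index=False).to_numpy()

print(arr.shape, arr.dtype)
```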

Ali_MM Over a year ago

This can work, too. I'm still wondering which method is better, faster, or more general. For example, is it possible that one of the solutions given so far changes the dtype of the values (e.g. from int to float or the other way around)?
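
As a rough check of the dtype question, the sketch below builds a hypothetical index with one integer level and one float level (the index mixed and the result names are made up for the example). A NumPy array has a single dtype, so with any of the approaches in this thread the integer level should be upcast to float.

```python
import numpy as np
import pandas as pd

# Hypothetical index mixing an integer level and a float level
mixed = pd.MultiIndex.from_arrays(
    [[10, 20, 30], [0.5, 1.5, 2.5]], names=['A', 'B']
)

stacked = np.vstack(mixed)                        # approach from the question
listed = np.array(list(mixed))                    # list-of-tuples approach
framed = mixed.to_frame(index=False).to_numpy()   # via a DataFrame

# Each result has a single dtype wide enough to hold both levels
print(stacked.dtype, listed.dtype, framed.dtype)
```

If a level contains strings, the results can differ between approaches: np.array on mixed int and str tuples typically produces a string dtype, while going through a DataFrame typically produces dtype=object. It is worth checking the dtype explicitly in that case.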
