How to convert a pandas dataframe to NumPy array [duplicate]

Question

How can I convert a pandas dataframe (21 x 31) into a numpy array?

For example:

array_1 (n_1, n_2, n_3, ... , n31)
array_2 (n_1, n_2, n_3, ... , n31)
...
array_21(n_1, n_2, n_3, ... , n31)

I tried the following code snippet:

np.array(df)

... and got the following result:

array([[0.00290135, 0.00274017, 0.00531915, 0.00967118, 0.00676983,
        0.0082205 , 0.01096067, 0.01821406, 0.01450677, 0.02401676,
        0.0235332 , 0.03787879, 0.04239201, 0.04190845, 0.04819471,
        0.04932302, 0.06399097, 0.07865893, 0.06995487, 0.06914894,
        0.08107672, 0.06141199, 0.05157963, 0.05141844, 0.03852353,
        0.03546099, 0.02611219, 0.01595745, 0.00435203, 0.00322373,
        0.00257898],
       [0.        , 0.00392927, 0.00638507, 0.01866405, 0.00785855,
        0.01915521, 0.00491159, 0.02308448, 0.01178782, 0.01915521,
        0.03339882, 0.02996071, 0.03192534, 0.05451866, 0.03732809,
        0.04125737, 0.05304519, 0.05599214, 0.0589391 , 0.09528487,
        0.13752456, 0.05108055, 0.02603143, 0.05500982, 0.02799607,
        0.01424361, 0.05157171, 0.02799607, 0.        , 0.00049116,
        0.00147348],
       [0.        , 0.        , 0.01376462, 0.        , 0.00825877,
        0.01238816, 0.00757054, 0.00275292, 0.01307639, 0.01927047,
        0.03234687, 0.04129387, 0.02959394, 0.02615279, 0.05161734,
        0.03991741, 0.05574673, 0.12801101, 0.04335857, 0.07983482,
        0.05918789, 0.12319339, 0.02546456, 0.08878183, 0.01169993,
        0.04542326, 0.02064694, 0.01789401, 0.        , 0.00275292,
        0.        ],
       [...]])

The problem is that the second square bracket is too much. How can I solve this problem?

Sorry, "the second square bracket is too much"? What do you mean?
– wjandrea, Commented Nov 6, 2021 at 18:48
Yes, I know the function .to_numpy. I want to calculate the correlation between the df and this generated data: y = np.array(range(0,31,1)).
– tanaytuncer, Commented Nov 6, 2021 at 18:56
Those brackets indicate the shape of the array. What is df.to_numpy().shape? How about df.to_numpy().dtype? The array from a dataframe should be 2D: one dimension for the rows, the other for the columns.
– hpaulj, Commented Nov 6, 2021 at 19:25
You have successfully created an array from the dataframe. The question isn't a general "how", but rather: how do I get an array with a particular shape and dtype?
– hpaulj, Commented Nov 6, 2021 at 19:27
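
The shape and dtype checks suggested in the comments can be sketched as follows. The DataFrame here is a hypothetical stand-in filled with random values, since the original data is not shown; only its 21 x 31 size is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame (21 rows x 31 columns).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((21, 31)))

arr = df.to_numpy()

# The outer brackets wrap the whole array; each inner pair wraps one row.
print(arr.shape)  # (21, 31)
print(arr.ndim)   # 2
print(arr.dtype)  # float64
```

The nested brackets in the printed output are therefore expected: they reflect the two dimensions of the array, not a conversion error.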

Rodalm · Accepted Answer · 2021-11-06 19:29:14Z

3

It seems that you want to convert the DataFrame into a 1D array (this should be stated clearly in the post).

First, convert the DataFrame to a 2D NumPy array using DataFrame.to_numpy (using DataFrame.values is discouraged), and then use ndarray.ravel or ndarray.flatten to flatten the array.

arr = df.to_numpy().ravel()
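
As a minimal sketch of how the two flattening methods compare, assuming a stand-in DataFrame of random values with the 21 x 31 size from the question: both give the same 1D result in row-major order, but flatten always returns a copy, while ravel returns a view of the original data when the memory layout allows it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame (21 rows x 31 columns).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((21, 31)))

arr_2d = df.to_numpy()
flat_ravel = arr_2d.ravel()    # view when possible, otherwise a copy
flat_copy = arr_2d.flatten()   # always a copy

print(flat_ravel.shape)                       # (651,)
print(np.array_equal(flat_ravel, flat_copy))  # True
print(np.shares_memory(arr_2d, flat_copy))    # False
```

Either method works here; ravel can save memory on large arrays, and flatten is the safer choice if the flattened array will be modified independently.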

edited Nov 6, 2021 at 19:29

answered Nov 6, 2021 at 19:24

BIPUL MANDOL · Accepted Answer · 2021-11-06 19:33:15Z

1

arr = df.values

A DataFrame has a values property, which returns the data as a NumPy array.

To convert an n-dimensional NumPy array to a 1D NumPy array:

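data = df.values

# method 1: ravel flattens the array in row-major order
flat_ravel = data.ravel()

# method 2: reshape to a single dimension of length rows * columns
shape = data.shape
data_1d = data.reshape(shape[0] * shape[1])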
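
One point worth checking in the reshape approach: passing two dimensions, as in reshape(1, n), gives an array of shape (1, n), which is still 2D. A sketch of the difference, using a small hypothetical array in place of df.values:

```python
import numpy as np

# Small hypothetical 2D array standing in for df.values.
data = np.arange(6, dtype=float).reshape(2, 3)

row_2d = data.reshape(1, data.size)  # two dimensions given: still 2D
flat = data.reshape(-1)              # -1 lets NumPy infer the length: 1D

print(row_2d.shape, row_2d.ndim)  # (1, 6) 2
print(flat.shape, flat.ndim)      # (6,) 1
```

To get a truly one-dimensional array from reshape, pass a single dimension.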

edited Nov 6, 2021 at 19:33

answered Nov 6, 2021 at 18:51

2 Comments

tanaytuncer Over a year ago

Of course, but I need a one-dimensional array. Like that y = np.array(range(0,31,1))

BIPUL MANDOL Over a year ago

data = df.values
# method 1
flat_ravel = data.ravel()
# method 2
shape = data.shape
data_1d = data.reshape(shape[0] * shape[1])
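
For the goal stated in the comments under the question, correlating the data with y = np.array(range(0,31,1)), flattening may not be required, because y has 31 values, the same as each row. A hedged sketch, assuming the intent is one Pearson correlation per row (the question does not confirm this) and using random stand-in data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame (21 rows x 31 columns).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((21, 31)))

y = np.arange(31)  # same values as np.array(range(0, 31, 1))

# Assumption: one Pearson coefficient per row of the DataFrame.
# np.corrcoef returns a 2x2 matrix; [0, 1] is the row-vs-y coefficient.
corr_per_row = np.array(
    [np.corrcoef(row, y)[0, 1] for row in df.to_numpy()]
)

print(corr_per_row.shape)  # (21,)
```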
