Pandas Python: Cast array type to int

Question

I have a dataframe that looks like this:

In [169]: dfstacked
Out[169]:
    Percent Held  Rank
0          14.10   [1]
1          11.13   [2]
2          10.11   [3]
3           8.99   [4]
4           4.79   [5]
5           2.92   [6]
6           2.79   [7]
7           2.63   [8]
8           2.63   [9]
9           1.83  [10]
10          1.81  [11]
11          1.66  [12]
12          1.66  [13]
13          1.64  [14]  
14          1.63  [15]
15          1.62  [16]
16          1.26  [17]
17          1.08  [18]
18          1.08  [19]
19          1.07  [20]

The values in dfstacked["Rank"] appear to be arrays. I created the column with a regex (using str.findall()), and to be safe I checked the dtype:

In [171]: dfstacked["Rank"].dtype
Out[171]: dtype('O')

However, I want to cast dfstacked["Rank"] to a Series with an int dtype so that I can run some statistical tests on its values. How would I go about doing this?

So far I have tried to force an integer Series using Series.map and Series.astype(). Both raise a ValueError.

Ultimately, I want

    Percent Held  Rank
0          14.10   1
1          11.13   2
2          10.11   3
3           8.99   4
4           4.79   5
5           2.92   6
6           2.79   7
7           2.63   8
8           2.63   9
9           1.83   10
10          1.81   11
11          1.66   12
12          1.66   13
13          1.64   14  
14          1.63   15 
15          1.62   16
16          1.26   17 
17          1.08   18
18          1.08   19
19          1.07   20

What are the actual values of the Rank column: one-element lists, or strings like '[1]'?
– joris, Commented Apr 13, 2015 at 20:51
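One way to answer that is to look at the Python type of each cell rather than the column dtype, since dtype('O') only says that the column holds Python objects. A minimal sketch, using two hypothetical stand-in Series (as_lists and as_strings) because the real dfstacked is not reproduced here; element_types is an illustrative helper, not a pandas function:

```python
import pandas as pd

# Hypothetical stand-ins for dfstacked["Rank"]; both have dtype object.
as_lists = pd.Series([["1"], ["2"], ["10"]])    # one-element lists
as_strings = pd.Series(["[1]", "[2]", "[10]"])  # strings that only look like lists


def element_types(s):
    """Return the set of Python type names found among the cells of a Series."""
    return set(s.map(lambda x: type(x).__name__))


print(element_types(as_lists))
print(element_types(as_strings))
```

Both Series print the same way in a DataFrame, so checking the cell types is what tells the two cases apart.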

EdChum · Accepted Answer · 2015-04-13 20:55:50Z

I believe the following should work:

In [6]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Rank':[np.array([0]), np.array([1]), np.array([2])]})
df
Out[6]:
  Rank
0  [0]
1  [1]
2  [2]
In [8]:

df['Rank'] = df['Rank'].apply(lambda x: x[0])
df
Out[8]:
   Rank
0     0
1     1
2     2

In [9]:

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 0 to 2
Data columns (total 1 columns):
Rank    3 non-null int64
dtypes: int64(1)
memory usage: 48.0 bytes

So in your case: dfstacked['Rank'] = dfstacked['Rank'].apply(lambda x: x[0])
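If the column really was built with str.findall(), as the question describes, each cell is probably a list of strings rather than a NumPy array of ints. In that case x[0] alone would leave you with strings, so an explicit int() is needed. A minimal sketch under that assumption; the raw Series and the \d+ pattern are hypothetical, since the original source strings and regex are not shown:

```python
import pandas as pd

# Hypothetical raw column; the real input strings are not shown in the question.
raw = pd.Series(["Rank 1", "Rank 2", "Rank 10"])

# str.findall returns a list of matched strings per row, e.g. ['1'].
found = raw.str.findall(r"\d+")

# Take the first match from each list and convert it to int.
rank_from_lists = found.apply(lambda m: int(m[0]))

# Alternative: str.extract returns the captured group directly, so no lists appear.
rank_extracted = raw.str.extract(r"(\d+)", expand=False).astype(int)

print(rank_from_lists.tolist())
print(rank_extracted.tolist())
```

Using str.extract up front avoids creating the list-valued column in the first place, which may be simpler if each row has exactly one match.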

edited Apr 13, 2015 at 20:55

answered Apr 13, 2015 at 20:52

EdChum
