I have a dataframe that looks like this:

In [169]: dfstacked
Out[169]:
    Percent Held  Rank
0          14.10   [1]
1          11.13   [2]
2          10.11   [3]
3           8.99   [4]
4           4.79   [5]
5           2.92   [6]
6           2.79   [7]
7           2.63   [8]
8           2.63   [9]
9           1.83  [10]
10          1.81  [11]
11          1.66  [12]
12          1.66  [13]
13          1.64  [14]  
14          1.63  [15]
15          1.62  [16]
16          1.26  [17]
17          1.08  [18]
18          1.08  [19]
19          1.07  [20]

Each entry in dfstacked["Rank"] is a one-element list. I created the column with a regex (using str.findall()), and to be safe I checked the dtype of the Series, which is object:

In [171]: dfstacked["Rank"].dtype
Out[171]: dtype('O')

However, I want to cast dfstacked["Rank"] to a Series with an integer dtype so that I can run some statistical tests on its values. How would I go about doing this?

So far I have tried to force an integer Series using Series.map() and Series.astype(). Both raise a ValueError.

Ultimately, I want

    Percent Held  Rank
0          14.10     1
1          11.13     2
2          10.11     3
3           8.99     4
4           4.79     5
5           2.92     6
6           2.79     7
7           2.63     8
8           2.63     9
9           1.83    10
10          1.81    11
11          1.66    12
12          1.66    13
13          1.64    14
14          1.63    15
15          1.62    16
16          1.26    17
17          1.08    18
18          1.08    19
19          1.07    20
  • What are the actual values of the Rank column: one-element lists or strings like '[1]'? Commented Apr 13, 2015 at 20:51
  • They are one-element lists. Commented Apr 13, 2015 at 20:54

1 Answer

I believe taking the first element of each array with apply should work:

In [6]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Rank':[np.array([0]), np.array([1]), np.array([2])]})
df
Out[6]:
  Rank
0  [0]
1  [1]
2  [2]
In [8]:

df['Rank'] = df['Rank'].apply(lambda x: x[0])
df
Out[8]:
   Rank
0     0
1     1
2     2

In [9]:

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 0 to 2
Data columns (total 1 columns):
Rank    3 non-null int64
dtypes: int64(1)
memory usage: 48.0 bytes

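So in your case: dfstacked['Rank'] = dfstacked['Rank'].apply(lambda x: x[0]). If the list elements are strings, which is what str.findall() produces, the result will still have object dtype, so follow this with .astype(int).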
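Since the question builds the column with str.findall(), each list most likely holds a digit string rather than an integer. The following is a minimal sketch of that case, not the asker's actual data: the `raw` Series and its contents are hypothetical, because the original source strings are not shown in the question.

```python
import pandas as pd

# Hypothetical raw text column; the real source strings are not shown in the question.
raw = pd.Series(["#1", "#2", "#10"])

# str.findall returns a list of matched strings for each row, e.g. ['1'].
ranks = raw.str.findall(r"\d+")

# .str[0] picks the first element of each list; astype(int) then converts
# the digit strings to an integer dtype.
rank_int = ranks.str[0].astype(int)

# Alternative that skips the list step: extract the first match directly.
rank_direct = raw.str.extract(r"(\d+)", expand=False).astype(int)

print(rank_int.tolist())
print(rank_direct.tolist())
```

Both routes assume every row contains at least one match. A row without a match would yield a missing value, and astype(int) would then fail.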
