Dataframe with column of strings to column of integer lists

Question

I have a dataframe where in one column, the data for each row is a string like this:

[[25570], [26000]]

I want each entry in the series to become a list of integers.

i.e.:

[25570, 26000]
   ^      ^
  int    int

So far I can get it to a list of strings, but retaining empty spaces:

s = s.str.replace("[", "", regex=False).str.replace("]", "", regex=False)
s = s.str.replace(" ", "", regex=False).str.split(",")

Dict for the dataframe (nan here is numpy.nan):

f = {'chunk': {0: '[72]',
  1: '[72, 68]',
  2: '[72, 68, 65]',
  3: '[72, 68, 65, 70]',
  4: '[72, 68, 65, 70, 67]',
  5: '[72, 68, 65, 70, 67, 74]',
  6: '[68]',
  7: '[68, 65]',
  8: '[68, 65, 70]',
  9: '[68, 65, 70, 67]'},
 'chunk_completed': {0: '[25570]',
  1: '[26000]',
  2: '[26240]',
  3: '[26530]',
  4: '[26880]',
  5: '[27150]',
  6: '[26000]',
  7: '[26240]',
  8: '[26530]',
  9: '[26880]'},
 'chunk_id': {0: '72',
  1: '72-68',
  2: '72-68-65',
  3: '72-68-65-70',
  4: '72-68-65-70-67',
  5: '72-68-65-70-67-74',
  6: '68',
  7: '68-65',
  8: '68-65-70',
  9: '68-65-70-67'},
 'diffs_avg': {0: nan,
  1: 430.0,
  2: 335.0,
  3: 320.0,
  4: 327.5,
  5: 316.0,
  6: nan,
  7: 240.0,
  8: 265.0,
  9: 293.3333333333333},
 'sd': {0: nan,
  1: nan,
  2: 134.35028842544406,
  3: 98.48857801796105,
  4: 81.80260794538685,
  5: 75.3657747256671,
  6: nan,
  7: nan,
  8: 35.355339059327385,
  9: 55.075705472861024},
 'timecodes': {0: '[[25570]]',
  1: '[[25570], [26000]]',
  2: '[[25570], [26000], [26240]]',
  3: '[[25570], [26000], [26240], [26530]]',
  4: '[[25570], [26000], [26240], [26530], [26880]]',
  5: '[[25570], [26000], [26240], [26530], [26880], [27150]]',
  6: '[[26000]]',
  7: '[[26000], [26240]]',
  8: '[[26000], [26240], [26530]]',
  9: '[[26000], [26240], [26530], [26880]]'}}

iamklaus · Accepted Answer · 2019-03-26 13:20:06Z

2

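Try this (assuming s is a dict that maps each row index to its string):

import pandas as pd

# Build a one-column frame from the dict, then parse each string.
f = pd.DataFrame.from_dict(s, orient='index')
f.columns = ['timecodes']
# eval turns the string into a nested list; "if a" skips empty inner lists such as [[]]
f['timecodes'].apply(lambda x: [a[0] for a in eval(x) if a])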

Output

Out[622]:
0                                        [25570]
1                                 [25570, 26000]
2                          [25570, 26000, 26240]
3                   [25570, 26000, 26240, 26530]
4            [25570, 26000, 26240, 26530, 26880]
5     [25570, 26000, 26240, 26530, 26880, 27150]
6                                        [26000]
7                                 [26000, 26240]
8                          [26000, 26240, 26530]
9                   [26000, 26240, 26530, 26880]
10           [26000, 26240, 26530, 26880, 27150]
11                                       [26240]
12                                [26240, 26530]
13                         [26240, 26530, 26880]
14                  [26240, 26530, 26880, 27150]
15                                       [26530]
16                                [26530, 26880]
17                         [26530, 26880, 27150]
18                                       [26880]
19                                [26880, 27150]
Name: 0, dtype: object
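If the strings already sit in a dataframe column, the same idea can be applied to that column directly. The sketch below is not the code from the answer above: it swaps eval for ast.literal_eval, which only accepts Python literals, and it flattens every inner list rather than taking only the first element. The small frame and the helper name flatten_timecodes are hypothetical, chosen to match the 'f' and 'timecodes' names used in the question.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the asker's dataframe; only this column matters here.
f = pd.DataFrame({'timecodes': ['[[25570]]', '[[25570], [26000]]', '[[]]']})


def flatten_timecodes(text):
    """Parse a string such as '[[25570], [26000]]' and flatten it one level."""
    nested = ast.literal_eval(text)  # literals only, so no arbitrary code runs
    return [value for inner in nested for value in inner]


# Overwrite the string column with lists of ints.
f['timecodes'] = f['timecodes'].apply(flatten_timecodes)

print(f['timecodes'].tolist())  # [[25570], [25570, 26000], []]
```

An empty entry such as '[[]]' simply becomes an empty list, so no separate check is needed.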

edited Mar 26, 2019 at 13:20

answered Mar 26, 2019 at 12:50


4 Comments

syntheso Over a year ago

How could I make this work when there are no values in the original string, i.e. [[]]?

syntheso Over a year ago

I don't understand this solution and can't get it to work. This is part of a larger df, fyi.

iamklaus Over a year ago

I am assuming that you have another entry like [[]] in the dict, right? If that's the case, the update should work: I added a check, "if a", to the list comprehension.

syntheso Over a year ago

In the dataframe, yes. If my dataframe is 'f' and the column in question is 'timecodes', how should I format your answer?
