Looping through list with dataframe elements in python

Question

I want to iterate over a list, which has dataframes as its elements.

Example: ls is my list, with the elements below (two dataframes):

                           seq  score    status
4366  CGAGGCTGCCTGTTTTCTAGTTG   5.15  negative
5837  GGACCTTTTTTACAATATAGCCA   3.48  negative
96    TTTCTAGCCTACCAAAATCGGAG  -5.27  negative
1369  CTTCCTATCTTCATTCTTCGACT   1.28  negative
1223                CAAGTTTGT   2.06  negative
5451  TGTTTCCACACCTGTCTCAGCTC   4.48  negative
1277  GTACTGTGGAATCTCGGCAGGCT   4.87  negative
5299  CATAATGAATGCCCCATCAATTG  -7.19  negative
3477                ATGGCACTG  -3.60  negative
2953  AGTAATTCTGTTGCCTGAAGATA   2.86  negative
4586                TGGGCAAGT   2.48  negative
3746                AATGAGAGG  -3.67  negative,
                         seq  score    status
1983  AGCAGATCAAACGGGTAAAGGAC  -4.81  negative
3822  CCCTGGCCCACGCACTGCAGTCA   3.32  negative
1127  GCAGAGATGCTGATCTTCACGTC  -6.77  negative
3624                TGAGTATGG   0.60  negative
4559                AAGGTTGGG   4.94  negative
4391  ATGAAGATCATCGAAATCAGTTT  -2.09  negative
4028  TCTCCGACAATGCCTATCAGTAC   1.14  negative
2694                CAGGGAACT   0.98  negative
2197  CTTCCATTGAGCTGCTCCAGCAC  -0.97  negative
2025  TGTGATCTGGCTGCACGCACTGT  -2.13  negative
5575                CCAGAAAGG  -2.45  negative
275   TCTGTTGGGTTTTCATACAGCTA   7.11  negative

When I access its elements, I get the following error: list indices must be integers, not DataFrame

I tried the following code:

cut_off = [1,2,3,4]

for i in ls:
    for co in cut_off:
        print "Negative set : " + "cut off value =", str(
            co), ", number of variants = ", str((ls[i]['score'] > co).sum())

I want to access each dataframe in the list and compare the score value of each row against the cut_off value. The result should be the total number of rows whose score is greater than the cut_off value.

Expected output: Negative set : cut off value = 0 , number of variants = 8

Thanks

i is not an index into the list; it is the list element itself (a dataframe), so use i instead of ls[i]. It would also be good to rename the variable.
– Peter Collingridge, commented Dec 3, 2019 at 10:46

Mr_and_Mrs_D · Accepted Answer · 2019-12-03 10:47:48Z

1

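This should work. Each item the loop yields is already a dataframe, so use it directly instead of indexing back into ls:

cut_off = [1, 2, 3, 4]

for df in ls:
    for co in cut_off:
        # (df['score'] > co) is a boolean Series; .sum() counts the True rows
        print("Negative set : cut off value = {} , number of variants = {}".format(
            co, (df['score'] > co).sum()))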
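For a self-contained check, here is a minimal sketch of the same loop on two small made-up dataframes. The data and the helper name count_above are assumptions for illustration and do not come from the question. The print call uses str.format, so it should behave the same on Python 2 and Python 3.

```python
import pandas as pd

# Hypothetical stand-ins for the two dataframes in the question's list.
ls = [
    pd.DataFrame({"seq": ["CGAGG", "GGACC", "TTTCT"], "score": [5.15, 3.48, -5.27]}),
    pd.DataFrame({"seq": ["AGCAG", "CCCTG", "GCAGA"], "score": [-4.81, 3.32, -6.77]}),
]
cut_off = [1, 2, 3, 4]


def count_above(df, co):
    """Return how many rows of df have a score strictly greater than co."""
    # The comparison yields a boolean Series; summing it counts the True values.
    return int((df["score"] > co).sum())


for df in ls:  # df is the dataframe itself, not an index into ls
    for co in cut_off:
        print("Negative set : cut off value = {} , number of variants = {}".format(
            co, count_above(df, co)))
```

Wrapping the comparison in a small function keeps the loop body readable and makes the counting rule easy to test on its own.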


thisisrandy · Accepted Answer · 2019-12-03 10:53:04Z

0

It looks like you are expecting i to be an index into your list ls, when in fact it is the element itself. For example:

foo = ["one", "two", "three"]
for i in foo:
    print(i)

outputs

one
two
three

while

for i, elm in enumerate(foo):
    print(f"{i}: {elm}")

outputs:

0: one
1: two
2: three

So I think enumerate is what you're looking for.
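Applied to a list of dataframes, the same idea might look like the sketch below. The sample scores and the results list are made up for illustration; only ls, cut_off and the score column come from the question.

```python
import pandas as pd

# Hypothetical data standing in for the question's list of dataframes.
ls = [
    pd.DataFrame({"score": [5.15, 3.48, -5.27, 1.28]}),
    pd.DataFrame({"score": [-4.81, 3.32, 0.60, 4.94]}),
]
cut_off = [1, 2, 3, 4]

results = []
for i, df in enumerate(ls):  # i is the position in ls, df is the element
    for co in cut_off:
        # Count the rows whose score is strictly greater than the cut-off.
        n = int((df["score"] > co).sum())
        results.append((i, co, n))
        print("Dataframe {}: cut off value = {} , number of variants = {}".format(i, co, n))
```

The position i is only needed here to label the output; the counting itself uses df directly.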


Anmol Narang · Accepted Answer · 2019-12-03 11:01:18Z

0

for i in range(len(ls)):
    for co in cut_off:
        # ls[i] is the i-th dataframe; count its rows with score above co
        print("Negative set : cut off value = {} , number of variants = {}".format(
            co, (ls[i]['score'] > co).sum()))

I hope this helps...
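If the counts are needed for further processing rather than just printed, one option is to collect them into a single table. This is a sketch on made-up scores; the name summary is an assumption, not something from the question.

```python
import pandas as pd

# Hypothetical data standing in for the question's list of dataframes.
ls = [
    pd.DataFrame({"score": [5.15, 3.48, -5.27, 1.28]}),
    pd.DataFrame({"score": [-4.81, 3.32, 0.60, 4.94]}),
]
cut_off = [1, 2, 3, 4]

# One row per dataframe in ls, one column per cut-off value.
summary = pd.DataFrame(
    [[int((df["score"] > co).sum()) for co in cut_off] for df in ls],
    columns=cut_off,
)
print(summary)
```

Here summary.loc[i, co] holds the number of rows in ls[i] whose score exceeds co.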
