I want to iterate over a list whose elements are DataFrames.

Example: ls is my list, and it contains the two DataFrames below:

                           seq  score    status
4366  CGAGGCTGCCTGTTTTCTAGTTG   5.15  negative
5837  GGACCTTTTTTACAATATAGCCA   3.48  negative
96    TTTCTAGCCTACCAAAATCGGAG  -5.27  negative
1369  CTTCCTATCTTCATTCTTCGACT   1.28  negative
1223                CAAGTTTGT   2.06  negative
5451  TGTTTCCACACCTGTCTCAGCTC   4.48  negative
1277  GTACTGTGGAATCTCGGCAGGCT   4.87  negative
5299  CATAATGAATGCCCCATCAATTG  -7.19  negative
3477                ATGGCACTG  -3.60  negative
2953  AGTAATTCTGTTGCCTGAAGATA   2.86  negative
4586                TGGGCAAGT   2.48  negative
3746                AATGAGAGG  -3.67  negative,
                         seq  score    status
1983  AGCAGATCAAACGGGTAAAGGAC  -4.81  negative
3822  CCCTGGCCCACGCACTGCAGTCA   3.32  negative
1127  GCAGAGATGCTGATCTTCACGTC  -6.77  negative
3624                TGAGTATGG   0.60  negative
4559                AAGGTTGGG   4.94  negative
4391  ATGAAGATCATCGAAATCAGTTT  -2.09  negative
4028  TCTCCGACAATGCCTATCAGTAC   1.14  negative
2694                CAGGGAACT   0.98  negative
2197  CTTCCATTGAGCTGCTCCAGCAC  -0.97  negative
2025  TGTGATCTGGCTGCACGCACTGT  -2.13  negative
5575                CCAGAAAGG  -2.45  negative
275   TCTGTTGGGTTTTCATACAGCTA   7.11  negative

When I access its elements, I get the following error: list indices must be integers, not DataFrame

I tried the following code:

cut_off = [1,2,3,4]

for i in ls:
    for co in cut_off:
        print "Negative set : " + "cut off value =", str(
            co), " number of variants = ", str((ls[i]['score'] > co).sum())

I want to access each DataFrame in the list and compare the score of every row against the cut_off value. For each cut_off value, I want the total number of rows whose score is greater than that value.

Expected output: Negative set : cut off value = 0 , number of variants = 8

Thanks

  • i is a DataFrame, and you cannot use a DataFrame as a list index. Commented Dec 3, 2019 at 10:46
  • i is not an index into the list, it is the list element itself (a DataFrame), so use i instead of ls[i]. It would be good to rename the variable. Commented Dec 3, 2019 at 10:46

3 Answers


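This should work. Iterate over the DataFrames directly instead of using them to index the list:

cut_off = [1,2,3,4]

for df in ls:
    for co in cut_off:
        # (df['score'] > co) is a boolean Series; .sum() counts its True values
        print("Negative set : cut off value = %s , number of variants = %s"
              % (co, (df['score'] > co).sum()))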
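
For a version that runs end to end, here is a minimal sketch in Python 3. The two small frames in ls are made-up stand-ins for the data in the question, and count_above is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Made-up stand-in data: two small frames with a 'score' column,
# mimicking the structure of the list in the question.
ls = [
    pd.DataFrame({"score": [5.15, 3.48, -5.27, 1.28]}),
    pd.DataFrame({"score": [-4.81, 3.32, 0.60, 4.94]}),
]
cut_off = [1, 2, 3, 4]


def count_above(df, threshold):
    """Return how many rows of df have a score strictly above threshold."""
    # The comparison yields a boolean Series; summing it counts the True values.
    return int((df["score"] > threshold).sum())


for df in ls:  # df is the DataFrame itself, not a position in ls
    for co in cut_off:
        print(f"Negative set : cut off value = {co} , number of variants = {count_above(df, co)}")
```

Putting the comparison in a small function keeps the loop readable and makes the counting step easy to check on its own.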

It looks like you are expecting i to be an index into your list ls, when in fact it is the element itself. For example:

foo = [ "one", "two", "three" ]
for i in foo:
     print(i)

outputs

one
two
three

while

for i, elm in enumerate(foo):
     print(f"{i}: {elm}")

outputs:

0: one
1: two
2: three

So I think enumerate is what you're looking for.
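
Applied to a list of DataFrames, the same pattern might look like the sketch below. The frames are invented sample data and counts is only an illustrative name. The index from enumerate is used just for labelling, since df already is the list element.

```python
import pandas as pd

# Invented sample data standing in for the list from the question.
ls = [
    pd.DataFrame({"score": [2.06, 4.48, -7.19]}),
    pd.DataFrame({"score": [7.11, -2.45, 0.98]}),
]
cut_off = [1, 2, 3, 4]

counts = {}  # maps (position in ls, cut-off) to the number of rows above the cut-off
for i, df in enumerate(ls):
    for co in cut_off:
        # Summing the boolean mask counts the rows whose score exceeds co.
        counts[(i, co)] = int((df["score"] > co).sum())
        print(f"DataFrame {i}: cut off value = {co} , number of variants = {counts[(i, co)]}")
```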

for i in range(len(ls)):
    for co in cut_off:
        # ls[i] is the i-th DataFrame; summing the boolean mask counts matching rows
        print("Negative set : cut off value =", co,
              ", number of variants =", (ls[i]['score'] > co).sum())

I hope this helps...
