I am trying to get the names of the columns whose cell values are less than .2, without repeating a combination of columns. I tried the following to iterate over the column names, without success:

import pandas as pd

pvals2=pd.DataFrame({'col1': [1, .2,.7], 
                     'col2': [.2, 1,.01],
                     'col3': [.7,.01,1]},
                    index = ['col1', 'col2', 'col3'])
print(pvals2)
print('---')
pvals2.transpose().join(pvals2, how='outer')

My goal is:

col3 col2 .01
#col2 col3 .01 #NOT INCLUDED (because it is a repeat)

4 Answers

1

A list comprehension is one way:

pvals2 = pd.DataFrame({'col1': [1, .2,.7], 'col2': [.2, 1,.01], 'col3': [.7,.01,1]},
                      index = ['col1', 'col2', 'col3'])

res = [col for col in pvals2 if (pvals2[col] < 0.2).any()]

# ['col2', 'col3']

To get values as well, as in your desired output, requires more specification, as a column may have more than one value less than 0.2.
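
One possible way to pin that down, as a minimal sketch: assume the frame is symmetric with identical row and column labels (as in the question), and report each unordered pair of labels once together with its value. The names `threshold` and `pairs` are illustrative and not part of the original answer.

```python
from itertools import combinations

import pandas as pd

pvals2 = pd.DataFrame(
    {"col1": [1, .2, .7], "col2": [.2, 1, .01], "col3": [.7, .01, 1]},
    index=["col1", "col2", "col3"],
)

threshold = 0.2  # cutoff taken from the question

# combinations() yields each unordered pair of labels exactly once,
# so (col2, col3) is visited but its mirror (col3, col2) is not.
pairs = [
    (row, col, float(pvals2.loc[row, col]))
    for row, col in combinations(pvals2.columns, 2)
    if pvals2.loc[row, col] < threshold
]

print(pairs)
```

Because only one cell per pair is inspected, this relies on the symmetry assumption; for a non-symmetric frame both cells would need checking.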

0

Iterate through the columns and check if any value meets your conditions:

pvals2=pd.DataFrame({'col1': [1, .2,.7], 
                 'col2': [.2, 1,.01],
                 'col3': [.7,.01,1]})

cols_with_small_values = set()
for col in pvals2.columns:     
    if any(i < 0.2 for i in pvals2[col]):
        cols_with_small_values.add(col)
        cols_with_small_values.add(pvals2[col].min())

print(cols_with_small_values)


RESULT: {'col3', 0.01, 'col2'}

any is a Python built-in. This question has a good explanation of how any works. Using a set ensures each column appears only once.

We use Series.min() (pvals2[col] is a Series) to get the smallest value in each selected column.

0

You could use stack and then keep only the values below 0.2. Then drop the duplicated values, keeping the last occurrence:

pvals2.stack()[pvals2.stack().lt(.2)].drop_duplicates(keep='last')

col3  col2    0.01
dtype: float64
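
Note that drop_duplicates compares values only, so two unrelated pairs that happen to share the same value would also be collapsed into one. A hedged alternative, assuming the frame is symmetric with matching row and column labels, is to mask everything except the part strictly below the diagonal before filtering. The names `lower`, `stacked` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

pvals2 = pd.DataFrame(
    {"col1": [1, .2, .7], "col2": [.2, 1, .01], "col3": [.7, .01, 1]},
    index=["col1", "col2", "col3"],
)

# Boolean mask that is True strictly below the diagonal
# (k=-1 excludes the diagonal itself).
lower = np.tril(np.ones(pvals2.shape, dtype=bool), k=-1)

# where() replaces every cell outside the mask with NaN; after stacking,
# NaN entries can never satisfy the comparison, so each pair is
# considered at most once.
stacked = pvals2.where(lower).stack()
result = stacked[stacked < 0.2]

print(result)
```

With the sample data this keeps the single (col3, col2) entry; using np.triu with k=1 instead would report the same pair as (col2, col3).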

0
pvals2=pd.DataFrame({'col1': [1, .2,.7], 
             'col2': [.2, 1,.01],
             'col3': [.7,.01,1]},
            index = ['col1', 'col2', 'col3'])


pvals2.min().where(lambda x : x<0.2).dropna()

Output

col2    0.01
col3    0.01
dtype: float64
