I am computing correlations for a data set that contains boolean values. Ideally, I would replace every boolean with a 1 or a 0. What is the most efficient way to go through my NumPy array and replace these values? Below is the code I have and its output:

import numpy as np
import pandas as pd

def findCorrelation(csvFileName):
    data = pd.read_csv(csvFileName)
    data = data.values  # NumPy array; mixed int/bool columns give dtype object
    df = pd.DataFrame(data=data)
    npList = np.asarray(df)
    print(npList)
    print(df.corr())

Output:

[[320 True]
 [400 False]
 [350 True]
 [360 True]
 [340 True]
 [340 True]
 [425 False]
 [380 False]
 [365 True]]
Empty DataFrame
Columns: []
Index: []
Success

Process finished with exit code 0

    If your array is arr, then what you need is arr.astype(int). Commented May 31, 2017 at 2:14

1 Answer

The method you're looking for is astype. Calling arr.astype(int) returns a copy of the array cast to integers, so True becomes 1 and False becomes 0.

Example:

import numpy as np

a = np.asarray([[320, True], [400, False], [350, True], [360, True], [340, True], [340, True], [425, False], [380, False], [365, True]]).astype(int)

print(a)

Output:

[[320   1]
 [400   0]
 [350   1]
 [360   1]
 [340   1]
 [340   1]
 [425   0]
 [380   0]
 [365   1]]