I have the following dataset in a .csv file:
feature1, feature2, feature3, feature4
0, 42, 2, 1000
2, 13, ?, 997
1, 30, ?, 861
2, 29, ?, ?
I would like to create a pandas DataFrame or a NumPy array that leaves out any feature with x% or more unknown data (where x is specified earlier in the code).
? with something? Is that your question?
NaN values. This is important because the dataframe currently has mixed types.
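One possible approach, as a minimal sketch rather than a definitive answer: have pandas read `?` as NaN at load time so the columns stay numeric, then keep only the columns whose share of missing values is below the threshold. The helper name `drop_sparse_features`, the inlined `CSV_TEXT`, and the value `x = 50` are illustrative assumptions, and so is the choice to drop a feature when its missing share is exactly x%.

```python
import io

import pandas as pd

# The sample data from the question, inlined so the sketch is self-contained.
# With a real file, pass its path to pd.read_csv instead of the StringIO object.
CSV_TEXT = """feature1, feature2, feature3, feature4
0, 42, 2, 1000
2, 13, ?, 997
1, 30, ?, 861
2, 29, ?, ?
"""


def drop_sparse_features(df, max_missing_pct):
    """Return df without the columns that have >= max_missing_pct % missing values.

    Hypothetical helper; the ">=" boundary is an assumption about the question.
    """
    # isna() gives a boolean frame; the column-wise mean is the missing fraction.
    missing_pct = df.isna().mean() * 100
    return df.loc[:, missing_pct < max_missing_pct]


# na_values="?" turns the placeholder into NaN while parsing, which avoids
# object (mixed-type) columns. skipinitialspace=True strips the blank that
# follows each comma in this file.
df = pd.read_csv(io.StringIO(CSV_TEXT), na_values="?", skipinitialspace=True)

x = 50  # illustrative threshold, in percent
cleaned = drop_sparse_features(df, x)

# feature3 is 75% missing and is dropped; feature4 is 25% missing and is kept.
print(cleaned)

# If a NumPy array is preferred, the remaining NaN values make it a float array.
array = cleaned.to_numpy()
```

If the file has already been loaded with the `?` strings in place, `df.replace("?", float("nan"))` followed by `pd.to_numeric` on each column should get to the same starting point, though converting at read time is usually simpler.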