I am looking for a way to get the class labels from my DataFrame, which contains rows of features.
For instance, in this example:
import pandas as pd

df = pd.DataFrame([
    ['1', 'a', 'bb', '0'],
    ['1', 'a', 'cc', '0'],
    ['2', 'a', 'dd', '1'],
    ['2', 'a', 'ee', '1'],
    ['3', 'a', 'ff', '2'],
    ['3', 'a', 'gg', '2'],
    ['3', 'a', 'hh', '2']], columns=['ID', 'name', 'type', 'class'])
df
  ID name type class
0  1    a   bb     0
1  1    a   cc     0
2  2    a   dd     1
3  2    a   ee     1
4  3    a   ff     2
5  3    a   gg     2
6  3    a   hh     2
My class array should be (i.e. for each ID the class value should be picked once):
class
array([0., 1., 2.])
EDIT
df['class'].values
produces array(['0', '0', '1', '1', '2', '2', '2'], dtype=object)
Expected answer:
I want array([0, 1, 2])
What is df.drop_duplicates('ID')['class'].values giving you?
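A minimal sketch of one way to get the integer array, under two assumptions: every row with the same ID carries the same class, and every class string parses as an integer. `drop_duplicates('ID')` keeps the first row for each ID. The labels are still strings at that point because the frame was built from strings, so `astype(int)` converts them. The variable name `labels` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([
    ['1', 'a', 'bb', '0'],
    ['1', 'a', 'cc', '0'],
    ['2', 'a', 'dd', '1'],
    ['2', 'a', 'ee', '1'],
    ['3', 'a', 'ff', '2'],
    ['3', 'a', 'gg', '2'],
    ['3', 'a', 'hh', '2']], columns=['ID', 'name', 'type', 'class'])

# Keep the first row seen for each ID (assumes the class is constant within an ID).
one_row_per_id = df.drop_duplicates('ID')

# The labels are strings ('0', '1', '2'); cast them to integers
# and pull out the underlying NumPy array.
labels = one_row_per_id['class'].astype(int).to_numpy()

print(labels)  # [0 1 2]
```

If the class can differ between rows of the same ID, this picks whichever row comes first, so it may be worth checking that assumption with `df.groupby('ID')['class'].nunique()` before relying on it.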