I have a DataFrame device_class as below:
Base G Pref Sier Val Other latest_class d_id
0 2 0 0 12 0 Val 38
12 0 0 0 0 0 Base 39
0 0 12 0 0 0 Pref 40
0 0 0 12 0 0 Sier 41
0 0 0 12 0 0 Sier 42
12 0 0 0 0 0 Base 43
0 0 0 0 0 12 Other 45
0 0 0 0 0 12 Other 46
0 12 0 0 0 0 G 47
0 0 12 0 0 0 Pref 48
0 0 0 0 0 12 Other 51
0 0 8 5 0 0 Sier 53
0 0 0 0 12 0 Val 54
0 0 0 0 12 0 Val 55
I want to select only the rows (devices) that meet all of the following:
1. The device has been in its latest class for a minimum of 3 consecutive months.
2. latest_class is not 'Other'; those records need to be filtered out.
3. The data above covers one year, and some devices (for example d_id 38) have been part of two classes during that year (G and Val). These devices also need to be filtered out.
So the expected output will be:
Base G Pref Sier Val Other latest_class d_id
12 0 0 0 0 0 Base 39
0 0 12 0 0 0 Pref 40
0 0 0 12 0 0 Sier 41
0 0 0 12 0 0 Sier 42
12 0 0 0 0 0 Base 43
0 12 0 0 0 0 G 47
0 0 12 0 0 0 Pref 48
0 0 0 0 12 0 Val 54
0 0 0 0 12 0 Val 55
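One possible reading of the three conditions, as a minimal sketch rather than a definitive answer. It assumes each class column holds the number of months the device spent in that class, so it checks a count of at least 3 rather than true consecutiveness (yearly totals alone cannot show that), and it treats "part of two classes" as more than one non-zero class column. The small sample frame and the names class_cols, months_in_latest, keep, and result are illustrative only.

```python
import numpy as np
import pandas as pd

class_cols = ["Base", "G", "Pref", "Sier", "Val", "Other"]
rows = [
    (0, 2, 0, 0, 12, 0, "Val", 38),
    (12, 0, 0, 0, 0, 0, "Base", 39),
    (0, 0, 0, 0, 0, 12, "Other", 45),
    (0, 0, 8, 5, 0, 0, "Sier", 53),
    (0, 0, 0, 0, 12, 0, "Val", 54),
]
device_class = pd.DataFrame(rows, columns=class_cols + ["latest_class", "d_id"])

counts = device_class[class_cols].to_numpy()

# Column position of each row's latest_class within class_cols
pos = pd.Index(class_cols).get_indexer(device_class["latest_class"])
months_in_latest = counts[np.arange(len(device_class)), pos]

keep = (
    (months_in_latest >= 3)                                   # condition 1 (as a count)
    & (device_class["latest_class"] != "Other").to_numpy()    # condition 2
    & ((counts > 0).sum(axis=1) == 1)                         # condition 3: one class only
)
result = device_class[keep]
```

On this sample, devices 38 and 53 are dropped for having two non-zero classes and device 45 for being 'Other'.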
I have done the below to get only the records whose value in the column named by latest_class is at least 3:
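import numpy as np
i = np.arange(len(device_class))  # row positions
# for each row, the position of the column whose name equals its latest_class
j = (device_class.columns[:-1].values[:, None] == device_class.latest_class.values).argmax(0)
# keep rows whose count in that column is at least 3
device_class_latest = device_class.iloc[np.flatnonzero(device_class.values[i, j] >= 3)]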
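To make the lookup above easier to follow, here is a small standalone sketch of the same broadcasting idea. The arrays cols, latest, and values are made-up stand-ins, not the real data.

```python
import numpy as np

cols = np.array(["Base", "G", "Val"])      # hypothetical column names
latest = np.array(["Val", "Base", "G"])    # hypothetical latest class per row
values = np.array([[0, 2, 12],
                   [12, 0, 0],
                   [0, 1, 0]])

# Shape (n_cols, n_rows): True where a column name equals that row's latest class
match = cols[:, None] == latest
j = match.argmax(0)          # column position of the match, one per row
i = np.arange(len(latest))   # row positions
picked = values[i, j]        # the value in each row's latest-class column
mask = picked >= 3
```

Here j is [2, 0, 1], picked is [12, 12, 1], and only the first two rows pass the threshold.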
How can I also apply the second and third conditions? Can someone please help me with this?