I have one dataframe of sessions, one row per session, so SID is unique. Each session has a doctor name.
| SID | Doctor | Patient |
|---|---|---|
| 1 | robby | david |
| 2 | langdon | sara |
| 3 | langdon | michael |
I have another dataframe with the SID and a record of who opened the patient file. The opener can be either the doctor or anyone else from the clinic. If two different people from the clinic open the patient file during the same session, I will have two rows with the same SID and different opener_name values.
| SID | opener_name |
|---|---|
| 1 | robby |
| 1 | dana |
| 2 | dana |
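
For reproducibility, the two sample tables above can be built like this. The variable names `sessions` and `openers` are placeholders chosen for this sketch, not names from the real data:

```python
import pandas as pd

# Sessions: one row per SID, so SID is unique here.
sessions = pd.DataFrame({
    "SID": [1, 2, 3],
    "Doctor": ["robby", "langdon", "langdon"],
    "Patient": ["david", "sara", "michael"],
})

# File openings: one row per (SID, opener_name) pair.
# SID 1 appears twice (two openers); SID 3 does not appear at all.
openers = pd.DataFrame({
    "SID": [1, 1, 2],
    "opener_name": ["robby", "dana", "dana"],
})
```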
I want to generate two true/false columns in the sessions dataframe:
- whether the doctor opened the file
- whether anyone opened the file at all (either the doctor or anyone else)

A session was not necessarily opened by anyone; if it was not, it won't appear in the second dataframe at all.
The output I desire is this:
| SID | Doctor | Patient | is_doctor_opened | is_anyone_opened |
|---|---|---|---|---|
| 1 | robby | david | True | True |
| 2 | langdon | sara | False | True |
| 3 | langdon | michael | False | False |
If I merge the two dataframes on session ID, I will get duplicate rows, and I'm not sure how to get rid of the duplicates in that scenario.
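
A minimal reproduction of the duplicate-row problem with the sample data. The frame names `sessions` and `openers` are assumptions made for this sketch:

```python
import pandas as pd

sessions = pd.DataFrame({
    "SID": [1, 2, 3],
    "Doctor": ["robby", "langdon", "langdon"],
    "Patient": ["david", "sara", "michael"],
})
openers = pd.DataFrame({
    "SID": [1, 1, 2],
    "opener_name": ["robby", "dana", "dana"],
})

# A left merge keeps every session, but SID 1 matches two opener rows,
# so it is repeated once per opener.
merged = sessions.merge(openers, on="SID", how="left")
print(merged)

# The merged frame has more rows than there are sessions.
print(len(sessions), len(merged))
```

Dropping duplicates on SID after this merge would keep only one opener per session, so whether the doctor's row survives would depend on row order.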
I've also tried playing around with simple booleans but I run into problems.
How do I get an organized dataframe with the booleans and keep it to one session, one row?
Would .drop_duplicates() be the right tool here?
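
One possible direction, sketched against the sample data only: compute each flag separately so the sessions frame never gains rows. The names `sessions`, `openers`, `doctor_opens`, and `flagged` are hypothetical, and the sketch assumes doctor and opener names are spelled identically in both frames:

```python
import pandas as pd

sessions = pd.DataFrame({
    "SID": [1, 2, 3],
    "Doctor": ["robby", "langdon", "langdon"],
    "Patient": ["david", "sara", "michael"],
})
openers = pd.DataFrame({
    "SID": [1, 1, 2],
    "opener_name": ["robby", "dana", "dana"],
})

# Treat each opening as a candidate (SID, Doctor) pair and deduplicate,
# so that a left merge on both keys can match at most one row per session.
doctor_opens = openers.rename(columns={"opener_name": "Doctor"}).drop_duplicates()
flagged = sessions.merge(
    doctor_opens, on=["SID", "Doctor"], how="left", indicator=True
)

# "_merge" is "both" only when the session's own doctor appears as an opener.
sessions["is_doctor_opened"] = (flagged["_merge"] == "both").to_numpy()

# Anyone opened the file if the SID appears in the openers frame at all.
sessions["is_anyone_opened"] = sessions["SID"].isin(openers["SID"])

print(sessions)
```

Because the merge keys are deduplicated first, the result stays at one row per session without a later cleanup step.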