I have a data frame that looks like this:
boat_type    boat_type_2
Not Known    Not Known
Not Known    kayak
ship         Not Known
Not Known    Not Known
ship         Not Known
And I want to create a third column, boat_type_final, that should look like this:
boat_type    boat_type_2    boat_type_final
Not Known    Not Known      cruise
Not Known    kayak          kayak
ship         Not Known      ship
Not Known    Not Known      cruise
ship         Not Known      ship
So basically, if 'Not Known' is present in both boat_type and boat_type_2, then the value should be 'cruise'. But if either of the first two columns contains a string other than 'Not Known', then boat_type_final should be filled with that string, either 'kayak' or 'ship'.
What's the most elegant way to do this? I've seen several options such as where, creating a function, and/or logic, and I'd like to know what a true pythonista would do.
Here's my code so far:
import pandas as pd
import numpy as np
data = [{'boat_type': 'Not Known', 'boat_type_2': 'Not Known'},
        {'boat_type': 'Not Known', 'boat_type_2': 'kayak'},
        {'boat_type': 'ship', 'boat_type_2': 'Not Known'},
        {'boat_type': 'Not Known', 'boat_type_2': 'Not Known'},
        {'boat_type': 'ship', 'boat_type_2': 'Not Known'}]
df = pd.DataFrame(data)
df['boat_type_final'] = np.where(df.boat_type.str.contains('Not'))...
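For reference, the rule described above could be sketched with two chained Series.where calls. This is only one possible reading of the requirement, not necessarily the most elegant one. The names known_1 and known_2 are made up for illustration, and the sketch assumes that when both columns hold a known value, boat_type takes precedence (the example data never has that case, so this is a guess).

```python
import pandas as pd

df = pd.DataFrame({
    'boat_type':   ['Not Known', 'Not Known', 'ship', 'Not Known', 'ship'],
    'boat_type_2': ['Not Known', 'kayak', 'Not Known', 'Not Known', 'Not Known'],
})

# Hypothetical helper masks: True where the column holds a real value.
known_1 = df['boat_type'] != 'Not Known'
known_2 = df['boat_type_2'] != 'Not Known'

# Keep boat_type where it is known, otherwise fall back to boat_type_2.
# Assumption: boat_type wins if both columns are known.
final = df['boat_type'].where(known_1, df['boat_type_2'])

# Rows where neither column is known get the default 'cruise'.
df['boat_type_final'] = final.where(known_1 | known_2, 'cruise')

print(df)
```

Series.where keeps the original value where the condition is True and substitutes the second argument elsewhere, so the two calls read as "use boat_type, else boat_type_2, else 'cruise'".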