I have two columns, and I need to take specific string information from each one and build a new column of labels based on it.
The "Name" column holds well names. I need to look at the last 4 characters of each well name and, if they contain "H", put "HZ" in the new column.
I need to do the same kind of thing when the "WELLTYPE" column contains specific words.
In Spotfire, a data analysis program, I can do all of this in one simple expression (see below).
case
When right([UWI],4)~="H" Then "HZ"
When [WELLTYPE]~="Horizontal" Then "HZ"
When [WELLTYPE]~="Deviated" Then "D"
When [WELLTYPE]~="Multilateral" Then "ML"
else "V"
End
What would be the best way to do this in Python Pandas?
Is there a simple, clean way to do this all at once, like in the Spotfire expression above?
Here is the data table with the two columns and the outcome column I am hoping for (it did not paste very cleanly). The code to build the table is given after it.
   Name        WELLTYPE          What I Want
0  HH-001HST2  Oil Horizontal    HZ
1  HH-001HST   Oil_Horizontal    HZ
2  HB-002H     Oil               HZ
3  HB-002      Water_Deviated    D
4  HB-002      Oil_Multilateral  ML
5  HB-004      Oil               V
6  HB-005      Source            V
7  BB-007      Water             V
Here is the code to create the dataframe:
import pandas as pd

# Dataframe with hopeful outcome
raw_data = {'Name': ['HH-001HST2', 'HH-001HST', 'HB-002H', 'HB-002', 'HB-002', 'HB-004', 'HB-005', 'BB-007'],
            'WELLTYPE': ['Oil Horizontal', 'Oil_Horizontal', 'Oil', 'Water_Deviated', 'Oil_Multilateral', 'Oil', 'Source', 'Water'],
            'What I Want': ['HZ', 'HZ', 'HZ', 'D', 'ML', 'V', 'V', 'V']}
df = pd.DataFrame(raw_data, columns=['Name', 'WELLTYPE', 'What I Want'])
df
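For reference, here is a rough sketch of how the Spotfire case expression might translate to pandas, using numpy.select so that the first matching condition wins, the same way a case statement does. It assumes that ~= can be treated as a plain, case-sensitive substring check, and that the well names live in "Name" (the Spotfire expression calls that column [UWI]). The output column name "WellClass" is made up for illustration. This is one possible approach, not necessarily the best one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['HH-001HST2', 'HH-001HST', 'HB-002H', 'HB-002',
             'HB-002', 'HB-004', 'HB-005', 'BB-007'],
    'WELLTYPE': ['Oil Horizontal', 'Oil_Horizontal', 'Oil', 'Water_Deviated',
                 'Oil_Multilateral', 'Oil', 'Source', 'Water'],
})

# Conditions are checked in order, like the "When" branches of the case expression.
# regex=False makes each check a literal substring test; na=False treats missing
# values as "no match" (assumption: that is the desired behaviour for blanks).
conditions = [
    df['Name'].str[-4:].str.contains('H', regex=False, na=False),
    df['WELLTYPE'].str.contains('Horizontal', regex=False, na=False),
    df['WELLTYPE'].str.contains('Deviated', regex=False, na=False),
    df['WELLTYPE'].str.contains('Multilateral', regex=False, na=False),
]
choices = ['HZ', 'HZ', 'D', 'ML']

# default plays the role of the "else" branch.
# 'WellClass' is a hypothetical column name, not one from the original data.
df['WellClass'] = np.select(conditions, choices, default='V')
print(df)
```

On the sample data above, "WellClass" should line up with the "What I Want" column. If the matching needs to be case-insensitive, str.contains also accepts case=False.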