Please try:
Option#1:
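import numpy as np
# Null REFs first, so the row carrying a REF is last within each image.
df1 = df.sort_values(['IMAGE_NAME', 'REF'], na_position='first')
# 'last' takes REF and CONFIDENCE from that final row; the text is joined in sorted order.
df1 = df1.groupby('IMAGE_NAME').agg({'DETECTEDTEXT': ' '.join, 'REF': 'last', 'CONFIDENCE': 'last'}).reset_index()[['IMAGE_NAME', 'REF', 'CONFIDENCE', 'DETECTEDTEXT']]
# An image without a REF should not report a CONFIDENCE.
df1.loc[df1['REF'].isnull(), 'CONFIDENCE'] = np.nan
df1.rename(columns={'DETECTEDTEXT': 'ALL_TEXT'}, inplace=True)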
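Option#2:
import numpy as np
# Use the string '0' as a stand-in for missing values so 'max' can compare REFs.
df1 = df.fillna('0')
df1 = df1.groupby('IMAGE_NAME').agg({'DETECTEDTEXT': ' '.join, 'REF': 'max'}).reset_index()
# Look up CONFIDENCE in the original df; the placeholder '0' finds no match and yields NaN.
df1 = df1.merge(df, on=['IMAGE_NAME', 'REF'], how='left')[['IMAGE_NAME', 'REF', 'CONFIDENCE', 'DETECTEDTEXT_x']]
df1 = df1.rename(columns={'DETECTEDTEXT_x': 'ALL_TEXT'})
df1['REF'] = df1['REF'].replace('0', np.nan)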
Both print:
   IMAGE_NAME       REF  CONFIDENCE                     ALL_TEXT
0     14.jpeg  9b01dc1e   98.488983  INY 9:36 MICKEYD19 Ln ADULT
1     15.jpeg       NaN         NaN                     Ln ADULT
Input df:
   DETECTEDTEXT  CONFIDENCE  IMAGE_NAME       REF  ALL_TEXT
0           INY   73.215164     14.jpeg       NaN       NaN
1          9:36   91.633514     14.jpeg       NaN       NaN
2     MICKEYD19   89.422897     14.jpeg       NaN       NaN
3            Ln   59.588081     14.jpeg       NaN       NaN
4         ADULT   98.488983     14.jpeg  9b01dc1e       NaN
5            Ln   59.588081     15.jpeg       NaN       NaN
6         ADULT   98.488983     15.jpeg       NaN       NaN
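To reproduce the input, the frame above can be rebuilt roughly like this. It is a minimal sketch that assumes the missing values are real NaN (not the string 'NaN') and that REF holds strings:

```python
import numpy as np
import pandas as pd

# Rebuild the sample input; np.nan marks a missing value.
df = pd.DataFrame({
    "DETECTEDTEXT": ["INY", "9:36", "MICKEYD19", "Ln", "ADULT", "Ln", "ADULT"],
    "CONFIDENCE": [73.215164, 91.633514, 89.422897, 59.588081,
                   98.488983, 59.588081, 98.488983],
    "IMAGE_NAME": ["14.jpeg"] * 5 + ["15.jpeg"] * 2,
    "REF": [np.nan, np.nan, np.nan, np.nan, "9b01dc1e", np.nan, np.nan],
    "ALL_TEXT": np.nan,  # scalar is broadcast to every row
})
print(df)
```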
Option#1:
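Option#1 is the more elegant of the two; it came to me after I wrote Option#2. It sorts on the 'IMAGE_NAME' and 'REF' combination so that the row carrying a REF ends up last within each image, then uses groupby with 'last' to take that row's REF and CONFIDENCE while joining all the DETECTEDTEXT values.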
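Why the sort matters can be seen on a tiny made-up frame (the names `toy`, `a.jpeg` and `abc123` are hypothetical, purely for illustration). As far as I know, groupby's 'last' skips nulls column by column, so REF and CONFIDENCE only come from the same row when that row sits at the end of its group:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "IMAGE_NAME": ["a.jpeg", "a.jpeg", "a.jpeg"],
    "REF": [np.nan, "abc123", np.nan],
    "CONFIDENCE": [10.0, 20.0, 30.0],
})

# Unsorted: REF is taken from row 1 (last non-null REF),
# but CONFIDENCE is taken from row 2 (last row of the group).
unsorted_last = toy.groupby("IMAGE_NAME").agg({"REF": "last", "CONFIDENCE": "last"})

# Null REFs sorted first: the REF row is now the last row of the group,
# so REF and CONFIDENCE come from the same row.
sorted_toy = toy.sort_values(["IMAGE_NAME", "REF"], na_position="first")
sorted_last = sorted_toy.groupby("IMAGE_NAME").agg({"REF": "last", "CONFIDENCE": "last"})

print(unsorted_last)
print(sorted_last)
```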
Option#2:
It first replaces all NaNs with the string '0' for ease of calculation. The groupby with 'REF': 'max' then returns 9b01dc1e for 14.jpeg and '0' for 15.jpeg (this is a string comparison, so it relies on '0' sorting below every real REF). Next, pd.merge picks the CONFIDENCE score corresponding to those REF values: for 14.jpeg it returns the match for 9b01dc1e from the original df, and for 15.jpeg it returns NaN since no row in the original df has a REF of '0'. Finally, '0' is turned back into NaN, so we get the required output.
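If you want to check which images actually found a match in the merge step, something like the following may help. It is only a sketch on a cut-down copy of the sample data; `best` and `merged` are names I made up, and `indicator=True` adds a `_merge` column showing where each row came from:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "CONFIDENCE": [59.588081, 98.488983, 59.588081, 98.488983],
    "IMAGE_NAME": ["14.jpeg", "14.jpeg", "15.jpeg", "15.jpeg"],
    "REF": [np.nan, "9b01dc1e", np.nan, np.nan],
})

# Per image, the largest REF after replacing missing values with '0'.
best = (df.fillna({"REF": "0"})
          .groupby("IMAGE_NAME", as_index=False)
          .agg({"REF": "max"}))

# Left-merge against the original rows; '_merge' is 'both' when a
# matching (IMAGE_NAME, REF) row exists and 'left_only' otherwise.
merged = best.merge(df, on=["IMAGE_NAME", "REF"], how="left", indicator=True)
print(merged)
```

Rows marked 'left_only' are the images that had no REF at all, which is why their CONFIDENCE comes back as NaN.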
Note:
The code may need some changes if an image can have multiple non-null REF values. In that case, some extra pre-processing would be needed to decide which REF to keep. Other than that, this should work.
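One possible way to handle several non-null REF values per image is sketched below. It assumes you want the REF with the highest CONFIDENCE, which is my guess at a sensible rule and not something the question states; the data and the names `with_ref`, `best` and `result` are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "DETECTEDTEXT": ["A", "B", "C", "D"],
    "CONFIDENCE": [90.0, 95.0, 80.0, 70.0],
    "IMAGE_NAME": ["1.jpeg", "1.jpeg", "1.jpeg", "2.jpeg"],
    "REF": ["ref1", "ref2", np.nan, np.nan],
})

# Join all detected text per image, in the original row order.
all_text = df.groupby("IMAGE_NAME")["DETECTEDTEXT"].agg(" ".join).rename("ALL_TEXT")

# Among rows that have a REF, keep the highest-CONFIDENCE row per image.
with_ref = df.dropna(subset=["REF"])
best = with_ref.loc[with_ref.groupby("IMAGE_NAME")["CONFIDENCE"].idxmax(),
                    ["IMAGE_NAME", "REF", "CONFIDENCE"]]

# Left-join so images without any REF keep NaN in REF and CONFIDENCE.
result = all_text.reset_index().merge(best, on="IMAGE_NAME", how="left")
result = result[["IMAGE_NAME", "REF", "CONFIDENCE", "ALL_TEXT"]]
print(result)
```

A different tie-breaking rule (first REF seen, most frequent REF, and so on) would only change how `best` is built.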