I have a DataFrame with two columns, a and b. I want to replace the NaN values in column b. E.g., for the value 123 in column a, column b has both abc and NaN. I want both rows to have abc:
df
     a    b
0  123  NaN
1  123  abc
2  456  def
3  456  NaN
My expected result is:
df
     a    b
0  123  abc
1  123  abc
2  456  def
3  456  def
Sample data:
import pandas as pd
from io import StringIO
s = '''\
a,b
123,NaN
123,abc
456,def
456,NaN
'''
df = pd.read_csv(StringIO(s))
Describing the issue and what I have tried:
df.loc[df.a == 123, 'b'] = "abc"
Here I'm only able to change b for one particular value, i.e., set 'b' to abc wherever 'a' is 123.
But I only want to update the rows where df.a == 123 and 'b' is NaN.
So I tried this:
df.loc[df.a == NaN, 'b'] = "abc"
But this set every empty value in df to abc.
So how do I proceed from here?
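For reference, the single-value case described above can be sketched by combining both conditions in one boolean mask. This is only a minimal sketch on the four-row sample data; the name `mask` is an assumption for illustration. A comparison with NaN using `==` is always False, so `isna()` is used to detect the missing values.

```python
import numpy as np
import pandas as pd

# Same four rows as the sample data in the question.
df = pd.DataFrame({'a': [123, 123, 456, 456],
                   'b': [np.nan, 'abc', 'def', np.nan]})

# Select rows where a is 123 AND b is missing.
# NaN never compares equal to anything, so use isna() rather than ==.
mask = (df['a'] == 123) & df['b'].isna()
df.loc[mask, 'b'] = 'abc'

print(df)
```

This only handles one hard-coded value of a; the rows for 456 are left unchanged, so it does not generalise to every group.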
Edit 2: Sample data 2
import numpy as np
raw_data = {'a': [123, 123, 456, 456, 789],
            'b': [np.nan, 'abc', 'def', np.nan, np.nan],
            'c': [np.nan, np.nan, 0, np.nan, np.nan]}
df = pd.DataFrame(raw_data, columns=['a', 'b', 'c'])
Ans:
df['b'] = df['a'].map(df.groupby('a')['b'].first()).fillna(df['b'])
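A self-contained sketch of the line above, run against Sample data 2, may make the behaviour easier to check. The names `first_b` and `alt` are assumptions for illustration. `groupby(...).first()` takes the first non-null b for each value of a, and `map` broadcasts it back to every row. The `alt` line shows what should be an equivalent formulation using `transform('first')`.

```python
import numpy as np
import pandas as pd

raw_data = {'a': [123, 123, 456, 456, 789],
            'b': [np.nan, 'abc', 'def', np.nan, np.nan],
            'c': [np.nan, np.nan, 0, np.nan, np.nan]}
df = pd.DataFrame(raw_data, columns=['a', 'b', 'c'])

# Alternative formulation, computed before b is overwritten:
# fill missing b with the first non-null b of the same group.
alt = df['b'].fillna(df.groupby('a')['b'].transform('first'))

# First non-null b for each distinct value of a.
first_b = df.groupby('a')['b'].first()

# Look up each row's a in first_b; fall back to the original b.
df['b'] = df['a'].map(first_b).fillna(df['b'])

print(df)
```

Rows with a == 789 have no non-null b at all, so b stays missing for that group. Column c is not touched.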
replace and iloc couldn't succeed. Research: I tried replace and iloc. Sample data: I updated the question with sample data. Any other suggestions, @AMC?