Combine two dataframes in pandas

Question

I have two dataframes:

df:

portfolio  symbol  id  var1  var2  var3

df1:

symbol  sector  market  count

I want to add the columns sector and market from df1 to df. df1 has unique values for symbol and is therefore smaller than df, which is the original dataframe.

I tried:

pd.merge(df,df1,on='symbol',how='outer')

But the output has more rows than desired. Can anyone help me see what I am missing here?

Thanks

"But the output is extending rows than desired." It seems it should not. How does pd.merge(df,df1.drop_duplicates('symbol'),on='symbol',how='outer') work for you?
– jezrael, Commented Apr 24, 2020 at 13:25

NYC Coder · Accepted Answer · 2020-04-24 13:32:34Z

2

Have you tried doing an inner join?

df.merge(df1, on='symbol', how='inner')

Dnorious · Accepted Answer · 2020-04-24 13:39:56Z

1

If you do an outer join, the result keeps every symbol that appears in either dataframe, so any symbol that exists only in df1 adds a row that was not in df. If you only want rows whose symbol appears in both dataframes, you should use an inner join.
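To make the row counts concrete, here is a minimal sketch with made-up data. The symbols, sectors and values are hypothetical stand-ins, not the asker's data. It also shows how='left', which is not mentioned above; it is included as an assumption about what is wanted when every row of df should be kept.

```python
import pandas as pd

# Hypothetical stand-in for the original dataframe (repeated symbols).
df = pd.DataFrame({
    "portfolio": ["p1", "p1", "p2", "p2"],
    "symbol": ["AAA", "BBB", "AAA", "CCC"],
    "var1": [1, 2, 3, 4],
})

# Hypothetical lookup dataframe with unique symbols.
# DDD does not occur in df, and CCC does not occur in df1.
df1 = pd.DataFrame({
    "symbol": ["AAA", "BBB", "DDD"],
    "sector": ["tech", "energy", "health"],
    "market": ["US", "US", "EU"],
})

outer = df.merge(df1, on="symbol", how="outer")  # symbols from either side
inner = df.merge(df1, on="symbol", how="inner")  # symbols present in both
left = df.merge(df1, on="symbol", how="left")    # every row of df, matched or not

print(len(df), len(outer), len(inner), len(left))  # 4 5 3 4
```

With this data the outer join gains one row (for DDD), the inner join loses one (for CCC), and the left join keeps exactly the rows of df, with NaN in sector and market for CCC.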

dper · Accepted Answer · 2020-04-24 14:59:07Z

1

My apologies, I didn't realise that an outer join would also create rows for values in the second dataframe that are not present in the first dataframe. That is why I was getting extra rows. To remove them I added df7 = df.dropna(subset=['symbol'])
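One way to see which rows an outer join brought in from the second dataframe is the indicator option of merge. The sketch below uses small made-up dataframes (hypothetical data and column values), and the filtering step is only one possible way to drop the extra rows, not necessarily what was done above.

```python
import pandas as pd

# Hypothetical data: DDD exists only in the second dataframe.
df = pd.DataFrame({"symbol": ["AAA", "BBB", "AAA"], "id": [1, 2, 3]})
df1 = pd.DataFrame({
    "symbol": ["AAA", "BBB", "DDD"],
    "sector": ["tech", "energy", "health"],
})

# indicator=True adds a "_merge" column with the values
# "left_only", "right_only" or "both" for each row.
merged = df.merge(df1, on="symbol", how="outer", indicator=True)

# Rows that came only from df1 are the extra ones.
extra = merged[merged["_merge"] == "right_only"]
print(extra["symbol"].tolist())

# Keep everything except the df1-only rows, then drop the helper column.
trimmed = merged[merged["_merge"] != "right_only"].drop(columns="_merge")
print(len(trimmed))
```

Here trimmed has the same number of rows as df, which is also what a merge with how='left' would return directly.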
