I have 2 dataframes:

df:

portfolio  symbol  id  var1  var2  var3 

df1:

symbol  sector  market  count 

I want to add the columns sector and market from df1 to df. df1 has unique values for symbol, so it is a smaller dataframe than df, which is the original dataframe.

I tried:

pd.merge(df,df1,on='symbol',how='outer')

But the output has more rows than desired. Can anyone help me see what I am missing here?

Thanks

  • "But the output is extending rows than desired." It seems it should not. How does pd.merge(df,df1.drop_duplicates('symbol'),on='symbol',how='outer') work? Commented Apr 24, 2020 at 13:25
  • @jezrael There are no duplicated values in df1. Commented Apr 24, 2020 at 13:26
  • OK, so do you get the same issue if you use my code? Commented Apr 24, 2020 at 13:27

3 Answers

Have you tried doing an inner join?

df.merge(df1, on='symbol', how='inner')
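As a rough sketch of how this plays out, the example below uses made-up toy data (the column names follow the question, the values are invented). It selects only sector and market from df1 before merging, and compares how='inner' with how='left', which is a swapped-in alternative that keeps rows of df whose symbol has no match in df1.

```python
import pandas as pd

# Toy data: column names follow the question, values are invented.
df = pd.DataFrame({
    "portfolio": ["P1", "P1", "P2", "P2"],
    "symbol": ["AAA", "BBB", "AAA", "CCC"],
    "id": [1, 2, 3, 4],
})
df1 = pd.DataFrame({
    "symbol": ["AAA", "BBB", "DDD"],
    "sector": ["Tech", "Energy", "Health"],
    "market": ["US", "US", "EU"],
    "count": [2, 1, 0],
})

# Keep only the key and the two columns we want to add.
lookup = df1[["symbol", "sector", "market"]]

# Inner join: rows of df without a match in df1 (CCC) are dropped.
inner = df.merge(lookup, on="symbol", how="inner")

# Left join: every row of df is kept; CCC gets NaN for sector and market.
left = df.merge(lookup, on="symbol", how="left")

print(len(df), len(inner), len(left))
```

If every symbol in df also appears in df1, the inner and left joins should give the same rows; they differ only when df contains symbols that df1 lacks.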

If you do an outer join, the result keeps every row from both dataframes: all the rows of df, plus one extra row for each symbol that appears only in df1. If you only want rows whose symbol appears in both dataframes, you should use an inner join.
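To make the row counts concrete, here is a minimal sketch on invented toy data, where df repeats a symbol and df1 has one symbol (DDD) that df lacks. The how='left' case is included for comparison only.

```python
import pandas as pd

# Toy data: values are invented for illustration.
df = pd.DataFrame({
    "symbol": ["AAA", "AAA", "BBB", "CCC"],
    "var1": [10, 20, 30, 40],
})
df1 = pd.DataFrame({
    "symbol": ["AAA", "BBB", "DDD"],
    "sector": ["Tech", "Energy", "Health"],
})

# Count the rows each join type produces.
row_counts = {
    how: len(pd.merge(df, df1, on="symbol", how=how))
    for how in ("outer", "inner", "left")
}
print(row_counts)  # {'outer': 5, 'inner': 3, 'left': 4}
```

Here the outer join has one row more than df because of DDD, and the inner join has one row fewer because CCC has no match.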

My apologies, I didn't realise that an outer join also creates rows for values in the second dataframe that are not present in the first dataframe. That is why I was getting extra rows. To remove them, I added df7 = df.dropna(subset=['symbol'])
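One way to see which rows the outer join added is the indicator parameter of pd.merge, which adds a _merge column marking where each row came from. The sketch below uses invented toy data and is only one possible way to do the cleanup, not necessarily what was done above.

```python
import pandas as pd

# Toy data: values are invented for illustration.
df = pd.DataFrame({
    "portfolio": ["P1", "P1", "P2", "P2"],
    "symbol": ["AAA", "BBB", "AAA", "CCC"],
})
df1 = pd.DataFrame({
    "symbol": ["AAA", "BBB", "DDD"],
    "sector": ["Tech", "Energy", "Health"],
})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only".
merged = pd.merge(df, df1, on="symbol", how="outer", indicator=True)

# Rows that exist only in df1 are the extra ones.
extra = merged[merged["_merge"] == "right_only"]
print(extra["symbol"].tolist())  # ['DDD']

# Drop the extra rows and the helper column.
cleaned = merged[merged["_merge"] != "right_only"].drop(columns="_merge")
print(len(cleaned) == len(df))  # True
```

Note that in the merged result the symbol key is filled in for the extra rows, so a dropna would need to target a column that only df supplies (portfolio in this toy data) to remove them. Merging with how='left' avoids creating those rows in the first place.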
