109

I have to merge two dataframes:

df1

company,standard
tata,A1
cts,A2
dell,A3

df2

company,return
tata,71
dell,78
cts,27
hcl,23

I have to unify both dataframes to one dataframe. I need output like:

company,standard,return
tata,A1,71
cts,A2,27
dell,A3,78
0

4 Answers 4

184

Use merge:

print (pd.merge(df1, df2, on='company'))

Sample:

print (df1)
  company standard
0    tata       A1
1     cts       A2
2    dell       A3

print (df2)
  company  return
0    tata      71
1    dell      78
2     cts      27
3     hcl      23

print (pd.merge(df1, df2, on='company'))
  company standard  return
0    tata       A1      71
1     cts       A2      27
2    dell       A3      78
Sign up to request clarification or add additional context in comments.

6 Comments

how to sum %values in pandas?
i have one column (return)...that column contains 10%,10%,20%
i have to sum this values and print 40%
You have to first remove %, then sum - I think you can post new question.
Thank you for this answer, I used .join and .concat; none worked. This one did :D
|
5

I think we can also use

df1.merge(df2,on='Company')

Comments

-1

The following solution worked perfect for me on a MSA(microservice) architecture.

Required Packages

pip install mysql-connector-python SQLAlchemy pandas

Solution

import pandas as pd
from main.db.sql_alchemy import (
    engine_siaw,
    engine_sip,
)

altas_periodo_id = 10
df_altas = pd.read_sql(
    SQL_ALTAS_EST.format(altas_periodo_id=altas_periodo_id),
    con=engine_sip,
)
df_tiendas = pd.read_sql(SQL_ALTAS_TIENDAS, con=engine_siaw)
df_empresas = pd.read_sql(SQL_ALTAS_EMPRESAS, con=engine_siaw)

df_altas_tiendas = df_altas.merge(
    df_tiendas, left_on="tienda", right_on="nombre"
)
df_file = df_altas_tiendas.merge(
    df_empresas, left_on="empresa", right_on="razon_social"
)

The SQL variables containing the raw SQL:

SQL_ALTAS_EST = """
select
    td.codigo as tipo_doc,
    ad.numero_documento as numero_doc,
    cna.codigo as nacionalidad,
    ad.tienda,
    ad.empresa
from sip_registroaltadetalle as ad
    inner join sip_personal sp on sp.codigo = ad.codigo
    left join sip_tipodocumentoidentidad as td on td.pkid = ad.tipo_documento
    left join sip_multitabladetalle as cna on cna.pkid = sp.nacionalidad_id
where ad.codigo is not null and ad.parent_id = '{altas_periodo_id}'
""".strip()

SQL_ALTAS_TIENDAS = """
select codigo as codigo_establecimiento, nombre from catalogo_tienda ct
""".strip()

SQL_ALTAS_EMPRESAS = """
select razon_social, ruc from catalogo_empresa ce
""".strip()

the database connections:

from django.conf import settings
from sqlalchemy import create_engine

cn_sip = f"mysql+mysqlconnector://{settings.DB_USER}:{settings.DB_PASSWORD}@{settings.DB_HOST}:{settings.DB_PORT}/{settings.DB_NAME}"  # noqa
cn_siaw = f"mysql+mysqlconnector://{settings.SIAW_DB_USER}:{settings.SIAW_DB_PASSWORD}@{settings.SIAW_DB_HOST}:{settings.SIAW_DB_PORT}/{settings.SIAW_DB_NAME}"  # noqa

engine_siaw = create_engine(cn_siaw)
engine_sip = create_engine(cn_sip)

I really hope u find it helpful.

Comments

-5

In order to successfully merge two data frames based on common column(s), the dtype for common column(s) in both data frames must be the same! dtype for a column can be changed by:

df['commonCol'] = df['commonCol'].astype(int)

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.