3

I have a pandas.DataFrame with columns having different data types like object, int64 , etc. I have a postgresql table created with appropriate data types. I want to insert all the dataframe data into postgresql table. How should manage to do this?

Note : The data in pandas is coming from another source so the data types are not specified manually by me.

2 Answers 2

2

The easiest way is to use sqlalchemy:

from sqlalchemy import create_engine

engine = create_engine('postgresql://abc:def@localhost:5432/database')
df.to_sql('table_name', engine, if_exists='replace')

If the table exists, you can choose what you want to do with if_exists option

if_exists {‘fail’, ‘replace’, ‘append’}, default ‘fail’

If the table does not exist, it will create a new table with the corresponding datatypes.

Sign up to request clarification or add additional context in comments.

2 Comments

It sounds like the table is already created and they want to append to it. In this case the datatypes are already taken care of. If my memory serves me correctly it will throw an exception if there is a data entry that is not convertable to the data type of the table. A more challenging issue is creating the table with the correct datatypes automatically, though that function makes a pretty good go at it!
Works perfectly for me, recognized and converted datatypes properly.
2

Maybe you have the problem I had that you want to create new columns on the existing table, and then the solution to replace or append the table does not work for me. Shortly for me it looks like this ( I guess for the converting of datatypes is no general solution and you should adapt for your need):

lg.debug('table gets extended with the columns: '+",".join(dataframe.dtypes))
#check whether we have to add a field
df_postgres={'object':'text','int64':'bigint','float64':'numeric','bool':'boolean','datetime64':'timestamp','timedelta':'interval'}
for col in dataframe.columns:
    #convert the columns to postgres:
    if str(dataframe.dtypes[col]) in df_postgres:
        dbo.table_column_if_not_exists(self.table_name,col,df_postgres[str(dataframe.dtypes[col])],original_endpoint)
    else:
        lg.error('Fieldtype '+str(dataframe.dtypes[col])+' is not configured')

and the function to create the columns:

def table_column_if_not_exists(self,table,name,dtype,original_endpoint=''):
    self.query(query='ALTER TABLE '+table+' ADD COLUMN IF NOT EXISTS '+name+' '+dtype)
    #make a comment when we know which source create this column
    if original_endpoint!='':
            self.query(query='comment on column '+table+'.'+name+" IS '"+original_endpoint+"'")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.