
I've got an MS Access table (SearchAdsAccountLevel) which needs to be updated frequently from a python script. I've set up the pyodbc connection and now I would like to UPDATE/INSERT rows from my pandas df to the MS Access table based on whether the Date_ AND CampaignId fields match with the df data.

Looking at previous examples I've built the UPDATE statement, which uses iterrows to iterate through all rows of the df and executes the SQL below for each:

    connection_string = (
            r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
            r"DBQ=c:\AccessDatabases\Database2.accdb;"
    )
    cnxn = pyodbc.connect(connection_string, autocommit=True)
    crsr = cnxn.cursor()

    for index, row in df.iterrows():
            crsr.execute("UPDATE SearchAdsAccountLevel SET [OrgId]=?, [CampaignName]=?, [CampaignStatus]=?, [Storefront]=?, [AppName]=?, [AppId]=?, [TotalBudgetAmount]=?, [TotalBudgetCurrency]=?, [DailyBudgetAmount]=?, [DailyBudgetCurrency]=?, [Impressions]=?, [Taps]=?, [Conversions]=?, [ConversionsNewDownloads]=?, [ConversionsRedownloads]=?, [Ttr]=?, [LocalSpendAmount]=?, [LocalSpendCurrency]=?, [ConversionRate]=?, [Week_]=?, [Month_]=?, [Year_]=?, [Quarter]=?, [FinancialYear]=?, [RowUpdatedTime]=? WHERE [Date_]=? AND [CampaignId]=?",
                        row['OrgId'],
                        row['CampaignName'],
                        row['CampaignStatus'],
                        row['Storefront'],
                        row['AppName'],
                        row['AppId'],
                        row['TotalBudgetAmount'],
                        row['TotalBudgetCurrency'],
                        row['DailyBudgetAmount'],
                        row['DailyBudgetCurrency'],
                        row['Impressions'],
                        row['Taps'],
                        row['Conversions'],
                        row['ConversionsNewDownloads'],
                        row['ConversionsRedownloads'],
                        row['Ttr'],
                        row['LocalSpendAmount'],
                        row['LocalSpendCurrency'],
                        row['ConversionRate'],
                        row['Week_'],
                        row['Month_'],
                        row['Year_'],
                        row['Quarter'],
                        row['FinancialYear'],
                        row['RowUpdatedTime'],
                        row['Date_'],
                        row['CampaignId'])
    crsr.commit()

I would like to iterate through each row within my df (around 3000) and if the ['Date_'] AND ['CampaignId'] match I UPDATE all other fields. Otherwise I want to INSERT the whole df row in my Access Table (create new row). What's the most efficient and effective way to achieve this?

  • There are many ways to approach your question. 1- Don't execute queries one by one; use a parametrized query. 2- Use yields in Python ... Commented May 2, 2019 at 15:32
  • @Ilmari - Done. Thanks for the heads-up. Commented Jun 4, 2019 at 19:00

1 Answer


Consider DataFrame.values and pass the resulting list into an executemany call, making sure to order the columns to match the placeholders of the UPDATE query:

cols = ['OrgId', 'CampaignName', 'CampaignStatus', 'Storefront',
        'AppName', 'AppId', 'TotalBudgetAmount', 'TotalBudgetCurrency',
        'DailyBudgetAmount', 'DailyBudgetCurrency', 'Impressions',
        'Taps', 'Conversions', 'ConversionsNewDownloads', 'ConversionsRedownloads',
        'Ttr', 'LocalSpendAmount', 'LocalSpendCurrency', 'ConversionRate',
        'Week_', 'Month_', 'Year_', 'Quarter', 'FinancialYear',
        'RowUpdatedTime', 'Date_', 'CampaignId']

sql = '''UPDATE SearchAdsAccountLevel 
            SET [OrgId]=?, [CampaignName]=?, [CampaignStatus]=?, [Storefront]=?, 
                [AppName]=?, [AppId]=?, [TotalBudgetAmount]=?, 
                [TotalBudgetCurrency]=?, [DailyBudgetAmount]=?, 
                [DailyBudgetCurrency]=?, [Impressions]=?, [Taps]=?, [Conversions]=?, 
                [ConversionsNewDownloads]=?, [ConversionsRedownloads]=?, [Ttr]=?, 
                [LocalSpendAmount]=?, [LocalSpendCurrency]=?, [ConversionRate]=?,
                [Week_]=?, [Month_]=?, [Year_]=?, [Quarter]=?, [FinancialYear]=?, 
                [RowUpdatedTime]=? 
          WHERE [Date_]=? AND [CampaignId]=?'''

crsr.executemany(sql, df[cols].values.tolist())   
cnxn.commit()
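To see why the column order matters: executemany binds each inner list positionally to the `?` placeholders, so the key columns (Date_, CampaignId) must come last to line up with the WHERE clause. A minimal sketch with a hypothetical two-row frame and a shortened column list:

```python
import pandas as pd

# hypothetical frame with columns in arbitrary order
df = pd.DataFrame({'CampaignId': [1, 2],
                   'Date_': ['2019-05-01', '2019-05-02'],
                   'Taps': [10, 20]})

# reorder to match the placeholders of a query like:
#   UPDATE ... SET [Taps]=? WHERE [Date_]=? AND [CampaignId]=?
cols = ['Taps', 'Date_', 'CampaignId']
params = df[cols].values.tolist()
# each inner list is one parameter row: [Taps, Date_, CampaignId]
```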

For the insert, use a temp (staging) table with the exact structure of the final table, which you can create once with a make-table query: SELECT TOP 1 * INTO temp FROM final. This temp table is cleaned out and reloaded with all data frame rows on each run. A final query then migrates only the new rows from temp into final using NOT EXISTS, NOT IN, or LEFT JOIN ... IS NULL. You can run this query anytime and never worry about duplicates on the Date_ and CampaignId columns.

# CLEAN OUT TEMP
sql = '''DELETE FROM SearchAdsAccountLevel_Temp'''
crsr.execute(sql)
cnxn.commit()

# APPEND TO TEMP
sql = '''INSERT INTO SearchAdsAccountLevel_Temp (OrgId, CampaignName, CampaignStatus, Storefront,
                                AppName, AppId, TotalBudgetAmount, TotalBudgetCurrency,
                                DailyBudgetAmount, DailyBudgetCurrency, Impressions,
                                Taps, Conversions, ConversionsNewDownloads, ConversionsRedownloads,
                                Ttr, LocalSpendAmount, LocalSpendCurrency, ConversionRate,
                                Week_, Month_, Year_, Quarter, FinancialYear,
                                RowUpdatedTime, Date_, CampaignId)    
         VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, 
                 ?, ?, ?, ?, ?, ?, ?, ?, ?, 
                 ?, ?, ?, ?, ?, ?, ?, ?, ?);'''

crsr.executemany(sql, df[cols].values.tolist())   
cnxn.commit()

# MIGRATE TO FINAL
sql = '''INSERT INTO SearchAdsAccountLevel 
         SELECT t.* 
         FROM SearchAdsAccountLevel_Temp t
         LEFT JOIN SearchAdsAccountLevel f
            ON t.Date_ = f.Date_ AND t.CampaignId = f.CampaignId
         WHERE f.OrgId IS NULL'''
crsr.execute(sql)
cnxn.commit()
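The three steps above (UPDATE matches in place, stage everything, migrate only unmatched rows) don't depend on Access specifically. A self-contained sketch of the same pattern against an in-memory SQLite database, with made-up table and column names for illustration:

```python
import sqlite3

cnxn = sqlite3.connect(":memory:")
crsr = cnxn.cursor()

# minimal stand-ins for the final and staging tables
crsr.execute("CREATE TABLE final (Date_ TEXT, CampaignId INTEGER, Taps INTEGER)")
crsr.execute("CREATE TABLE staging (Date_ TEXT, CampaignId INTEGER, Taps INTEGER)")
crsr.execute("INSERT INTO final VALUES ('2019-05-01', 1, 10)")

# incoming rows: the first matches final (update), the second is new (insert)
rows = [('2019-05-01', 1, 99), ('2019-05-02', 2, 20)]

# 1) UPDATE existing rows in place (reorder params to match the placeholders)
crsr.executemany("UPDATE final SET Taps=? WHERE Date_=? AND CampaignId=?",
                 [(t, d, c) for d, c, t in rows])

# 2) clean out and reload the staging table
crsr.execute("DELETE FROM staging")
crsr.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)

# 3) migrate only rows that final does not already have
crsr.execute("""INSERT INTO final
                SELECT t.* FROM staging t
                LEFT JOIN final f
                  ON t.Date_ = f.Date_ AND t.CampaignId = f.CampaignId
                WHERE f.CampaignId IS NULL""")
cnxn.commit()

result = sorted(crsr.execute("SELECT * FROM final").fetchall())
```

Running this leaves `final` with the matched row updated and the new row appended, with no duplicates per (Date_, CampaignId).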

11 Comments

how can I also integrate an insert query to create rows which are not currently present in my table but are in df?
That is a different question than this as it involves a different query and data handling. Usually, staging temp tables are used to avoid append duplicates with NOT IN, NOT EXISTS, LEFT JOIN/IS NULL clauses against final tables in an INSERT...SELECT call.
+1 for the df[cols].values hint. To be fair, the "insert" aspect was part of the question (although buried near the bottom), and that is covered by this answer via a comment to the dup.
One or more of your data frame column types do not match database table types. Try importing a select query of few rows with read_sql and compare dtypes with your current df.
You need to explicitly name the columns in last append query's INSERT and SELECT clauses. For latter, do not use the abbreviated *.
