
I am trying to update a MySQL database table. I started by creating an ORM object to help me reduce the size of the update query by using UPDATE with WHERE conditions.

First of all, I created an ORM variable. This ORM object is data filtered from a DataFrame, using a condition on another DataFrame that is read from a CSV file with pd.read_csv. This is my simple rule, which makes it easy to create conditions like this:

myOutlook_inBox = pd.read_csv(r'' + mydir + 'test.CSV',
                              usecols=['Subject', 'Body', 'From: (Name)', 'To: (Name)'],
                              encoding='latin-1')

This is the simple ORM data extracted from the pd.read_csv result. It extracts the site code from the CSV column myOutlook_inBox['Subject']:

replaced_sbj_value = myOutlook_inBox['Subject'].str.extract(pat=r'(L(?:DEL|CAI|SIN).\d{5})').dropna()

myOutlook_inBox["Subject"] = replaced_sbj_value
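To make the extraction step concrete, here is a minimal sketch that applies the same pattern with the standard-library re module instead of pandas. The subject lines and the helper name extract_site_code are invented for illustration and do not come from the CSV file.

```python
import re

# Same pattern as in the question. Note that "." matches any single character,
# so the code is "L", one of DEL/CAI/SIN, one more character, then five digits.
SITE_CODE = re.compile(r'(L(?:DEL|CAI|SIN).\d{5})')

# Hypothetical subject lines, invented for illustration.
subjects = [
    "RE: LCAIN20804 acceptance",
    "FW: site LDELE30434 snags",
    "Weekly report",
]


def extract_site_code(subject):
    """Return the first site code found in the subject, or None if there is none."""
    match = SITE_CODE.search(subject)
    return match.group(1) if match else None


# Drop subjects without a site code, like .dropna() does in the pandas version.
codes = [code for code in map(extract_site_code, subjects) if code is not None]
print(codes)
```

Subjects without a site code are dropped here, which mirrors what .dropna() does after .str.extract.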

This is the condition I am using to filter specific data:

frm_mwfy_to_te = myOutlook_inBox.loc[
    myOutlook_inBox['From: (Name)'].str.contains("mowafy", na=False)
    & myOutlook_inBox['To: (Name)'].str.contains("te", na=False)
].drop_duplicates(keep=False)
frm_mwfy_to_te.Subject

This variable holds the rows of the MySQL table, filtered by matching the site_code column against the Subject values:

filtered_data = all_data.loc[all_data.site_code.str.contains('|'.join(frm_mwfy_to_te.Subject))]
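The filter above joins the subjects with "|" and treats the result as a regular expression, so it is a substring match rather than an equality test. The sketch below reproduces that behaviour in plain Python so the difference is visible. The site codes in the list are made up, and filter_codes is a hypothetical helper, not part of pandas.

```python
import re

# Subjects as shown in the example output of frm_mwfy_to_te.Subject.
subjects = ["LCAIN20804", "LDELE30434", "LSINI20260"]

# Hypothetical site_code values, invented for illustration.
site_codes = ["LCAIN20804", "LCAIN99999", "XLSINI20260-B", "LDELE30434"]


def filter_codes(site_codes, subjects):
    """Keep the site codes that contain any of the subjects as a substring."""
    if not subjects:
        # An empty alternation would match every string, so return nothing instead.
        return []
    # re.escape keeps characters such as "." or "(" in a subject from being
    # interpreted as regex syntax.
    pattern = re.compile("|".join(re.escape(s) for s in subjects))
    return [code for code in site_codes if pattern.search(code)]


matching = filter_codes(site_codes, subjects)
print(matching)
```

Because "XLSINI20260-B" is kept here, a later WHERE site_code = ... comparison on the subjects would not touch that row; comparing against the site_code values of the filtered rows avoids the mismatch.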

This is my SQL query. What I need now is a query that updates the column called "pending". It should filter on the column called "site_code" and update the rows whose value is contained in filtered_data, replacing the value in the pending column with TE.

update_db_query = engine.execute("UPDATE govtracker SET pending = 'TE' WHERE site_code = " + filtered_data)

I think I am taking the wrong approach. Any ideas on how to solve this?

Note: I don't need to mention the old value in my query. I just want to update the value in the same row, according to the filtered DataFrame, with the new value I mentioned in the query.

For example, take frm_mwfy_to_te.Subject, where Subject is the name of a column in the CSV file.

Let's say this is the output of the ORM frm_mwfy_to_te.Subject:

 Subject
 LCAIN20804
 LDELE30434
 LSINI20260

and this is my whole code

from sqlalchemy import types, create_engine
import pandas as pd
import os
import csv
import MySQLdb


# MySQL Connection
MYSQL_USER      = 'root'
MYSQL_PASSWORD  = 'Mharooney'
MYSQL_HOST_IP   = '127.0.0.1'
MYSQL_PORT      = 3306
MYSQL_DATABASE  = 'mydb'

engine = create_engine('mysql+mysqlconnector://' + MYSQL_USER + ':' + MYSQL_PASSWORD + '@' + MYSQL_HOST_IP + ':' + str(MYSQL_PORT) + '/' + MYSQL_DATABASE,
                       echo=False)
#engine = create_engine('mysql+mysqldb://root:@localhost:123456/myDB?charset=utf8mb4&binary_prefix=true', echo=False)

mydir = (os.getcwd()).replace('\\', '/') + '/'
all_data = pd.read_sql('SELECT * FROM govtracker', engine)
# .drop(['#'], axis=1)
myOutlook_inBox = pd.read_csv(r'' + mydir + 'test.CSV',
                              usecols=['Subject', 'Body', 'From: (Name)', 'To: (Name)'],
                              encoding='latin-1')
myOutlook_inBox.columns = myOutlook_inBox.columns.str.replace(' ', '')

# extract the site code (five characters followed by five digits) from the Subject column of the CSV
replaced_sbj_value = myOutlook_inBox['Subject'].str.extract(pat=r'(L(?:DEL|CAI|SIN).\d{5})').dropna()

# this is the column I want to filter on in the database
myOutlook_inBox["Subject"] = replaced_sbj_value
# these conditions filter the data exported from Outlook and drop duplicated rows
# Condition 1: any mail from mowafy to te
# (spaces were stripped from the column names above, hence 'From:(Name)' and 'To:(Name)')
frm_mwfy_to_te = myOutlook_inBox.loc[
    myOutlook_inBox['From:(Name)'].str.contains("mowafy", na=False)
    & myOutlook_inBox['To:(Name)'].str.contains("te", na=False)
].drop_duplicates(keep=False)
frm_mwfy_to_te.Subject

filtered_data = all_data.loc[all_data.site_code.str.contains('|'.join(frm_mwfy_to_te.Subject))]

print(myOutlook_inBox)

# replace() returns a new DataFrame, so the result has to be assigned back
all_data = all_data.replace('\n', '', regex=True)
df = all_data.where((pd.notnull(all_data)), None)
print(df)

print("Success")

print(frm_mwfy_to_te.Subject)
print(filtered_data)
# rows = engine.execute("SELECT * FROM govtracker")#.fetchall()
# print(rows)


update_db_query = engine.execute("UPDATE govtracker SET pending = 'TE' WHERE site_code = " + filtered_data)
"""engine = create_engine('postgresql+psycopg2://user:pswd@mydb')
df.to_sql('temp_table', engine, if_exists='replace')"""

# select_db_query = pd.read_sql("SELECT * FROM govtracker", con = engine)

#print(update_db_query)

Now let's say this is the output of my ORM. I will then use it to filter the MySQL table and get the rows for these three values, so that I can update every row that contains them. I want to update the columns called pending and pending_status in MySQL.

This is the definition of my database table:

CREATE TABLE `mydb`.`govtracker` (
    `id` INT,
    `site_name` VARCHAR(255),
    `region` VARCHAR(255),
    `site_type` VARCHAR(255),
    `site_code` VARCHAR(255),
    `tac_name` VARCHAR(255),
    `dt_readiness` DATE,
    `rfs` VARCHAR(255),
    `rfs_date` DATE,
    `huawei_1st_submission_date` DATE,
    `te_1st_submission_date` DATE,
    `huawei_2nd_submission_date` DATE,
    `te_2nd_submission_date` DATE,
    `huawei_3rd_submission_date` DATE,
    `te_3rd_submission_date` DATE,
    `acceptance_date_opt` DATE,
    `acceptance_date_plan` DATE,
    `signed_sites` VARCHAR(255),
    `as_built_date` DATE,
    `as_built_status` VARCHAR(255),
    `date_dt` DATE,
    `dt_status` VARCHAR(255),
    `shr_status` VARCHAR(255),
    `dt_planned` INT(255),
    `integeration_status` VARCHAR(255),
    `comments_snags` LONGTEXT,
    `cluster_name` LONGTEXT,
    `type_standalone_colocated` VARCHAR(255),
    `installed_type_standalone_colocated` VARCHAR(255),
    `status` VARCHAR(255),
    `pending` VARCHAR(255),
    `pending_status` LONGTEXT,
    `problematic_details` LONGTEXT,
    `ets_tac` INT(255),
    `region_r` VARCHAR(255),
    `sf6_signed_date` DATE,
    `sf6_signed_comment` LONGTEXT,
    `comment_history` LONGTEXT,
    `on_air_owner` VARCHAR(255),
    `pp_owner` VARCHAR(255),
    `report_comment` LONGTEXT,
    `hu_opt_area_owner` VARCHAR(255),
    `planning_owner` VARCHAR(255),
    `po_number` VARCHAR(255),
    `trigger_date` DATE,
    `as_built_status_tr` VARCHAR(255)
) ENGINE = InnoDB;

Another important note: in Excel, when I use a filter on a column, it shows all the values in the column I selected. Let's say Pending is the column I selected, and it has the values Accepted & PAC in progress Planning TE PP DT FM Rollout Integration Opt Team. All the other columns have fixed sets of values like this too. So should I create a table, something like columns_values, and fill it with all these values? Since these values are static, would that make my case easier to solve?

Last note: this database is based on an existing xlsm file. I pushed the data from the xlsm file to MySQL, and now MySQL is my main database, not the Excel file. However, I am updating the MySQL database from a CSV file, not from within the database: the ORM object frm_mwfy_to_te.Subject is data extracted from the DataFrame that was read from the CSV file.

Any ideas here?

I hope everything is clear enough.

Could this material help me or not?
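One way to read the columns_values idea is as a lookup table that the pending column references, so the database itself rejects values outside the static set. The sketch below is only an illustration under several assumptions: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for MySQL, the table name pending_values is hypothetical, and the four allowed values are a guess at how the list in the note splits into separate entries.

```python
import sqlite3

# A guess at a few of the static values of the pending column.
ALLOWED_PENDING = ["Planning", "TE", "PP", "DT"]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked to

# Hypothetical lookup table holding the allowed values.
conn.execute("CREATE TABLE pending_values (value TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO pending_values VALUES (?)",
                 [(value,) for value in ALLOWED_PENDING])

# Reduced version of govtracker: only the two columns needed for the example.
conn.execute("""
    CREATE TABLE govtracker (
        site_code TEXT,
        pending   TEXT REFERENCES pending_values(value)
    )
""")
conn.execute("INSERT INTO govtracker VALUES ('LCAIN20804', 'Planning')")

# A value from the lookup table is accepted.
conn.execute("UPDATE govtracker SET pending = 'TE' WHERE site_code = 'LCAIN20804'")

# A value that is not in the lookup table is rejected by the foreign key.
try:
    conn.execute("UPDATE govtracker SET pending = 'Unknown' WHERE site_code = 'LCAIN20804'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

current = conn.execute("SELECT pending FROM govtracker").fetchone()[0]
print(current, rejected)
```

A lookup table like this does not change how the update itself is written; it only guards against typos in the new values.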

https://auth0.com/blog/sqlalchemy-orm-tutorial-for-python-developers/#SQLAlchemy-ORM

It's called TL;DR

Important note: the value of filtered_data is actually a pandas DataFrame. The values I filter with come from a single column of the CSV file, and I want to use them, as I posted before, to update some columns in my database. I started by updating only the column called pending to see the result; after that I'll update the other columns. The script I want to create searches my database for the values in the filtered data. For example, I have a value called LCAIN20804. I want to take this value and filter the database table with it, then go to the column called Huawei 1st submission date. If it isn't filled, fill it with the current date. If it is filled, go to the pending column and replace the old value with TE, then go to pending_status and replace the old value with waiting TE acceptance, and so on. That's a small part of the script I want to create. I hope this is clear enough.
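The rule described above (fill the submission date when it is empty, otherwise set pending and pending_status) can be written as one UPDATE with CASE expressions. This is a sketch under stated assumptions, not a tested MySQL script: it uses the standard-library sqlite3 module with an in-memory table as a stand-in for MySQL, the sample rows are invented, and apply_te_rule is a hypothetical helper name. With a MySQL driver the placeholder would typically be %s instead of ?.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE govtracker (
        site_code TEXT,
        huawei_1st_submission_date TEXT,
        pending TEXT,
        pending_status TEXT
    )
""")
# Invented sample rows: one without a submission date, one with, one not targeted.
conn.executemany("INSERT INTO govtracker VALUES (?, ?, ?, ?)", [
    ("LCAIN20804", None, "Planning", None),
    ("LDELE30434", "2019-05-01", "DT", "in progress"),
    ("LSINI20260", None, "PP", None),
])

# The date column is assigned last on purpose: MySQL evaluates single-table
# UPDATE assignments from left to right, so the CASE expressions must run
# while the date column still holds its old value.
RULE = """
UPDATE govtracker
SET pending = CASE WHEN huawei_1st_submission_date IS NULL
                   THEN pending ELSE 'TE' END,
    pending_status = CASE WHEN huawei_1st_submission_date IS NULL
                          THEN pending_status ELSE 'waiting TE acceptance' END,
    huawei_1st_submission_date = COALESCE(huawei_1st_submission_date, ?)
WHERE site_code = ?
"""


def apply_te_rule(conn, site_codes, today):
    """Run the rule once per site code, passing the values as parameters."""
    conn.executemany(RULE, [(today, code) for code in site_codes])
    conn.commit()


today = date.today().isoformat()
apply_te_rule(conn, ["LCAIN20804", "LDELE30434"], today)

result = {row[0]: row[1:] for row in conn.execute(
    "SELECT site_code, huawei_1st_submission_date, pending, pending_status FROM govtracker")}
print(result)
```

The first row only gets its date filled in, the second row gets the new pending values, and the third row is left alone because its site code was not passed in.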

  • Can you please include a snippet of the value of filtered_data? It looks like a pandas DataFrame, but does it have multiple columns then? And do you want to update all of those columns? Commented Jun 16, 2019 at 18:28
  • @Ruben Helsloot Thanks for your interest. Please check my edit in the important note. Commented Jun 16, 2019 at 19:21

1 Answer

If you want to turn a pandas DataFrame into a SQL update statement, it helps to first transform it into a list of tuples, where each tuple holds the parameter values for one statement, and then pass that list to connection.execute so that the statement runs once per tuple (the executemany approach described in https://stackoverflow.com/a/27743541/5015356)

# one single-element tuple per site code; only the site_code column is needed
values = [(code,) for code in filtered_data['site_code']]

query = """
UPDATE govtracker
SET pending = 'TE'
WHERE site_code = %s
"""
connection = engine.connect()
update_db_query = connection.execute(query, values)

For each tuple (<sitecode>), this will execute the update statement. If you want to update more columns or expand the where clause, just add the additional values to each tuple, and add a new %s where you want each value to appear.

Just make sure you keep the columns in the correct order!
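As a self-contained illustration of the one-tuple-per-row idea, the sketch below runs the same kind of parameterized UPDATE against an in-memory SQLite table. SQLite and the sample rows are stand-ins chosen so the example runs without a MySQL server; the placeholder is ? here, whereas the MySQL drivers used above expect %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE govtracker (site_code TEXT, pending TEXT)")
# Invented sample rows for illustration.
conn.executemany("INSERT INTO govtracker VALUES (?, ?)", [
    ("LCAIN20804", "Planning"),
    ("LDELE30434", "DT"),
    ("LSINI99999", "PP"),
])

# In the question these would come from filtered_data['site_code'].
site_codes = ["LCAIN20804", "LDELE30434"]

# One tuple per statement execution; each tuple has exactly as many
# elements as the statement has placeholders.
params = [(code,) for code in site_codes]
conn.executemany("UPDATE govtracker SET pending = 'TE' WHERE site_code = ?", params)
conn.commit()

rows = conn.execute(
    "SELECT site_code, pending FROM govtracker ORDER BY site_code").fetchall()
print(rows)
```

Passing the site codes as parameters, instead of concatenating them into the SQL string, also leaves the quoting to the driver.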


9 Comments

Thanks a lot for your help. The tuples approach is actually what I was looking for, but I get this error in the terminal after I run the project: AttributeError: 'Engine' object has no attribute 'executemany'. I want to vote but my reputation is too low.
Sorry, I updated the answer to work better with SQLAlchemy
I tried this but I get another error: Not all parameters were used in the SQL statement
Are you sure filtered_data has only one column? What do you get when you execute filtered_data.shape?
Actually, filtered_data shows all the columns in the database, but the specific column I want to filter on is filtered_data.site_code. Sorry for the late reply, I just woke up.