I'm writing a python code to load a csv dataset to MySQL for the client. The requirements are that they will just load the data without opening the dataset. Therefore the code is a bit different from normal loading codes. I have completed writing the code:
#!/usr/bin/env python
# coding: utf-8
# In[1]:
import string
import csv
import mysql.connector
import re
# In[2]:
mydb=mysql.connector.connect(host="localhost",user="root",password="password",autocommit=True)
mycursor = mydb.cursor()
# In[3]:
sql_str=''
sql_str1=''
field_name=[]
line=1
file_name=input("\nFile name with extension:")
delimiter_=input('\nDelimiter used:')
textquali=input('\nText Qualifier used Press Enter if no qualifier:')
# In[4]:
mycursor.execute("drop table if exists rd.pbc_gl_sigline_src")
#r"C:\Users\rcsid\Documents\Office Programs\Working prog\MOCK_DATA.csv"
#re.sub('[^a-zA-Z0-9]\n\.', '_', row)
reader = csv.DictReader(open(file_name,encoding='utf-8',errors='ignore'), delimiter=delimiter_,quotechar=textquali)
rowHeaders = reader.fieldnames
print(rowHeaders)
for i in rowHeaders:
field_name.append(re.sub('[^A-Za-z0-9]+', '_', i))
#print(field_name)
#print(f'''create table rd.data_with_header ( {" varchar(100), ".join(field_name)} varchar(100))''')
sql_str=f'''create table rd.pbc_gl_sigline_src ( {" varchar(250), ".join(field_name)} varchar(100))'''
mycursor.execute(sql_str)
for row in reader:
line=line+1
sql_str1=f"""insert into rd.pbc_gl_sigline_src values ('{"',' ".join(row.values())}')"""
print(line)
print(sql_str1)
mycursor.execute(sql_str1)
print("Loaded to pbc_gl_sigline_src")
# In[ ]:
The code is working fine and loading the data lines. The issue I am facing is that as the code is building the insert code using a number of single quotes therefore when a data value has a single quote in it that line is getting an error.
Example: "sam", "New York City", "123'434" , "houseSTX"
The error is showing that there is a syntax error right to houseSTX
Is there a way to replace the Single quotes in the data while loading them using : sql_str1=f"""insert into rd.pbc_gl_sigline_src values ('{"',' ".join(row.values())}')""". I tried to use the replace function in the code but that did not go as planned. Any help would be much appreciated. Please do update my code.