Running Dynamic Query From Python with input from CSV

Question

I have a CSV file with table names and the primary-key columns for those tables, in the format below:

|  Table Name  |  Primary Key  | 
|    Table 1   |     Col1      |  
|    Table 1   |     Col2      |
|    Table 1   |     Col3      | 
|    Table 2   |     Col11     | 
|    Table 2   |     Col12     |

I want to run a SQL query to validate the primary-key constraint for every table. For Table 1, the query would look like this:

select Col1, Col2, Col3 from Table1
group by Col1, Col2, Col3
having count(*)>1

But I have thousands of tables in this file. How would I build and execute this query dynamically and write the results to a flat file? I want to do this using Python 3.

Attempt:

CSV:

My PKTest.py

def getColumns(filename):
    tables = {}

    with open(filename) as f:
        for line in f:
            line = line.strip()
            if 'Primary Key' in line:
                continue

            cols = line.split('|')
            table = cols[1].strip()
            col = cols[2].strip()

            if table in tables:
                tables[table].append(col)
            else:
                tables[table] = [col]
    return tables

def runSQL(table, columns):
    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(', '.join(columns), table.replace(' ',''))
    return statement

if __name__ == '__main__':
    tables = getColumns('PKTest.csv')
    try:
        #cursor to connect

        for table in tables:
            sql = runSQL(table,tables[table])
            print(sql)
            cursor.execute(sql)
            for result in cursor:
                print(result)

    finally:
        cursor.close()
    ctx.close()

zedfoxus · Accepted Answer · 2019-04-13 02:18:59Z

2

You will have to improvise on this answer a bit since I do not have access to Oracle.

Let's assume there's a file called so.csv that contains the data as shown in your question.

Create a file called so.py like so. I'll add bits of code and some explanation. You can piece the file together or copy/paste it from here: https://rextester.com/JLQ73751.

At the top of the file, import your Oracle dependency:

# import cx_Oracle
# https://www.oracle.com/technetwork/articles/dsl/python-091105.html

Then, create a function that parses your so.csv and puts table and columns in a dictionary like this: {'Table 1': ['Col1', 'Col2', 'Col3'], 'Table 2': ['Col11', 'Col12']}

def get_tables_columns(filename):

    tables = {}

    with open(filename) as f:
        for line in f:
            line = line.strip()
            if 'Primary Key' in line:
                continue

            cols = line.split('|')

            table = cols[1].strip()
            col = cols[2].strip()

            if table in tables:
                tables[table].append(col)
            else:
                tables[table] = [col]

    return tables

Then, create a function that generates the SQL statement from a table name and its list of columns:

def get_sql(table, columns):

    statement = 'select {0} from {1} group by {0} having count(*) > 1'.format(
            ', '.join(columns),
            table.replace(' ', '')
        )

    return statement
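Table and column names cannot be passed as bind variables, so the statement has to be built by string formatting. A minimal sketch of a stricter variant, assuming the names in the file are plain unquoted identifiers; `build_pk_check_sql` and `IDENTIFIER` are hypothetical names that are not part of the code above, and the pattern may need adjusting for quoted or schema-qualified names:

```python
import re

# Assumption: names are simple unquoted identifiers (letters, digits, _, $, #).
IDENTIFIER = re.compile(r'^[A-Za-z][A-Za-z0-9_$#]*$')


def build_pk_check_sql(table, columns):
    """Return a query listing key values that occur more than once."""
    table_name = table.replace(' ', '')  # 'Table 1' -> 'Table1', as in get_sql
    # Reject anything that does not look like an identifier before it is
    # interpolated into the SQL text.
    for name in [table_name, *columns]:
        if not IDENTIFIER.match(name):
            raise ValueError('unexpected identifier: {!r}'.format(name))
    column_list = ', '.join(columns)
    return 'select {0} from {1} group by {0} having count(*) > 1'.format(
        column_list, table_name)


print(build_pk_check_sql('Table 1', ['Col1', 'Col2', 'Col3']))
# select Col1, Col2, Col3 from Table1 group by Col1, Col2, Col3 having count(*) > 1
```

The check is optional if the CSV file is fully trusted, but it gives an early, readable error when a row is malformed.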

It's time to execute the functions:

if __name__ == '__main__':
    tables = get_tables_columns('so.csv')

    # here goes your code to connect with Oracle
    # con = cx_Oracle.connect('pythonhol/[email protected]/orcl')
    # cur = con.cursor()

    for table in tables:
        sql = get_sql(table, tables[table])
        print(sql)

        # here goes your sql statement execution            
        # cur.execute(sql)
        # for result in cur:
        #    print(result)

    # close your Oracle connection
    # con.close()

You can include your Oracle-related statements and run the python file.
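As a rough illustration of the whole loop, including writing the results to a flat file, here is a sketch that uses Python's built-in `sqlite3` module purely as a stand-in for Oracle (an assumption, made so the example runs without a database server). The function `write_duplicates`, the sample table, and the output file name are hypothetical. A cx_Oracle connection follows the same DB-API pattern of `cursor()`, `execute()`, and iteration over the cursor, so the loop should carry over with little change:

```python
import csv
import os
import sqlite3
import tempfile

# Stand-in database (assumption): an in-memory SQLite table with one duplicated key.
con = sqlite3.connect(':memory:')
con.execute('create table Table1 (Col1, Col2, Col3)')
con.executemany('insert into Table1 values (?, ?, ?)',
                [(1, 'a', 'x'), (1, 'a', 'x'), (2, 'b', 'y')])

tables = {'Table 1': ['Col1', 'Col2', 'Col3']}


def write_duplicates(con, tables, out_path):
    """Run the duplicate check for every table and write offending keys to out_path."""
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out, delimiter='|')
        for table, columns in tables.items():
            sql = 'select {0} from {1} group by {0} having count(*) > 1'.format(
                ', '.join(columns), table.replace(' ', ''))
            cur = con.cursor()
            try:
                cur.execute(sql)
                for row in cur:
                    # One output line per duplicated key: table name, then key values.
                    writer.writerow([table, *row])
            finally:
                cur.close()


out_path = os.path.join(tempfile.gettempdir(), 'pk_violations.txt')
write_duplicates(con, tables, out_path)
con.close()
```

Opening the file with mode `'a'` instead of `'w'` would append to an existing file rather than overwrite it.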

6 Comments

django-unchained Over a year ago

This looks good. I haven't tested it yet. How about writing this to a flat file? Appending?

zedfoxus Over a year ago

Yeah, you can just execute this in its current form like so: python3 so.py > outputfile.sql. That'll give you outputfile.sql.
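If redirecting standard output is not convenient, the generated statements can also be written from within Python. A small sketch, assuming one statement per line with a trailing semicolon; `append_statements` is a hypothetical helper, and the temporary directory is used only so the example is safe to run:

```python
import os
import tempfile


def append_statements(statements, path):
    """Append one SQL statement per line; mode 'a' keeps earlier content."""
    with open(path, 'a') as out:
        for statement in statements:
            out.write(statement + ';\n')


path = os.path.join(tempfile.mkdtemp(), 'outputfile.sql')
append_statements(
    ['select Col1 from Table1 group by Col1 having count(*) > 1'], path)
# A second call adds to the file instead of replacing it.
append_statements(
    ['select Col11 from Table2 group by Col11 having count(*) > 1'], path)
```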

django-unchained Over a year ago

I am getting below error:

Traceback (most recent call last):
  File "PKTest.py", line 28, in <module>
    tables = getColumuns('PK Test.csv')
  File "PKTest.py", line 14, in getColumuns
    table = cols[1].strip()
IndexError: list index out of range

django-unchained Over a year ago

Also, did you add cols = line.split('|') based on the table above? I don't think it's needed if we're reading from a CSV.

zedfoxus Over a year ago

Yes, the splitting is done on | based on your example. If you have a CSV, it would be best to publish an example of it in your question. You can adapt the split for CSV. What does your PKTest.py look like? You might want to include that in your edited question and I'll be happy to help.
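The IndexError in the traceback above is what `cols[1]` raises when `line.split('|')` returns a single element, which happens for a blank line or for a line that uses commas rather than pipes. A minimal sketch of a comma-separated variant, assuming the file has a header row followed by `table,column` rows; `get_tables_columns_csv` is a hypothetical name, and the sample data is made up to mirror the question:

```python
import csv
import io
from collections import defaultdict


def get_tables_columns_csv(lines):
    """Group primary-key columns by table from comma-separated rows."""
    tables = defaultdict(list)
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row (assumed to be present)
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed lines instead of raising IndexError
        tables[row[0].strip()].append(row[1].strip())
    return dict(tables)


sample = io.StringIO(
    'Table Name,Primary Key\n'
    'Table 1,Col1\n'
    'Table 1,Col2\n'
    '\n'
    'Table 2,Col11\n'
)
print(get_tables_columns_csv(sample))
# {'Table 1': ['Col1', 'Col2'], 'Table 2': ['Col11']}
```

For a real file, the same function can be called as `get_tables_columns_csv(f)` inside `with open('PKTest.csv', newline='') as f:`.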
