Oracle SQL query with multiple conditions based on pandas dataframe

Question

I have a dataframe in the following format:

PID	Date1	Date2	Details
17750A	03/07/1960	06/07/2009	A1B3
17758X	03/07/1960	06/07/2009	A1B3
	06/09/1961	11/05/2013	A2B2
28363D	20/11/1964	05/03/2019	A1A2
30050A	30/06/1961	18/07/2017	A1B3
	04/11/1961	16/10/2008	A2B2

And an Oracle database with a table as follows:

ID	DateA	DateB	Notes
17750A	03/07/1960	06/07/2009	A1B3
	03/07/1960	06/07/2009	A1B3
20964Q	06/09/1961	11/05/2013	A2B2
28363D	20/11/1964	04/03/2019	A1A2
	30/06/1961	19/07/2017	A1B3
10832Q	04/11/1961	17/10/2008	A2B2

I need to query the database to return another df containing any record where the ID matches a PID, or where (Date1, Date2) equals (DateA, DateB) - i.e. both dates in a df row match both dates in a table row.

So far, I've managed to achieve the first, but not the second.

pid_list = df['PID'].values
nvars = ','.join(f':{i}' for i in range(len(pid_list)))

sql_query = """
            SELECT
                gd.ID,
                gd.DateA,
                gd.DateB,
                gd.Notes
             FROM table1 gd
            WHERE gd.ID in (%s)
            """ % nvars
            
df_result = pd.read_sql(sql_query, connection, params=pid_list)

How can I expand that to also match on the pair of dates? Is there a way to do this by passing a list of tuples as a param, rather than needing to iterate through pairs of dates? Something like:

sql_query = """
            SELECT
                gd.ID,
                gd.DateA,
                gd.DateB,
                gd.Notes
              FROM table1 gd
             WHERE gd.ID in (%s)
                OR (Date1 = DateA AND Date2 = DateB)
            """ % nvars, dates

df_result = pd.read_sql(sql_query, connection, params=(pid_list, (Date1, Date2)))

I think the params passed to pd.read_sql() may need to be a dict, but am not sure how to structure this or how to reference the different entries in the SQL query without iterating.

The part I'm unsure of is what params to include when using pd.read_sql. Will update to make that clear.
– Violet, Commented Jan 18, 2021 at 11:29

Rustam Pulatov · Accepted Answer · 2021-01-18 13:32:22Z

1

Try selecting all the data with this query:

            SELECT
                gd.ID,
                gd.DateA,
                gd.DateB,
                gd.Notes
             FROM table1 gd

and filter it against your DataFrame. Don't build the query from tuples and IN conditions: an IN list accepts only 1000 values.

Or run two queries and combine the two results.

Or create a table in Oracle and do the filtering in an Oracle query.
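As a rough illustration of the first option, the sketch below filters an already-fetched result against the original DataFrame. The function name `filter_matches` is hypothetical, the column names are taken from the question, and it assumes missing PIDs are stored as None/NaN or empty strings and that the dates in both frames use the same type and format.

```python
import pandas as pd


def filter_matches(table_df: pd.DataFrame, df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows of table_df whose ID equals some PID in df, or whose
    (DateA, DateB) pair equals some (Date1, Date2) pair in df."""
    # Collect the usable PIDs, skipping missing or empty values.
    pids = {p for p in df["PID"].dropna() if p != ""}
    id_match = table_df["ID"].isin(pids)

    # Both dates must come from the same df row, so compare whole pairs.
    date_pairs = set(zip(df["Date1"], df["Date2"]))
    pair_match = pd.Series(
        [pair in date_pairs for pair in zip(table_df["DateA"], table_df["DateB"])],
        index=table_df.index,
        dtype=bool,
    )
    return table_df[id_match | pair_match]
```

This only makes sense when the table is small enough to fetch in full, which the comments below suggest is not the case here.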

answered Jan 18, 2021 at 13:32

Rustam Pulatov


2 Comments

Violet Over a year ago

Thanks for your answer. The database table is very large so extracting all that data and then filtering post hoc is not really practical. I'd normally create a temporary table to hold the initial dataframe and then join to that but unfortunately I don't have the necessary permissions to do that in this particular db. Since the initial dataframe is typically quite small, I've now implemented an iterative method, but I'd still really like to know if there is a better approach.

Rustam Pulatov Over a year ago

I usually use the same approach as you, with a temp table (simple and fast). Of course, you could also use a less elegant solution and convert the tuples into a string (as you did with the IN condition for the IDs).

Marmite Bomber · Accepted Answer · 2021-01-18 18:25:07Z

1

You will always be constrained by the 1000-entry limit on IN lists (which you can work around by splitting the list into sublists combined with OR).
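The split into sublists could be generated on the Python side along these lines. This is only a sketch: `chunked_in_clause` is a hypothetical helper, and the bind names `:id0`, `:id1`, ... are an arbitrary choice for illustration.

```python
def chunked_in_clause(column, values, chunk_size=1000):
    """Return (sql_fragment, binds) where the values are spread over
    several IN lists of at most chunk_size entries, joined with OR."""
    if not values:
        # Nothing to match: emit a condition that is never true.
        return "1 = 0", {}

    # One named bind variable per value.
    binds = {f"id{i}": value for i, value in enumerate(values)}
    names = list(binds)

    groups = []
    for start in range(0, len(names), chunk_size):
        placeholders = ", ".join(f":{name}" for name in names[start:start + chunk_size])
        groups.append(f"{column} IN ({placeholders})")

    # Parentheses keep the OR-ed groups together inside a larger WHERE clause.
    return "(" + " OR ".join(groups) + ")", binds
```

The returned dict is meant to be passed as the bind parameters alongside the SQL text, for example through the `params` argument of `pd.read_sql`.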

For the date comparison you can use a multi-column IN list, as illustrated in the following query:

select * from tab
where id in ('17750A','20964Q') or
(dateA,dateB) in ( ( date'1960-07-03', date'2009-07-06'),
                   (date'1961-06-30' , date'2017-07-19'))
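To avoid pasting literals into the SQL, the same shape of query can probably be built with named bind variables and a dict of parameters. The sketch below is one way to do that. `build_query` and the bind names (`:id0`, `:a0`, `:b0`, ...) are hypothetical, the table and column names come from the question, and it assumes the date columns hold no time component, so that equality with a bound date matches. It does not split long lists, so the 1000-entry limit still applies.

```python
from datetime import date


def build_query(pids, date_pairs):
    """Return (sql, binds) matching rows by ID or by (DateA, DateB) pair."""
    binds = {}
    conditions = []

    if pids:
        id_names = [f"id{i}" for i in range(len(pids))]
        binds.update(zip(id_names, pids))
        placeholders = ", ".join(f":{name}" for name in id_names)
        conditions.append(f"gd.ID IN ({placeholders})")

    if date_pairs:
        tuples = []
        for i, (date_a, date_b) in enumerate(date_pairs):
            # One pair of bind variables per (Date1, Date2) row.
            binds[f"a{i}"] = date_a
            binds[f"b{i}"] = date_b
            tuples.append(f"(:a{i}, :b{i})")
        conditions.append("(gd.DateA, gd.DateB) IN (" + ", ".join(tuples) + ")")

    # With no conditions at all, return no rows rather than the whole table.
    where = " OR ".join(conditions) if conditions else "1 = 0"
    sql = "SELECT gd.ID, gd.DateA, gd.DateB, gd.Notes FROM table1 gd WHERE " + where
    return sql, binds


sql, binds = build_query(
    ["17750A", "28363D"],
    [(date(1960, 7, 3), date(2009, 7, 6)), (date(1961, 9, 6), date(2013, 5, 11))],
)
# With a live connection, the call would then look like:
# df_result = pd.read_sql(sql, connection, params=binds)
```

Passing `params` as a dict relies on the Oracle driver's support for named binds. The pairs themselves could be taken from the DataFrame after converting Date1 and Date2 to dates and zipping the two columns.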

edited Jan 18, 2021 at 18:25

answered Jan 18, 2021 at 14:41

Marmite Bomber
