I am passing the following as the query (the dbtable option) to PySpark, running in a Jupyter notebook on AWS EMR.
num = [1234,5678]
newquery = "(SELECT * FROM db.table WHERE col = 1234) as new_table"
newquery = "(SELECT * FROM db.table WHERE col = {num}) as new_table"
newquery = "(SELECT * FROM db.table WHERE col IN %(num)s) as new_table"
newquery = "(SELECT * FROM db.table WHERE col IN :(num)) as new_table"
Only the first "newquery" returns results. The other three, which try to substitute num into the string, all fail.
What is the correct way to pass the list num into the query so that it returns the matching rows?
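
For reference, here is a minimal sketch of one possible approach, assuming the values in num are integers and that the query string is built in plain Python before it is handed to Spark. As far as I can tell, the %(num)s and :(num) placeholders are database-driver parameter styles that the dbtable string does not interpret, and {num} is only substituted inside an f-string or through str.format. Formatting the list directly would also render it with square brackets, which is not valid SQL, so the sketch joins the values into a comma-separated string first. The name in_list is hypothetical.

```python
num = [1234, 5678]

# Cast each value to int so only numeric literals reach the SQL text,
# then join them into "1234, 5678" (a raw Python list would render with [ ]).
in_list = ", ".join(str(int(n)) for n in num)

# The f prefix is what makes {in_list} get substituted.
newquery = f"(SELECT * FROM db.table WHERE col IN ({in_list})) as new_table"

print(newquery)
# (SELECT * FROM db.table WHERE col IN (1234, 5678)) as new_table
```

The resulting string would then be passed as the dbtable option exactly as the first, working newquery is. The int() cast is a deliberate choice: since the values are interpolated into the SQL text rather than bound as parameters, it keeps anything non-numeric out of the query.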