4

I am using the sql package in a Jupyter notebook and I understand how to use variables in my query:

client = "Disney"
queryid = %sql SELECT * FROM mytable WHERE name = :client

What I don't understand is how to pass a list to my query, like:

clients = ["Disney", "Netflix", "Sky"]
queryid = %sql SELECT * FROM mytable WHERE name in (:clients)

This raises an error which states that my SQL is wrong. What is the way to handle lists in this setting?

7
  • If you're passing a list, are you sure = operator is still valid? You may want IN or the like. Commented Jul 14, 2017 at 22:40
  • What do you want? Items that match any thing in the list? Commented Jul 14, 2017 at 23:07
  • Other SO questions about where in concatenate the list strings into one string: stackoverflow.com/questions/283645/… Commented Jul 14, 2017 at 23:18
  • @hpaulj the point is to have the right sintax for this particular way to use MySQL Commented Jul 15, 2017 at 8:42
  • @cᴏʟᴅsᴘᴇᴇᴅ yes of course I'll update my question Commented Jul 15, 2017 at 8:42

4 Answers 4

3

With a demo case for sqlite3:

In [1]: import sqlite3
In [2]: conn = sqlite3.connect('example.db')
In [3]: c = conn.cursor()
In [4]: c.execute('''CREATE TABLE stocks
   ...:              (date text, trans text, symbol text, qty real, price real)''')
   ...: 
   ...: # Insert a row of data
   ...: c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.1
   ...: 4)")
   ...: 
   ...: # Save (commit) the changes
   ...: conn.commit()
   ...: 
In [5]: # Larger example that inserts many records at a time
   ...: purchases = [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
   ...:              ('2006-04-05', 'BUY', 'MSFT', 1000, 72.00),
   ...:              ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
   ...:             ]
   ...: c.executemany('INSERT INTO stocks VALUES (?,?,?,?,?)', purchases)

I can fetch values that match several strings with:

In [31]: c.execute('SELECT * FROM stocks WHERE symbol IN (?,?)',('IBM','RHAT'))
Out[31]: <sqlite3.Cursor at 0xaf703fa0>
In [32]: c.fetchall()
Out[32]: 
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
 ('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

Or with the generalized solution from Parameter substitution for a SQLite "IN" clause

In [33]: alist=['RHAT','IBM']
In [34]: c.execute('SELECT * FROM stocks WHERE symbol IN (%s)' %
    ...:                            ','.join('?'*len(alist)), 
    ...:                            alist)
    ...:                            
Out[34]: <sqlite3.Cursor at 0xaf703fa0>
In [35]: c.fetchall()
Out[35]: 
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
 ('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

c.execute('SELECT * FROM stocks WHERE symbol IN (:1,:2)',alist), and possibly other forms.

see also:

sqlite3 "IN" clause

I assume MYSQL and %sql iterface behaves the same; but I don't have those installed.


With proper quoting literals also work (again sqlite3)

c.execute('SELECT * FROM stocks WHERE symbol IN ("IBM","RHAT")')

or

In [80]: 'SELECT * FROM stocks WHERE symbol IN (%s)'%','.join('"%s"'%x for x in alist)
Out[80]: 'SELECT * FROM stocks WHERE symbol IN ("RHAT","IBM")'
In [81]: c.execute(_)

So I'm guessing that:

%sql SELECT * FROM stocks WHERE symbol IN ("IBM","RHAT")

would work even if some form of (:....) does not.


I installed %sql

In [5]: %%sql
   ...: sqlite:///example.db
   ...: 

Out[5]: 'Connected: [email protected]'
In [7]: %sql SELECT * from stocks
Done.
Out[7]: 
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
 ('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-05', 'BUY', 'MSFT', 1000.0, 72.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

In [9]: %sql SELECT * from stocks where symbol in ('IBM')
Done.
Out[9]: 
[('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

In [10]: %sql SELECT * from stocks where symbol in ('IBM','RHAT')
Done.
Out[10]: 
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
 ('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

The string formatting approach works:

In [11]: alist=['RHAT','IBM']
In [12]: cmd='SELECT * FROM stocks WHERE symbol IN (%s)'%','.join('"%s"'%x for x
    ...:  in alist)
In [13]: cmd
Out[13]: 'SELECT * FROM stocks WHERE symbol IN ("RHAT","IBM")'
In [14]: %sql $cmd
Done.
Out[14]: 
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
 ('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

The : syntax isn't well documented. It isn't clear who's implementing it. ($ is standard Ipython variable substitution).

In [18]: sym='IBM'
In [19]: %sql SELECT * from stocks where symbol in (:sym)
Done.
Out[19]: 
[('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
 ('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]

symbol in (:sym1,:sym2) works

So far I don't see evidence that %sql works with the conventional SQL placeholder syntax.


Looks like you (?) submitted and closed a issue on github, https://github.com/catherinedevlin/ipython-sql/issues/92

Adapting that solution to quote strings:

In [74]: mystring = '({})'.format(','.join('"{}"'.format(e) for e in alist))
In [75]: mystring
Out[75]: '("RHAT","IBM")'
In [76]: %sql SELECT * from stocks where symbol in $mystring
Done.

In other words, use the ipython $ injection as opposed to the : form.


Looking at the ipython-sql source code:

ipython-sql/blob/master/src/sql/run.py
def run(conn, sql, config, user_namespace):
    ...
    txt = sqlalchemy.sql.text(statement)
    result = conn.session.execute(txt, user_namespace)

It looks like the :name syntax is a sqlalchemy bind parameter, and is handled with sqlalchemy.sql.text and sqlalchemy.sql.bindparam

(http://docs.sqlalchemy.org/en/latest/orm/tutorial.html#orm-tutorial-literal-sql)

This error indicates that each bindparameter is translated into a ? placeholder, plus as matching parameters entry:

In [96]: %sql SELECT * from stocks where symbol in :mystring
(sqlite3.OperationalError) near "?": syntax error [SQL: 'SELECT * from stocks where symbol in ?'] [parameters: ('("RHAT","IBM")',)]

So my original solution of generating IN (?,?,...) to match the length of the list is the right SQL, even though it does not work with sqlalchemy and %sql.

Sign up to request clarification or add additional context in comments.

3 Comments

thanks for your time, but as I said before, either you have a solution for %sql or you don't...
I've downloaded %sql and got a solution working with the Ipython $ syntax. Conventional SQL placeholder syntax apparently does not work with %sql and sqlalchemy which has its own bindparameters syntax.
This is a fantastically thorough answer that really anticipates future visits to this question. The %sql version worked like a charm for me.
1

Removing parentheses works for me.

clients = ["Disney", "Netflix", "Sky"]
queryid = %sql SELECT * FROM mytable WHERE name in :clients

Comments

0

This function saves me from this kind of problem:


> def splitParam(param): # this function splits multiple values of filter
    filteredParam = ""
    if param:
        for item in param:
            if filteredParam=="":
                filteredParam += "'" + item.upper() + "'"
            else:
                filteredParam += ", '" + item.upper() + "'"

    return filteredParam 

Comments

-1

Anecdotal note about applying the $ with python tuples in a SQL query:

It needs to be used in a %sql line, it cannot* be used in a %%sql code block. Use line continuation backslashes for readability on the query.

*as far as I can tell

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.