7

How do I do a case insensitive IN search in the SQLAclhemy ORM in a way that is secure?

Both myself and others on my project have looked for this, but we cant seem to find anything that fits our needs.

In raw SQL I could do:

 SELECT * FROM TABLENAME WHERE UPPER(FIELDNAME) IN (UPPER('foo'), UPPER('bar'));

..if FOO and BAR were not user input in unknown case. As it is, I am worried about the following:

  1. Security: I don't want a visit from Bobby Tables (http://xkcd.com/327/) in the form of an SQL INjection Attack.and I cant find the documentation that tells me how to escape strings in SQLAlchemy or I would feel safer joining strings (But still feel dirty doing it).
  2. Speed is handled largely by indexing but obviously, doing case corrections in RAM before issuing the query would be faster than telling the DB to do it, so I would not do the UPPER in the query unless I really had to. The above was however the best way to show what I want to do. But sti;;, it shouldn't do anything crazy.
  3. Platform agnostic code. I will be running this on multiple database types - and its going to be fully tested is I have any say on the matter - and I don't want the query to be bound to a specific dialog of SQL. That is, after all, why I am using SQLAlchemy. :)

If it helps we are currently bound to the 8.4 version of SQLAlchemy due to our use of other libraries.

1 Answer 1

11

This should compile exactly...

query( models.Object )\
.filter( 
     sqlalchemy.func.upper( models.Object.fieldname )\
     .in_( (sqlalchemy.func.upper(foo) , sqlalchemy.func.upper(bar), ) )
)\
.all()

  1. you could also just pass in uppercase text. personally, i would do in_( foo.uppercase() , bar.uppercase() )

  2. SqlAlchemy works with the DBAPI to pass bind parameters into your backend datastore. Translation -- values are automatically escaped.


if you want to do a list of strings , something like this should work

.in_( [ i.upper() for i in inputs ] )
.in_( [ sqlalchemy.func.upper(i) for i in inputs ] )

Just want to add that if you want to optimize these selects for speed, and are on Postgres or Oracle, you can create a 'function index'

CREATE INDEX table_fieldname_lower_idx ON table(lower(fieldname))

the query planner (in the database) will know to use that lower(fieldname) index when searching against a lower(fieldname) query.

Sign up to request clarification or add additional context in comments.

5 Comments

So what if my input is a list of strings?
my first post used the wrong syntax. in_ expects an iterable. so you can just toss a list , list_comprehension, generator, lambda/map/etc.
Going to set this as the answer; Both my team and Bobby Table's Mother thank you.
thanks. i should have added, doing a lowercase comparison/storage is often better for debugging -- it tends to be way more readable by human eyes than uppercase.
Yes, but the issue with collation is that its a platform specific issue and we are trying to keep this as platform agnostic as possible.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.