
I wrote this regular expression:

/(first|last)\s(last|first)/i

It matches the first three of

first last
Last first
First Last
First name

I am trying to get all the records where full_name matches the regex I wrote. I'm using PostgreSQL.

Person.where("full_name ILIKE ?", "%(first|last)%(last|first)%")

This is my attempt. I also tried SIMILAR TO and ~ with no luck.

1 Answer


Your LIKE query:

full_name ilike '%(first|last)%(last|first)%'

won't work because LIKE doesn't understand regex grouping ((...)) or alternation (|). LIKE only understands _ for a single character (like . in a regex) and % for any sequence of zero or more characters (like .* in a regex).
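To make that concrete, here is a rough sketch in plain Ruby of how LIKE reads a pattern. like_to_regexp is a hypothetical helper written for this illustration, not part of Rails or PostgreSQL, and it ignores LIKE's backslash escapes:

```ruby
# Hypothetical helper: approximates a LIKE/ILIKE pattern as a Ruby Regexp.
# Only % and _ are special; every other character is matched literally.
# LIKE's backslash-escape handling is deliberately left out.
def like_to_regexp(pattern, case_insensitive: false)
  body = pattern.each_char.map do |ch|
    case ch
    when '%' then '.*'   # any sequence of zero or more characters
    when '_' then '.'    # exactly one character
    else Regexp.escape(ch)
    end
  end.join
  flags = Regexp::MULTILINE                        # let . match newlines too
  flags |= Regexp::IGNORECASE if case_insensitive  # ILIKE ignores case
  Regexp.new("\\A#{body}\\z", flags)               # LIKE must match the whole string
end

ilike = like_to_regexp('%(first|last)%(last|first)%', case_insensitive: true)

puts ilike.match?('first last')                     # false
puts ilike.match?('x (First|Last) (last|first) y')  # true
```

The parentheses and the pipe end up escaped, so the pattern only matches names that literally contain "(first|last)".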

If you hand that pattern to SIMILAR TO then you'll find 'first last' but none of the others due to case problems; however, this:

lower(full_name) similar to '%(first|last)%(last|first)%'

will take care of the case problems and find the same three names as your regex, although % between the groups is looser than \s.
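You can approximate that behaviour in plain Ruby. This is only a sketch: the regex below is my hand translation of the SIMILAR TO pattern (% written as .*, whole-string match, no case folding), and the constant and method names are made up for the example:

```ruby
NAMES = ['first last', 'Last first', 'First Last', 'First name'].freeze

# Rough stand-in for SIMILAR TO '%(first|last)%(last|first)%'.
SIMILAR_TO_APPROX = /\A.*(first|last).*(last|first).*\z/m

def similar_matches(names)
  names.select { |name| SIMILAR_TO_APPROX.match?(name) }
end

# Case-sensitive, as SIMILAR TO is:
p similar_matches(NAMES)
# ["first last"]

# Lower-casing the value first, like lower(full_name):
p NAMES.select { |name| SIMILAR_TO_APPROX.match?(name.downcase) }
# ["first last", "Last first", "First Last"]

# The wildcards accept anything between the two words, which \s does not:
p SIMILAR_TO_APPROX.match?('first and last')               # true
p(/(first|last)\s(last|first)/i.match?('first and last'))  # false
```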

If you want to use a regex (which you probably do because LIKE is very limited and cumbersome and SIMILAR TO is, well, a strange product of the fevered minds of some SQL standards subcommittee) then you'll want to use the case-insensitive matching operator with your original regex (I've used \s+ instead of \s so that runs of whitespace also match):

full_name ~* '(first|last)\s+(last|first)'

That translates to this bit of AR:

Person.where('full_name ~* :pat', :pat => '(first|last)\s+(last|first)')
# or this
Person.where('full_name ~* ?', '(first|last)\s+(last|first)')

There's a subtle change in my code that you need to take note of: I'm using single quotes for my Ruby strings whereas you're using double quotes. Backslashes mean more in double quoted strings than they do in single quoted strings, so '\s' and "\s" are different things. Toss in a couple to_sql calls and you might see something interesting:

> puts Person.where('full_name ~* :pat', :pat => 'a\s+b').to_sql
SELECT "people".* FROM "people"  WHERE (full_name ~* 'a\s+b')

> puts Person.where('full_name ~* :pat', :pat => "a\s+b").to_sql
SELECT "people".* FROM "people"  WHERE (full_name ~* 'a +b')

That difference probably isn't causing you any problems but you need to be very careful with your strings when everyone wants to use the same escape character. Personally, I use single quoted strings unless I specifically need the extra escapes and string interpolation functionality of double quoted strings.
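You can see the same thing in plain Ruby without touching the database. A minimal sketch (the constant names are just for illustration):

```ruby
SINGLE_QUOTED = 'a\s+b'   # backslash is kept: a, \, s, +, b
DOUBLE_QUOTED = "a\s+b"   # "\s" is Ruby's escape for a space: a, space, +, b

puts SINGLE_QUOTED.length        # 5
puts DOUBLE_QUOTED.length        # 4
puts DOUBLE_QUOTED == 'a +b'     # true

# Doubling the backslash gives a double quoted string the same contents
# as the single quoted one.
ESCAPED = "a\\s+b"
puts ESCAPED == SINGLE_QUOTED    # true
```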

Some demos: http://sqlfiddle.com/#!15/99a2c/6


5 Comments

This is one of the best answers I've received on here, Thanks. I don't need the + because I'm pretty sure that all records have a single space only. The reason why you used the symbol :pat was to define the regex as the value later correct? Also if I needed to pass in multiple values into the SQL, then creating symbols would help keep track of values.
I use :pat instead of ? to make it a bit more readable, this doesn't matter much when there is only one placeholder but it does when there are more or if you need to use the same value in several places. Giving things names is a readability win IMO. Anyway, thanks, I like to earn my points and the more you learn from me the better :)
FYI: if you are using MySQL, the ~* operator does not exist. Instead, use REGEXP in its place.
Thanks. I had to use \\d instead of \d, and it didn't work when I used single quotes: Project.where('projectid ~* ?',"^(P#{(Date.today).year%100})\\d{5}$")
@JoseKj String interpolation only works with double quotes and that probably required the double backslashes to keep the double quotes from trying to interpret \d in Ruby, sound about right?
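For anyone following the last two comments, here is a small sketch of that interaction in plain Ruby. project_id_pattern is a hypothetical helper, and the year is passed in (rather than taken from Date.today) so the result is predictable:

```ruby
# Hypothetical helper; builds the pattern from the comment above.
def project_id_pattern(year)
  "^(P#{year % 100})\\d{5}$"   # double quotes for #{...}, so \d must be written \\d
end

puts project_id_pattern(2024)                    # ^(P24)\d{5}$
puts project_id_pattern(2024) == '^(P24)\d{5}$'  # true

# A lone \d in double quotes is not a known escape, so Ruby drops the backslash.
puts "\d" == 'd'                                 # true

# Single quotes keep \d intact but never interpolate.
puts '^(P#{year % 100})\d{5}$'.include?('#{')    # true
```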
