Postgresql - case insensitive build to allow all wheres, joins, group bys etc to be case insensitive

Question

I've had this thought brewing for some time but I can't find anyone online who's discussed this as a possibility.

Currently the recommendations available for making case insensitive searches seem to be either to use "ilike" or "citext".

We're moving away from Microsoft SQL Server to PostgreSQL, and all our code assumes case-insensitive comparisons. Our T-SQL code base is huge, so changing it all to use UCASE() or ilike or citext etc. isn't really feasible as a commercial development project.

However, it must be possible to grab the source of PostgreSQL and change some of the C code so that all string comparisons are case-insensitive, and then make our own compilation of the whole product. I think it would possibly require only a few lines of code to be changed, and so upgradeability might not be a huge issue.

So I'm wondering whether anyone on here knows the PostgreSQL code base well enough to kick around ideas about whether this is feasible, and where the code that does the comparisons lives, just to help us get started. I'm continuing to research this in the meantime, and getting started with just being able to build PostgreSQL on Windows. The hope is to bring others on board with the idea so that a community project could be started; as well as case insensitivity, there might be other tweaks that allow T-SQL code to work better, thus easing migration projects. My company would contribute strongly.

Sorry if this is off topic but it seems to strongly lean towards being a developer question and I'm sure many other postgres users would appreciate a case insensitive build in this day and age -thanks

What's wrong with citext? You might also want to try out the possibilities of ICU collations: postgresql.verite.pro/blog/2018/07/25/icu-extension.html
– user330315, Commented Jul 26, 2018 at 16:24

Laurenz Albe · Accepted Answer · 2018-07-26 15:28:27Z

I understand your sentiment, but I believe that you are wrong to assume that this would be a simple change. Otherwise PostgreSQL would probably already have case insensitive collations...

I'd say that your best bet is to use citext throughout. What is the problem you have with that?

You should take this to the pgsql-hackers mailing list to start a serious discussion, but make sure you read the archives first, because the problem is not a new one.

5 Comments

user2728841 Over a year ago

The problem with citext is performance. This is an enterprise app with a lot of ad hoc search and ad hoc query capability, so we need string comparison operations to be fast and indexed. The limitations section says: "citext is not as efficient as text because the operator functions and the B-tree comparison functions must make copies of the data and convert it to lower case for comparisons. It is, however, slightly more efficient than using lower to get case-insensitive matching." So it is not as efficient as it would be if the source were changed to do case-insensitive comparisons at a low level.

user2728841 Over a year ago

When I can get the damn thing compiling, the first thing I'm going to try is just replacing strcmp with stricmp to see what happens :) Just for fun.
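
As a rough illustration of why swapping the comparison function alone may not be enough: anything that groups or joins by hashing strings needs a hash that agrees with the new notion of equality. The Python sketch below is not PostgreSQL code; `fold`, `ci_equal` and `group_counts` are hypothetical stand-ins chosen only to show the mismatch.

```python
def fold(s: str) -> str:
    """Hypothetical stand-in for whatever case folding a case-insensitive build would apply."""
    return s.lower()


def ci_equal(a: str, b: str) -> bool:
    """Case-insensitive equality, roughly what replacing strcmp with stricmp aims at."""
    return fold(a) == fold(b)


def group_counts(values, key=lambda s: s):
    """Hash-based grouping: rows share a group only if their keys hash and compare equal."""
    groups = {}
    for v in values:
        k = key(v)
        groups[k] = groups.get(k, 0) + 1
    return groups


names = ["Smith", "SMITH", "smith", "Jones"]

# The changed comparison says these are the same value...
same = ci_equal("Smith", "SMITH")

# ...but grouping on the raw strings still hashes them into separate groups.
raw_groups = group_counts(names)

# Grouping agrees with ci_equal only when the hash key is folded as well.
folded_groups = group_counts(names, key=fold)
```

In this toy model the comparison and the grouping disagree until both use the same folding, which suggests that a real change would have to touch every code path that hashes or sorts text, not only the comparison call.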

user330315 Over a year ago

@user2728841: did you actually test the performance of citext or are you just assuming?

Laurenz Albe Over a year ago

That slow-down with citext will only be noticeable in large sequential scans with a filter. If you use an index, the difference won't be noticeable.
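
A toy model of that point, written in Python rather than measured against PostgreSQL or citext: if each case-insensitive comparison costs one extra lower-casing step, a sequential scan pays that cost once per row, while a lookup in a sorted structure pays it roughly log2(n) times. The function names and the sample data are assumptions made up for this sketch.

```python
def seq_scan(rows, needle):
    """Filter every row, folding each one: one fold per row."""
    folds = 0
    hits = []
    target = needle.lower()
    for row in rows:
        folds += 1
        if row.lower() == target:
            hits.append(row)
    return hits, folds


def index_lookup(sorted_rows, needle):
    """Binary search over rows kept in case-insensitive order: about log2(n) folds."""
    folds = 0
    target = needle.lower()
    lo, hi = 0, len(sorted_rows)
    # Find the first position whose folded value is not less than the target.
    while lo < hi:
        mid = (lo + hi) // 2
        folds += 1
        if sorted_rows[mid].lower() < target:
            lo = mid + 1
        else:
            hi = mid
    # Collect the run of matching rows starting at that position.
    hits = []
    while lo < len(sorted_rows):
        folds += 1
        if sorted_rows[lo].lower() != target:
            break
        hits.append(sorted_rows[lo])
        lo += 1
    return hits, folds


rows = [f"User{i:05d}" for i in range(100_000)]
sorted_rows = sorted(rows, key=str.lower)  # stands in for a case-insensitive index

scan_hits, scan_folds = seq_scan(rows, "user04242")
index_hits, index_folds = index_lookup(sorted_rows, "user04242")
```

Both calls find the same row, but `scan_folds` grows with the table size while `index_folds` stays at a couple of dozen at most for this data set. Real costs depend on the planner, the data and the collation, so this only shows the shape of the argument.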

user2728841 Over a year ago

No, I haven't tested it yet. But the risk that it may be slow in some scenarios we didn't test (the ad hoc querying possibilities here are infinite, so you can't test them all) is sufficient to warrant further research. Still, I appreciate your input. Most of the tables are indexed, but we can't index for every possible search scenario since, as stated, there are infinite combinations.
