
I've had this thought brewing for some time but I can't find anyone online who's discussed this as a possibility.

Currently the recommendations for making case-insensitive searches seem to be either "ilike" or "citext".

We're moving from Microsoft SQL Server to PostgreSQL, and all our code assumes case-insensitive comparisons. Our T-SQL code base is huge, so changing it all to use UPPER(), ilike, citext and so on isn't really feasible as a commercial development project.

However, it must be possible to grab the PostgreSQL source and change some of the C code so that all string comparisons are case-insensitive, and then build our own version of the whole product. I think it might require only a few lines of code to be changed, so upgradeability might not be a huge issue.

So I'm wondering whether anyone on here knows the PostgreSQL code base well enough to kick around ideas about whether this is feasible, and where the code that does the comparisons lives, just to help us get started. I'm continuing to research this in the meantime, starting with simply being able to build PostgreSQL on Windows. The hope is to bring others on board with the idea so that a community project could be started. As well as case insensitivity, there might be other tweaks that allow T-SQL code to work better, thus easing migration projects. My company would contribute strongly.

Sorry if this is off topic, but it seems to lean strongly towards being a developer question, and I'm sure many other Postgres users would appreciate a case-insensitive build in this day and age. Thanks.

  • What's wrong with citext? You might also want to try out the possibilities of ICU collations: postgresql.verite.pro/blog/2018/07/25/icu-extension.html Commented Jul 26, 2018 at 16:24
  • See my comments under Laurenz's answer. Commented Jul 26, 2018 at 16:58

1 Answer

I understand your sentiment, but I believe that you are wrong to assume that this would be a simple change. Otherwise PostgreSQL would probably already have case insensitive collations...

I'd say that your best bet is to use citext throughout. What is the problem you have with that?

You should take this to the pgsql-hackers mailing list to start a serious discussion, but make sure you read the archives first, because the problem is not a new one.


5 Comments

The problem with citext is performance. This is an enterprise app with a lot of ad hoc search and ad hoc query capability, so we need string comparison operations to be fast and indexed. The limitations section of the citext documentation says: "citext is not as efficient as text because the operator functions and the B-tree comparison functions must make copies of the data and convert it to lower case for comparisons. It is, however, slightly more efficient than using lower to get case-insensitive matching." So it is not as efficient as it would be if the source were changed to do case-insensitive comparisons at a low level.
When I can get the damn thing compiling, the first thing I'm going to try is just replacing strcmp with stricmp and seeing what happens :) just for fun.
@user2728841: did you actually test the performance of citext or are you just assuming?
That slow-down with citext will only be noticeable in large sequential scans with a filter. If you use an index, the difference won't be noticeable.
No, I haven't tested it yet. But the risk that it may be slow in some scenario we didn't test (the ad hoc querying possibilities here are infinite, so you can't test them all) is sufficient to warrant further research. Still, I appreciate your input. Most of the tables are indexed, but we can't index for every possible search scenario since, as stated, there are infinite combinations.
