That citext doc already tells you it's somewhat superseded by case-insensitive collations:
Consider using nondeterministic collations (see Section 23.2.2.4) instead of this module. They can be used for case-insensitive comparisons, accent-insensitive comparisons, and other combinations, and they handle more Unicode special cases correctly.
You're better off with a regular text type and a custom collation, or "C" with an expression index using lower(). You can find a few benchmarks here:
If you upgrade to version 18 (release candidate 1 is out), you get nondeterministic collation support for LIKE which handles your prefix search.
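As a rough sketch of what that could look like on version 18: the collation definition below follows the case-insensitive example in the PostgreSQL documentation, while the table name `test_nondeterministic` and its sample rows are made up for illustration. It assumes a server built with ICU support.

```sql
-- Case-insensitive, nondeterministic ICU collation (needs ICU support).
create collation case_insensitive (
  provider = icu,
  locale = 'und-u-ks-level2',
  deterministic = false
);

-- Hypothetical table: a regular text column that uses the collation.
create table test_nondeterministic (
  a text collate case_insensitive unique
);

insert into test_nondeterministic values ('Ebony5'), ('ivory');

-- Equality already ignores case on versions that support
-- nondeterministic collations.
select a from test_nondeterministic where a = 'EBONY5';

-- LIKE on a nondeterministic collation is only accepted from version 18 on;
-- earlier versions raise an error for this query.
select a from test_nondeterministic where a like 'eb%5%';
```

This only shows that the comparison semantics come from the collation; it makes no claim about which plan the optimiser picks for the LIKE query.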
In PostgreSQL 16, use collate "C" with a text_pattern_ops expression index:
demo at db<>fiddle
create unique index on test_lower_collate_c
(lower(a) collate "C" text_pattern_ops);
explain analyse verbose
select count(*) from test_lower_collate_c where lower(a) like 'eb%5%';
| QUERY PLAN |
| Aggregate (cost=146.29..146.30 rows=1 width=8) (actual time=0.283..0.284 rows=1 loops=1) |
| Output: count(*) |
| -> Bitmap Heap Scan on public.test_lower_collate_c (cost=4.94..146.28 rows=5 width=0) (actual time=0.072..0.277 rows=26 loops=1) |
| Filter: (lower(test_lower_collate_c.a) ~~ 'eb%5%'::text) |
| Rows Removed by Filter: 162 |
| Heap Blocks: exact=133 |
| -> Bitmap Index Scan on test_lower_collate_c_lower_idx (cost=0.00..4.94 rows=65 width=0) (actual time=0.043..0.043 rows=188 loops=1) |
| Index Cond: ((lower(test_lower_collate_c.a) ~>=~ 'eb'::text) AND (lower(test_lower_collate_c.a) ~<~ 'ec'::text)) |
| Planning Time: 0.418 ms |
| Execution Time: 0.359 ms |
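The demo's table definition isn't reproduced here, so the following is only a guess at a minimal setup this kind of index works with. The table name `demo_lower_c` and the sample values are hypothetical.

```sql
-- Hypothetical table with a regular text column.
create table demo_lower_c (a text);

-- Same shape as the index above: lowercased expression, "C" collation,
-- and text_pattern_ops so LIKE prefixes can become index range conditions.
create unique index on demo_lower_c
  (lower(a) collate "C" text_pattern_ops);

insert into demo_lower_c values ('Ebony5');

-- Expected to fail with a unique violation: it differs from 'Ebony5'
-- only by case, and the index compares lower(a).
insert into demo_lower_c values ('EBONY5');

-- Case-insensitive prefix search; the predicate has to use the same
-- lower(a) expression as the index for the index to be considered.
select a from demo_lower_c where lower(a) like 'eb%5%';
```

Because the index is unique on `lower(a)`, it also gives you the case-insensitive uniqueness that citext would otherwise provide.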
Will leaving the collation on these columns as C cause issues, since C is a case sensitive collation?
citext folds values to lowercase when it compares them (the stored value keeps its original case), so case differences never reach the comparison - that part is not a problem.
It might be a problem if you're dealing with accents and other non-ASCII text, because collate "C" sorts by byte value and places those characters in a different range. Under "C", where a ~>=~ 'ea' and a ~<~ 'eb' won't find values starting with 'eá', because accented variants sort somewhere after the whole ASCII alphabet instead of following their base letter.
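A quick way to check the ordering yourself is to compare the strings directly. The column aliases are just labels, and the expected values in the comments assume a multibyte or 8-bit server encoding such as UTF8, where 'á' is stored as bytes above the ASCII range.

```sql
select
  'ea'::text < 'eá'::text collate "C" as acute_after_ea,   -- true
  'eá'::text < 'eb'::text collate "C" as acute_before_eb,  -- false
  'eá'::text ~<~ 'eb'::text           as pattern_op_check; -- false
```

The last column uses the same `~<~` operator that appears in the index conditions, which compares byte by byte regardless of the column's collation.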
Another thing of note is that I don't see the optimiser adding the range condition to a citext pattern search on its own. Given a query like this:
demo at db<>fiddle
select from test_lower_collate_c where lower(a) like 'eb%5%';
text_pattern_ops gets you an additional index condition that narrows the search by prefix:
Index Cond: ((lower(test_lower_collate_c.a) ~>=~ 'eb'::text) AND (lower(test_lower_collate_c.a) ~<~ 'ec'::text))
Meanwhile, with citext_pattern_ops I needed to add them on my own:
select from test_citext_collate_c where a like 'eb%5%' and a ~>=~ 'eb' and a ~<~ 'ec';
And the timing was still worse than for the expression-based index.
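The citext side of that comparison isn't shown in full, so this is only a sketch of what such a setup presumably looks like. The table name `demo_citext_c` and the index definition are assumptions; `citext_pattern_ops` is the operator class the citext extension provides for the `~<~`-style operators.

```sql
create extension if not exists citext;

-- Hypothetical table; the demo's actual DDL is not reproduced here.
create table demo_citext_c (a citext collate "C");

create index on demo_citext_c (a citext_pattern_ops);

-- The prefix bounds are spelled out by hand: 'ec' is the prefix 'eb'
-- with its last character incremented.
select from demo_citext_c
where a like 'eb%5%'
  and a ~>=~ 'eb'
  and a ~<~ 'ec';
```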
What is the recommended collation for a citext column?
If your values and patterns are plain ASCII, COLLATE "C" can handle them fast. Otherwise, it just won't work right: prefix range conditions will miss accented and other non-ASCII variants.