You could use an expression index, but you shouldn't. Keep it simple, and use a plain b-tree.
An expression index can be created on colname IS NOT NULL:
test=> CREATE TABLE blah(name text);
CREATE TABLE
test=> CREATE INDEX name_notnull ON blah((name IS NOT NULL));
CREATE INDEX
test=> INSERT INTO blah(name) VALUES ('a'),('b'),(NULL);
INSERT 0 3
test=> SET enable_seqscan = off;
SET
test=> SELECT * FROM blah WHERE name IS NOT NULL;
name
------
a
b
(2 rows)
test=> EXPLAIN SELECT * FROM blah WHERE name IS NOT NULL;
QUERY PLAN
-----------------------------------------------------------------------------
Bitmap Heap Scan on blah (cost=9.39..25.94 rows=1303 width=32)
Filter: (name IS NOT NULL)
-> Bitmap Index Scan on name_notnull (cost=0.00..9.06 rows=655 width=0)
Index Cond: ((name IS NOT NULL) = true)
(4 rows)
test=> SET enable_bitmapscan = off;
SET
test=> EXPLAIN SELECT * FROM blah WHERE name IS NOT NULL;
QUERY PLAN
------------------------------------------------------------------------------
Index Scan using name_notnull on blah (cost=0.15..55.62 rows=1303 width=32)
Index Cond: ((name IS NOT NULL) = true)
Filter: (name IS NOT NULL)
(3 rows)
... but Pg doesn't realise that it's also usable for IS NULL:
test=> EXPLAIN SELECT * FROM blah WHERE name IS NULL;
QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on blah (cost=10000000000.00..10000000023.10 rows=7 width=32)
Filter: (name IS NULL)
(2 rows)
It even transforms NOT (name IS NOT NULL) into name IS NULL, which is usually what you want, but it means that rewriting the query this way doesn't help here either:
test=> EXPLAIN SELECT * FROM blah WHERE NOT (name IS NOT NULL);
QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on blah (cost=10000000000.00..10000000023.10 rows=7 width=32)
Filter: (name IS NULL)
(2 rows)
So you're actually better off with two disjoint partial expression indexes, one covering the NULL rows and one covering the non-NULL rows:
test=> DROP INDEX name_notnull ;
DROP INDEX
test=> CREATE INDEX name_notnull ON blah((name IS NOT NULL)) WHERE (name IS NOT NULL);
CREATE INDEX
test=> EXPLAIN SELECT * FROM blah WHERE name IS NOT NULL;
QUERY PLAN
--------------------------------------------------------------------------
Index Scan using name_notnull on blah (cost=0.13..8.14 rows=3 width=32)
Index Cond: ((name IS NOT NULL) = true)
(2 rows)
test=> CREATE INDEX name_null ON blah((name IS NULL)) WHERE (name IS NULL);
CREATE INDEX
test=> EXPLAIN SELECT * FROM blah WHERE name IS NULL;
QUERY PLAN
-----------------------------------------------------------------------
Index Scan using name_null on blah (cost=0.12..8.14 rows=1 width=32)
Index Cond: ((name IS NULL) = true)
(2 rows)
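Note that the plans above were produced with enable_seqscan and enable_bitmapscan still switched off from earlier in the session, so they show what the planner can do, not necessarily what it would choose by default. A minimal sketch for checking this, assuming the same session and the blah table from above (no expected output is shown because the chosen plan depends on table size and statistics):

```sql
-- Undo the session-level planner overrides set earlier for the demonstration.
RESET enable_seqscan;
RESET enable_bitmapscan;

-- Re-check the plans with default settings; on a tiny table the planner
-- may well prefer a sequential scan over either partial index.
EXPLAIN SELECT * FROM blah WHERE name IS NOT NULL;
EXPLAIN SELECT * FROM blah WHERE name IS NULL;

-- List the index definitions on the table to confirm the two predicates.
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'blah';
```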
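This is pretty gruesome though. For most sensible uses I'd just use a plain b-tree index. The index size improvement isn't too exciting, at least for small-ish inputs like the dummy data I created with a bunch of md5 values. Below, blah is the table, blah_name is the plain b-tree index on name, and name_notnull and name_null are the two partial indexes: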
test=> SELECT pg_size_pretty(pg_relation_size('blah'));
pg_size_pretty
----------------
9416 kB
(1 row)
test=> SELECT pg_size_pretty(pg_relation_size('blah_name'));
pg_size_pretty
----------------
7984 kB
(1 row)
test=> SELECT pg_size_pretty(pg_relation_size('name_notnull'));
pg_size_pretty
----------------
2208 kB
(1 row)
test=> SELECT pg_size_pretty(pg_relation_size('name_null'));
pg_size_pretty
----------------
2208 kB
(1 row)
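For reference, a rough sketch of how a comparison like this could be reproduced with a plain b-tree. The row count, the share of NULLs and the definition of blah_name are assumptions for illustration, not the exact setup used for the numbers above, and CREATE INDEX IF NOT EXISTS needs PostgreSQL 9.5 or later:

```sql
-- Hypothetical reproduction: row count and NULL ratio are assumptions.
CREATE TABLE IF NOT EXISTS blah(name text);
TRUNCATE blah;

-- Even i gets an md5 string; odd i falls through the CASE and becomes NULL.
INSERT INTO blah(name)
SELECT CASE WHEN i % 2 = 0 THEN md5(i::text) END
FROM generate_series(1, 100000) AS s(i);

-- Assumed definition of blah_name: a plain b-tree on the column itself.
CREATE INDEX IF NOT EXISTS blah_name ON blah(name);
ANALYZE blah;

-- A b-tree stores NULLs, so both predicates are indexable conditions.
-- Whether the planner actually uses the index depends on selectivity.
EXPLAIN SELECT * FROM blah WHERE name IS NULL;
EXPLAIN SELECT * FROM blah WHERE name IS NOT NULL;

-- Compare table and index sizes.
SELECT pg_size_pretty(pg_relation_size('blah'));
SELECT pg_size_pretty(pg_relation_size('blah_name'));
```

The point of the plain index is that a single structure can serve IS NULL, IS NOT NULL and ordinary equality or range lookups on name, at the cost of the extra space shown above.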
... create index to at least minimize its size.
create index i on t (coalesce('NULL',col)); to actually index NULL and avoid separating one nulls from other nulls
NULL is indexed. Where do you get the idea that it isn't? ... NULL contains row pointers to each row with NULL in that field. While they are unequal, they all match the predicate IS NULL. Your suggestion doesn't make sense, and not only that, Pg could only then actually use the index if the query also used the same expression.