Recent versions of Microsoft SQL Server allow creating a clustered columnstore index on a table that has computed columns, as long as they are not persisted computed columns. [1]
I would like to get the performance benefits of a clustered columnstore index, in particular segment elimination, but it does not seem to apply to the computed column. To reproduce, create a table with a computed column, then fill it with even integers so that the computed column is always 0.
drop table if exists my_table
create table my_table (x int not null)
alter table my_table add is_odd as convert(bit, x % 2)
create clustered columnstore index cs on my_table
insert into my_table (x)
select value
from generate_series(1, 10000000)
where value % 2 = 0
select distinct is_odd from my_table -- gives 0
Every row has is_odd = 0, so every rowgroup of the columnstore will have 0 as both the min and max value of this column (if indeed the column is physically present in the columnstore). Segment elimination does work on the ordinary column:
set statistics io on
select 0 from my_table where x < 0
Table 'my_table'. Segment reads 0, segment skipped 5.
But when I filter on the computed column, no segments are skipped and the query ends up scanning the whole table:
set statistics io on
select 0 from my_table where is_odd = 1
Table 'my_table'. Segment reads 5, segment skipped 0.
The query plan shows a scan of the whole table, followed by computing the is_odd value for each row and filtering on it. It hasn't read is_odd directly from the columnstore. Is there any way I can get the columnstore to include this column and do segment elimination on it?
(By contrast if I create a rowstore index, clustered or nonclustered, having is_odd as a key column, then queries can seek directly and don't have to scan the whole table.)
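To check whether is_odd is physically present in the columnstore at all, one option might be to look at the per-segment metadata in sys.column_store_segments. The query below is only a sketch. It assumes the table from the repro exists in the current database, and it assumes that column_id in the segment view lines up with column_id in sys.columns, which I believe holds for a simple table like this one but have not verified in general. The min_data_id and max_data_id values also depend on how the segment is encoded, so they are not necessarily the raw column values.

```sql
-- Sketch: list segment metadata for every column stored in my_table's columnstore.
-- Assumption: segment column_id matches sys.columns.column_id for this simple table.
select
    c.name as column_name,  -- null if the segment's column_id has no match in sys.columns
    s.column_id,
    s.segment_id,
    s.row_count,
    s.min_data_id,          -- interpretation depends on the segment's encoding
    s.max_data_id
from sys.column_store_segments as s
join sys.partitions as p
    on p.hobt_id = s.hobt_id
left join sys.columns as c
    on c.object_id = p.object_id
   and c.column_id = s.column_id
where p.object_id = object_id('my_table')
order by s.column_id, s.segment_id
```

If segments are listed for x but none for is_odd, that would suggest the computed column is not stored in the columnstore, which would be consistent with the plan recomputing it for each row.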