Does MySQL index NULL values?

Question

I have a MySQL table where an indexed INT column is going to be 0 for 90% of the rows. If I change those rows to use NULL instead of 0, will they be left out of the index, making the index about 90% smaller?

Yves M. · Accepted Answer · 2019-10-14 12:56:25Z

38

http://dev.mysql.com/doc/refman/5.0/en/is-null-optimization.html

MySQL can perform the same optimization on col_name IS NULL that it can use for col_name = constant_value. For example, MySQL can use indexes and ranges to search for NULL with IS NULL.

edited Oct 14, 2019 at 12:56

Yves M.


answered May 19, 2013 at 17:32

Khanh Van



1 Comment

Timo Over a year ago

Please note that the documentation mentions some caveats, e.g. "the optimization can handle only one IS NULL".

Bill the Lizard · Accepted Answer · 2018-08-31 12:27:06Z

13

It looks like it does index the NULLs too.

Be careful when you run this because MySQL will LOCK the table for WRITES during the index creation. Building the index can take a while on large tables even if the column is empty (all nulls).

Reference.

edited Aug 31, 2018 at 12:27

answered Nov 14, 2008 at 1:52

Bill the Lizard


4 Comments

too much php Over a year ago

How did you come to that conclusion? I don't see any mention of the topic.

Bill the Lizard Over a year ago

It was in the comments at the bottom of the article. I pulled out the relevant part.

too much php Over a year ago

I believe the reason it takes a while on large tables is that MySQL has to read through the entire table, not that it is building a giant index. I could be wrong.

KajMagnus Over a year ago

@toomuchphp Yes actually "takes a while on large tables ...even if the column is ... all nulls" might as well be interpreted as "handling nulls is fast [because they are not indexed] but if table is huge .."

J.D. Fitz.Gerald · Accepted Answer · 2008-11-14 10:45:04Z

3

Allowing a column to be null will add a byte to the storage requirements of the column. This will lead to an increased index size, which is probably not good. That said, if a lot of your queries are changed to use "IS NULL" or "IS NOT NULL", they might be faster overall than doing value comparisons.

My gut would tell me not null, but there's one answer: test!

answered Nov 14, 2008 at 10:45

J.D. Fitz.Gerald


4 Comments

J.D. Fitz.Gerald Over a year ago

Question was whether the index would increase in size. Answer was that it would increase the index size in the second sentence.

user359996 Over a year ago

The title asks whether MySQL indexes null columns (it does). The description seems to ask a somewhat different question, but is really just an elucidation of why the (title) question was asked, in the first place. Moreover, since people largely choose whether or not to read a question based on its title, I'd say the title form overrides the description form, in most cases.

user359996 Over a year ago

Also, allowing a null column adds a byte to the row, not the column, unless there are already (a multiple of) 8 nullable columns, since the null is bitmapped. Indeed, this can very well save space, as null values only need to be stored in the bitmap.

user359996 Over a year ago

In this case, an INT column that is NULL 90% of the time takes 1 byte or less 90% of the time, and between 4 and 5 bytes 10% of the time. On average, that is significantly less than the 4 bytes per row it would always cost without allowing NULL.
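The arithmetic in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 90% NULL fraction comes from the question, and the bitmap cost is modelled as either 0 or 1 extra byte per row, depending on whether this column forces a new bitmap byte.

```python
import math

INT_BYTES = 4  # a MySQL INT value occupies 4 bytes


def null_bitmap_bytes(nullable_columns: int) -> int:
    """Bytes needed for a bitmap with one bit per nullable column."""
    return math.ceil(nullable_columns / 8)


def expected_column_bytes(null_fraction: float, value_bytes: int, bitmap_bytes: int) -> float:
    """Mean bytes per row: the bitmap cost is always paid,
    the value bytes are paid only when the value is not NULL."""
    return bitmap_bytes + (1.0 - null_fraction) * value_bytes


# Worst case: this column is the one that forces an extra bitmap byte.
worst = expected_column_bytes(0.9, INT_BYTES, bitmap_bytes=1)
# Best case: the bitmap byte already exists because of other nullable columns.
best = expected_column_bytes(0.9, INT_BYTES, bitmap_bytes=0)

print(f"nullable, 90% NULL: {best:.1f} to {worst:.1f} bytes per row on average")
print(f"NOT NULL:           {INT_BYTES} bytes per row")
```

Note that `null_bitmap_bytes` only grows when the number of nullable columns crosses a multiple of 8, so adding a nullable column often costs no extra byte at all.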

dkretz · Accepted Answer · 2012-03-10 01:35:24Z

1

No, it will continue to include them, but don't make too many assumptions about what the consequences are in either case. A lot depends on the range of other values (google for "cardinality").

MSSQL has a new index type called a "filtered index" for this type of situation (i.e. includes records in the index based on a filter). dBASE-type systems used to have a similar capability, and it was pretty handy.

edited Mar 10, 2012 at 1:35

answered Nov 14, 2008 at 2:22

dkretz



Piruz Hashemian · Accepted Answer · 2016-08-26 15:19:50Z

1

Each index has a cardinality, meaning how many distinct values are indexed. AFAIK it is not accurate to say that an index repeats the same value for many rows; rather, the index maps a repeated value to the clustered-index entries of the many rows that have it (here, the rows with NULL in this field). Keeping the clustered-index reference means that each row with a NULL in the indexed field still costs as much space as the PK. For this reason experts recommend keeping the PK reasonably small, especially if it is composite.

answered Aug 26, 2016 at 15:19

Piruz Hashemian



Anish Ramaswamy · Accepted Answer · 2025-06-28 18:30:45Z

This question leaves out a lot of detail like which storage engine you're using. Assuming you're using the more popular InnoDB - this doc goes into detail about the different row formats.

https://dev.mysql.com/doc/refman/8.4/en/innodb-row-format.html

Now you need to determine/decide which row format you're using/you want to use. Let's assume you're using the default, which is DYNAMIC (as decided by https://dev.mysql.com/doc/refman/8.4/en/innodb-parameters.html#sysvar_innodb_default_row_format).

Each index record contains a 5-byte header that may be preceded by a variable-length header. The header is used to link together consecutive records, and for row-level locking.

The variable-length part of the record header contains a bit vector for indicating NULL columns. If the number of columns in the index that can be NULL is N, the bit vector occupies CEILING(N/8) bytes. (For example, if there are anywhere from 9 to 16 columns that can be NULL, the bit vector uses two bytes.) Columns that are NULL do not occupy space other than the bit in this vector. The variable-length part of the header also contains the lengths of variable-length columns. Each length takes one or two bytes, depending on the maximum length of the column. If all columns in the index are NOT NULL and have a fixed length, the record header has no variable-length part.

For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes are only needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the 2-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.

The record header is followed by the data contents of non-NULL columns.

So with this we can conclude that:

1. Each index record reserves a small amount of extra space (the bit vector) in which InnoDB marks which of the index's nullable columns are NULL in that record. You can calculate its size with the formula given, CEILING(N/8) bytes, once you know how many nullable columns each index has.
2. The part of the record header that holds column lengths only includes entries for values that are not NULL.
3. The storage that follows the record header is only used for the data contents of non-NULL columns. This means that the NULL value itself is not stored per column, per row.

So the final conclusion here is that NULL values save space in the index: a NULL takes up no data bytes, only its bit in the record's bit vector, whose size depends on how many columns in the index are NULL-able.
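To put rough numbers on the original question, here is a back-of-the-envelope sketch in Python based on the record layout quoted above. It rests on several assumptions that are not in the question: a hypothetical 4-byte INT primary key (InnoDB secondary index records also carry the primary key columns), one million rows, and no page overhead, fill factor, or other bookkeeping. The function names are made up for illustration.

```python
import math

HEADER_BYTES = 5  # fixed record header, per the quoted documentation
INT_BYTES = 4     # size of the indexed INT column when it holds a value
PK_BYTES = 4      # ASSUMPTION: hypothetical 4-byte INT primary key


def index_record_bytes(is_null: bool, nullable_columns: int) -> int:
    """Approximate size of one secondary index record."""
    bitmap = math.ceil(nullable_columns / 8)  # NULL bit vector
    data = 0 if is_null else INT_BYTES        # NULLs store no column data
    return HEADER_BYTES + bitmap + data + PK_BYTES


def index_bytes(rows: int, null_rows: int, nullable_columns: int) -> int:
    """Approximate total size of all index records (no page overhead)."""
    non_null_rows = rows - null_rows
    return (null_rows * index_record_bytes(True, nullable_columns)
            + non_null_rows * index_record_bytes(False, nullable_columns))


ROWS = 1_000_000  # ASSUMPTION: illustrative table size

# Column declared NOT NULL, 90% of rows hold 0: every record stores 4 data bytes.
zeros = index_bytes(ROWS, null_rows=0, nullable_columns=0)
# Column nullable, 90% of rows hold NULL: those records store no data bytes.
nulls = index_bytes(ROWS, null_rows=900_000, nullable_columns=1)

saving = 1 - nulls / zeros
print(f"0 instead of NULL: {zeros:,} bytes")
print(f"NULL instead of 0: {nulls:,} bytes")
print(f"estimated saving:  {saving:.0%}")
```

Under these assumptions the index gets somewhat smaller, but nowhere near 90% smaller, because every row still has an index record with its header and primary key reference. Real numbers depend on the actual primary key and page-level overhead, so measuring on a copy of the table is still the safest check.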
