
I've been tuning query performance for a system with pagination, trying to make the data selection as fast as possible, but I've come across something I don't quite understand. To my knowledge, when a LIMIT with an OFFSET is used, MySQL has to read every row up to the offset and then discard it, so in theory a query with an offset of 10,000 should be much slower than one without. That is normally true, as in this case:

select SQL_NO_CACHE * from `customers` where `NetworkID`='\func uuid()' 
    order by `DateTimeAdded` desc limit 0, 100;
/* finishes in 2.497 seconds */

select SQL_NO_CACHE * from `customers` where `NetworkID`='\func uuid()' 
    order by `DateTimeAdded` desc limit 10000, 100;
/* finishes in 2.702 seconds */

But if I use an inner join to join the table to itself, with a subquery that selects only the UserID column to do the sorting and limiting, the query is consistently faster with the offset of 10,000 than without one, which completely stumps me. For example:

select SQL_NO_CACHE * from `customers` 
    inner join (select `UserID` from `customers` where `NetworkID`='\func uuid()' 
        order by `DateTimeAdded` desc limit 100) 
    as `Results` using(`UserID`)
/* finishes in 1.133 seconds */

select SQL_NO_CACHE * from `customers` 
    inner join (select `UserID` from `customers` where `NetworkID`='\func uuid()' 
        order by `DateTimeAdded` desc limit 10000, 100) 
    as `Results` using(`UserID`)
/* finishes in 1.120 seconds */

Why is the query using the offset always faster than the query without the offset?


EXPLAINs:

I have posted the EXPLAIN output for these queries in a Google Docs spreadsheet here.

Note: The tests above were run from PHP, looping 20 times for each query

Note 2: `customers` is a view, not a base table

  • Try different offsets and see if you get the same trend; it might be that this specific offset happens to produce a very simple join. Commented Jun 4, 2015 at 14:21
  • I have; even if I do this with 30,000, it's still consistently faster than the query without the offset. Commented Jun 4, 2015 at 14:22
  • Optimize the table and see if it is the same (neutralize all unknown factors first). Commented Jun 4, 2015 at 14:23
  • @Brian Leishman: I wonder whether you always run those 2 queries in the same order for the tests. Commented Jun 4, 2015 at 14:23
  • I also thought of that @a1ex07; if I switch the order of the queries, the one with the offset still wins. Commented Jun 4, 2015 at 14:24

1 Answer


Case 1: The optimizer can use an index on the ORDER BY. LIMIT 10 will be faster than LIMIT 10000,10 because it can stop reading rows sooner.

Case 2: The optimizer cannot (or chooses not to) use an index for the ORDER BY. In this case, the entire set of rows (after the WHERE) is collected, that set is sorted, and only then are the OFFSET and LIMIT applied. Here the value of the OFFSET makes little difference; most of the time is consumed fetching the rows, filtering them, and sorting them.

INDEX(x,y)
SELECT ... WHERE x=2               ORDER BY y LIMIT ... -- case 1
SELECT ... WHERE x=2 AND deleted=0 ORDER BY y LIMIT ... -- case 2

INDEX(NetworkID, DateTimeAdded)         -- composite
SELECT ... WHERE NetworkID='...' ORDER BY DateTimeAdded DESC ... -- Case 1

INDEX(NetworkID), INDEX(DateTimeAdded)  -- separate
SELECT ... WHERE NetworkID='...' ORDER BY DateTimeAdded DESC ... -- Case 3

Case 3 might behave like Case 1 because it might use INDEX(DateTimeAdded). Or, if the optimizer chooses to use the other index, it becomes a slow Case 2. Either way, it is not as good as the composite index, which can handle both the WHERE and the ORDER BY.
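To see which case a given query actually lands in, and to move toward Case 1, something like the following sketch could be used. Since customers is a view (Note 2), the index has to be added on the underlying base table; customers_base below is only a placeholder for whatever that table is really called.

-- Sketch: composite index covering both the WHERE and the ORDER BY.
-- `customers_base` is a placeholder for the base table behind the view.
ALTER TABLE customers_base
    ADD INDEX idx_network_added (NetworkID, DateTimeAdded);

-- Check the plan afterwards: "Using filesort" in the Extra column means the
-- whole result set is sorted before OFFSET/LIMIT are applied (Case 2); if the
-- ORDER BY is satisfied by the index, no filesort appears (Case 1).
EXPLAIN
SELECT * FROM customers
WHERE NetworkID = '...'
ORDER BY DateTimeAdded DESC
LIMIT 10000, 100;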

If you can manage to get to Case 1, I recommend you also "remember where you left off" to make Pagination even more efficient. See my Pagination blog.
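A rough sketch of that "remember where you left off" approach, assuming pages are walked newest-first and the application remembers the DateTimeAdded and UserID of the last row it displayed (the UserID tie-breaker is an assumption, not something from the question):

-- First page: no offset at all.
SELECT * FROM customers
WHERE NetworkID = '...'
ORDER BY DateTimeAdded DESC, UserID DESC
LIMIT 100;

-- Next page: seek past the last row shown instead of using OFFSET.
-- :last_added and :last_userid are the values remembered from the previous page.
SELECT * FROM customers
WHERE NetworkID = '...'
  AND (DateTimeAdded < :last_added
       OR (DateTimeAdded = :last_added AND UserID < :last_userid))
ORDER BY DateTimeAdded DESC, UserID DESC
LIMIT 100;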

More on creating INDEXes.


3 Comments

Yes, I've read a ton about pagination and I've come across the "remember where you left off" strategy, but it wouldn't really help in my situation. Reason 1: it requires sequential numeric data, which I simply can't rely on because I merge data from multiple servers. Reason 2: I show a filtered subset of the data that the user can order in any way, so the rows are never in numerical order.
Have I at least explained why the OFFSET does not impact the speed? (Namely, that most of the time happened before applying the OFFSET.)
Yes, I do believe you've explained why my queries seemed odd, and thank you for that!
