0

I have a simple database with large amount of rows.

CREATE TABLE `tbl`(
    `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `name` NCHAR(50) NOT NULL
    ... other fields ...
  ) Engine=innodb;

I want to make paging imitation. And I know that classic offset is slow operation. Thats why I tried a trick from this website

My version of query is

SELECT * FROM `tbl` JOIN (SELECT id FROM `tbl` ORDER BY id LIMIT 1000000, 100) as b on `b`.`id` = `tbl`.`id`;

but the trick wasn't effective. The query works extremely slow, because mysql still reads all 1000000 lines.

And even separated inner query works witn primary keys only, but doesn't use a b-tree.

SELECT id FROM `tbl` ORDER BY id LIMIT 1000000, 100

So, why mysql engine doesn't use primary index when this query is totally covered with it?

Also I can't use where id > SOME_NUMBER condition because my table is very sparsed, and i don't know boundary ids of random page that was requested by user.

1 Answer 1

2

Consider the following, where I force 2.2M rows into a table

Schema

create table tbl
(   id int auto_increment primary key,
    thing int not null
)engine=MyISAM;

insert tbl(thing) values (7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7),(7);

insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;
insert into tbl(thing) select thing from tbl;

The Counts

select count(*) as theCount,max(id) as theMax,min(id) as thMin from tbl;
+----------+---------+-------+
| theCount | theMax  | thMin |
+----------+---------+-------+
|  2228224 | 2228224 |     1 |
+----------+---------+-------+

Query A (yours, uses a derived table)

explain 
SELECT *  
FROM `tbl` 
JOIN 
(SELECT id FROM `tbl` ORDER BY id LIMIT 1000000, 100) as b 
on `b`.`id` = `tbl`.`id`;
+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows    | Extra       |
+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL | 1000100 | NULL        |
|  1 | PRIMARY     | tbl        | eq_ref | PRIMARY       | PRIMARY | 4       | b.id |       1 | NULL        |
|  2 | DERIVED     | tbl        | index  | NULL          | PRIMARY | 4       | NULL | 2228224 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+

Query B (not a derived table)

explain 
SELECT t1.*  
FROM tbl t1 
JOIN tbl t2 
on t2.id = t1.id 
where t2.id>1000000 
limit 100 
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                | rows    | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+--------------------------+
|  1 | SIMPLE      | t2    | range  | PRIMARY       | PRIMARY | 4       | NULL               | 1195836 | Using where; Using index |
|  1 | SIMPLE      | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | so_gibberish.t2.id |       1 | NULL                     |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+--------------------------+

For those not familiar with using Explain, see a write-up I did here

Sign up to request clarification or add additional context in comments.

1 Comment

Does main distinction of Query B is "where"-condition insread of "offset"? As I mentioned, I don't know boundary ids. So, I haven't id for substitution in condition, if user requires random pages, not sequential ones. Anyway your answer helped to realize much complex solution, thank you.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.