Revisions to Why correlated scalar is 10x times slower in MySQL comparing to PG

Post Undeleted by Slimboy Fat

occurred Oct 3 at 11:51

added 2740 characters in body

edited Oct 3 at 11:51

Ok, so answering my own question.

create table ttt(id int, txt text);
insert into ttt
with recursive r(id) as
(select 1 union all select id + 1 from r where id < 6e6)
select id, concat('name', id)
from r;

Well, answering my own question.

create table ttt(id int, txt text); 

insert into ttt
with recursive r(id) as
(select 1 union all select id + 1 from r where id < 6e6)
select id, concat('name', id)
from r;
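
One caveat when reproducing the load on the MySQL side: assuming a stock MySQL 8.0 configuration, recursive CTEs are capped by cte_max_recursion_depth (default 1000), so the insert above would probably abort long before reaching 6 million rows unless the limit is raised for the session. A minimal sketch, where the value is simply the row count used in this test:

```sql
-- MySQL only: raise the recursion cap for this session before running the insert.
-- 6000000 matches the "id < 6e6" bound above; it is not a recommended general setting.
-- PostgreSQL has no comparable cap on recursive CTEs, so nothing is needed there.
set session cte_max_recursion_depth = 6000000;
```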

PG

postgres=# select count(*) from ttt;
  count
---------
 6000000
(1 row)


Time: 454.881 ms
postgres=# select sum(id) from ttt;
      sum
----------------
 18000003000000
(1 row)


Time: 544.896 ms
postgres=# select sum(id), max(txt) from ttt;
      sum       |    max
----------------+------------
 18000003000000 | name999999
(1 row)


Time: 2541.881 ms (00:02.542)

MySQL

mysql> select count(*) from ttt;
+----------+
| count(*) |
+----------+
|  6000000 |
+----------+
1 row in set (0.38 sec)

mysql> select sum(id) from ttt;
+----------------+
| sum(id)        |
+----------------+
| 18000003000000 |
+----------------+
1 row in set (6.04 sec)

mysql> select sum(id), max(txt) from ttt;
+----------------+------------+
| sum(id)        | max(txt)   |
+----------------+------------+
| 18000003000000 | name999999 |
+----------------+------------+
1 row in set (8.13 sec)
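
To rule out different access paths, it may help to look at the actual plans on both sides. The statements below are only a sketch of how I would check it: they assume PostgreSQL with the default psql client and MySQL 8.0.18 or later (where explain analyze is available), and no output is shown because it was not captured for this test.

```sql
-- PostgreSQL: should report a Seq Scan on ttt, plus buffer hits/reads.
explain (analyze, buffers) select sum(id) from ttt;

-- MySQL 8.0.18+: should report a table scan on ttt with actual timings.
explain analyze select sum(id) from ttt;
```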

Let's compare the size on disk:

PG

postgres=# select
postgres-#   c.relname table_name,
postgres-#   pg_relation_filepath(c.oid) file,
postgres-#   pg_size_pretty(pg_total_relation_size(c.oid)) total_size
postgres-# from pg_class c
postgres-# where c.relname = 'ttt';
 table_name |     file     | total_size
------------+--------------+------------
 ttt        | base/5/98593 | 253 MB
(1 row)

MySQL

mysql> select
    ->   t.table_name,
    ->   f.file_name file,
    ->   round((t.data_length + t.index_length)/1024/1024, 2) total_mb
    -> from information_schema.tables as t
    -> join information_schema.files as f
    ->   on f.tablespace_name = concat(t.table_schema, '/', t.table_name)
    -> where t.table_schema = 'mydb'
    ->   and t.table_name   = 'ttt'
    ->   and t.engine = 'InnoDB'
    ->   and f.file_type = 'TABLESPACE';
+------------+----------------+----------+
| TABLE_NAME | file           | total_mb |
+------------+----------------+----------+
| ttt        | ./mydb/ttt.ibd |   255.80 |
+------------+----------------+----------+

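So the size on disk is almost the same, but MySQL takes about 11x longer for sum(id) (6.04 s vs 0.54 s) and about 3x longer for sum(id), max(txt) (8.13 s vs 2.54 s). Only count(*) is comparable.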

It would be understandable if a full scan were 10% or 20% slower, or even 100% slower.

But 1000+% slower is a bit too much! ((6.04 - 0.54) / 0.54 * 100 ≈ 1018.5)
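
As a quick check of that figure, the relative slowdown can be computed in either database from the rounded sum(id) timings (0.54 s for PG, 6.04 s for MySQL). This is plain arithmetic; the percentage is the ratio multiplied by 100, and the column aliases are just illustrative names:

```sql
-- Works in both PostgreSQL and MySQL.
select
  round((6.04 - 0.54) / 0.54 * 100, 1) as pct_slower,   -- 1018.5
  round(6.04 / 0.54, 1)                as times_slower; -- 11.2
```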

PS.

postgres=# show max_parallel_workers_per_gather;
 max_parallel_workers_per_gather
---------------------------------
 0
(1 row)

mysql> show variables like 'innodb_file_per_table';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+
1 row in set (0.00 sec)

Post Deleted by Slimboy Fat

occurred Oct 3 at 11:16

answered Oct 3 at 11:16

Slimboy Fat

Ok, so answering my own question.

TL;DR: InnoDB is super slow for full table scans.

To demonstrate, let's create a bigger table and run some trivial queries.

create table ttt(id int, txt text);
insert into ttt
with recursive r(id) as
(select 1 union all select id + 1 from r where id < 6e6)
select id, concat('name', id)
from r;
