6

I have read in the PostgreSQL docs that without an ORDER BY clause, SELECT will return records in an unspecified order.

Recently, in an interview, I was asked how to SELECT records in the order in which they were inserted, without a PK, a created_at column, or any other field that could be used for ordering. The senior dev who interviewed me insisted that without an ORDER BY clause the records will be returned in the order in which they were inserted.

Is this true for PostgreSQL? Is it true for MySQL? Or any other RDBMS?

No. It cannot be guaranteed. And, after all, an INDEX is not a PRIMARY KEY, yet it will (equally indeterminately) upset the order. Commented Mar 7, 2020 at 21:18

4 Answers

9

I can answer for MySQL. I don't know for PostgreSQL.

The default order is not the order of insertion, generally.

In the case of InnoDB, the default order depends on the order of the index read for the query. You can get this information from the EXPLAIN plan.
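As a rough sketch of how to check this, assuming a hypothetical table `t` with a secondary index `idx_name` (neither name comes from the question), the `key` column of the EXPLAIN output names the index MySQL chose to read:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE t (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(10) NOT NULL,
  KEY idx_name (name)
) ENGINE=InnoDB;

-- The `key` column of the plan shows which index (if any) is read.
EXPLAIN SELECT name FROM t;

-- Rows from the query below tend to follow the order of that index,
-- but nothing guarantees it without an ORDER BY.
SELECT name FROM t;
```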

For MyISAM, rows are returned in the order in which they are read from the table. This might be the order of insertion, but MyISAM reuses the gaps left by deleted records, so newer rows may be stored earlier in the table.

None of this is guaranteed; it's just a side effect of the current implementation. MySQL could change the implementation in the next version, making the default order of result sets different, without violating any documented behavior.

So if you need the results in a specific order, you should use ORDER BY on your queries.
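A minimal sketch of what that could look like when insert order matters, assuming you are free to add a column (the table `events` and the column `seq` are hypothetical names):

```sql
-- Hypothetical table: seq records the order in which values were allocated.
CREATE TABLE events (
  seq     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

INSERT INTO events (payload) VALUES ('first');
INSERT INTO events (payload) VALUES ('second');

-- The ordering requirement is stated explicitly in the query.
SELECT payload FROM events ORDER BY seq;
```

Note the assumption here: an AUTO_INCREMENT value reflects the order in which values were handed out, which under concurrent transactions is not necessarily the order in which the rows were committed.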


1 Comment

+10 good answer. And whatever the default, or observed, returned order is, we should not rely on that to satisfy a requirement. For the benefit of the future reader/maintainer, there needs to be an indication of the requirement, "rows should be returned in insert order", and the query should make that explicit in the ORDER BY clause.
9

In the case of PostgreSQL, the interviewer's claim is quite wrong.

If there are no deletes or updates, rows will be stored in the table in the order you insert them. And even though a sequential scan will usually return the rows in that order, that is not guaranteed: the synchronized sequential scan feature of PostgreSQL can have a sequential scan "piggy back" on an already executing one, so that rows are read starting somewhere in the middle of the table.

However, this ordering of the rows breaks down completely if you update or delete even a single row: the old version of the row will become obsolete, and (in the case of an UPDATE) the new version can end up somewhere entirely different in the table. The space for the old row version is eventually reclaimed by autovacuum and can be reused for a newly inserted row.
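A small sketch to observe this in PostgreSQL, assuming a throwaway table named `demo` (a hypothetical name). The system column `ctid` shows the physical location of each row version:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE demo (id int, note text);

INSERT INTO demo VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- ctid is the (block, offset) position of the row version in the table.
SELECT ctid, * FROM demo;

-- The UPDATE writes a new row version at a different location.
UPDATE demo SET note = 'changed' WHERE id = 1;

-- Without ORDER BY, the updated row will typically no longer come first.
SELECT ctid, * FROM demo;
```

Comparing the two `ctid` listings shows the updated row at a new position, which is why a later sequential scan usually no longer reflects insertion order.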

3

Following BK's answer, and by way of example...

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table(id INT NOT NULL) ENGINE = MYISAM;

INSERT INTO my_table VALUES (1),(9),(5),(8),(7),(3),(2),(6);

DELETE FROM my_table WHERE id = 8;

INSERT INTO my_table VALUES (4),(8);

SELECT * FROM my_table;
+----+
| id |
+----+
|  1 |
|  9 |
|  5 |
|  4 | -- is this what
|  7 |
|  3 |
|  2 |
|  6 |
|  8 | -- we expect?
+----+

3

Without an ORDER BY clause, the database is free to return rows in any order. There is no guarantee that rows will be returned in the order they were inserted.

With MySQL (InnoDB), we observe that rows are typically returned in the order of an index used in the execution plan, or in the order of the table's cluster key.

It is not difficult to craft an example...

CREATE TABLE foo 
( id INT NOT NULL
, val VARCHAR(10) NOT NULL DEFAULT ''
, UNIQUE KEY (id,val) 
) ENGINE=InnoDB; 

INSERT INTO foo (id, val) VALUES (7,'seven') ;
INSERT INTO foo (id, val) VALUES (4,'four') ; 

SELECT id, val FROM foo ; 

MySQL is free to return rows in any order, but in this case, we would typically observe that MySQL will access rows through the InnoDB cluster key.

  id   val
----   ----- 
   4   four
   7   seven 

It is not at all clear what point the interviewer was trying to make. If the interviewer is trying to sell the idea that, given a requirement to return rows from a table in the order the rows were inserted, a query without an ORDER BY clause is ever the right solution, I'm not buying it.

We can craft examples where rows are returned in the order they were inserted, but that is a byproduct of the implementation, not guaranteed behavior, and we should never rely on that behavior to satisfy a specification.

6 Comments

It's quite difficult to craft an example that proves the interviewer wrong though
The question specifically mentioned the omission of a PK
@Strawberry: I missed the part about no PK; I suppose this also assumes that we can't create a UNIQUE index on non-NULL columns that would be used as the cluster key, and disallows a predicate that would make use of a secondary index, ...
I guess it's open to interpretation, but yes, I'd say so.
The interviewer would need to add several more restrictions: which storage engine (InnoDB and not MyISAM), an InnoDB cluster key on the synthetic rowid, no secondary index, no predicates in the query that would favor the use of a secondary index, ... or even without that, with an AUTO_INCREMENT column as the cluster key, a guarantee that later inserts will not supply a value for that AUTO_INCREMENT column lower than those of previously inserted rows.