
PostgreSQL version: 9.6.18

Is it possible to select multiple rows at random from a table (possibly by randomly selecting values of a given column, for example the primary key or a sequence)? Say I have a table containing 20 rows and I want to return 4 of them at random. After some Googling I saw that, for a single row, combining offset, random() and limit in the select statement had been suggested as a solution. I tried to adapt that idea to return multiple random rows instead of just one. Here is my test case:

with testtab as
(
    select 'pkey-01' as primary_key, 'value-01' as colval union all
    select 'pkey-02' as primary_key, 'value-02' as colval union all
    select 'pkey-03' as primary_key, 'value-03' as colval union all
    select 'pkey-04' as primary_key, 'value-04' as colval union all
    select 'pkey-05' as primary_key, 'value-05' as colval union all
    select 'pkey-06' as primary_key, 'value-06' as colval union all
    select 'pkey-07' as primary_key, 'value-07' as colval union all
    select 'pkey-08' as primary_key, 'value-08' as colval union all
    select 'pkey-09' as primary_key, 'value-09' as colval union all
    select 'pkey-10' as primary_key, 'value-10' as colval union all
    select 'pkey-11' as primary_key, 'value-11' as colval union all
    select 'pkey-12' as primary_key, 'value-12' as colval union all
    select 'pkey-13' as primary_key, 'value-13' as colval union all
    select 'pkey-14' as primary_key, 'value-14' as colval union all
    select 'pkey-15' as primary_key, 'value-15' as colval union all
    select 'pkey-16' as primary_key, 'value-16' as colval union all
    select 'pkey-17' as primary_key, 'value-17' as colval union all
    select 'pkey-18' as primary_key, 'value-18' as colval union all
    select 'pkey-19' as primary_key, 'value-19' as colval union all
    select 'pkey-20' as primary_key, 'value-20' as colval
)
select
    t1.primary_key,
    t1.colval
from testtab as t1 offset floor(random() * (select count(*) from testtab as t2)) limit 4;

The code above, in which I changed limit 1 to limit 4, does return 4 rows that are random to some extent, in that the offset is random. The problem is that the 4 returned rows are always contiguous. For example, if the offset is 2 (two rows skipped), the query returns rows 3, 4, 5 and 6, in that order.

primary_key    colval
------------  ---------
pkey-03        value-03
pkey-04        value-04
pkey-05        value-05
pkey-06        value-06
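
A minimal sketch of why the rows come back as one block, assuming generate_series(1, 20) as a stand-in for the 20-row table (the alias row_no is made up for illustration). The offset expression is evaluated once for the whole query, so random() only decides where the block starts. The order by is added only to make the row order explicit.

```sql
-- Stand-in for the 20-row table: the integers 1..20 (an assumption for brevity).
select g as row_no
from generate_series(1, 20) as g
order by g
-- random() is evaluated once here, giving a single starting point in 0..19 ...
offset floor(random() * 20)::int
-- ... and limit then takes up to 4 consecutive rows from that point.
limit 4;
```

Note that when the offset happens to be larger than 16, fewer than 4 rows remain after it, so this pattern can also return fewer rows than requested.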

Is there any way to do this so that the returned rows are not a contiguous block? For example, instead of 3, 4, 5 and 6, the query would return four random rows, such as 13, 1, 8 and 16.

So I'm looking for something like the sample() function in R or PROC SURVEYSELECT in SAS that can achieve the same in PostgreSQL. Is it possible?

Thanks in advance.

    SELECT * FROM testtab ORDER BY random() LIMIT 4;? Commented Dec 14, 2020 at 19:01

1 Answer

You can order by random(), which also simplifies the query:

with testtab as
(
    select 'pkey-01' as primary_key, 'value-01' as colval union all
    select 'pkey-02' as primary_key, 'value-02' as colval union all
    select 'pkey-03' as primary_key, 'value-03' as colval union all
    select 'pkey-04' as primary_key, 'value-04' as colval union all
    select 'pkey-05' as primary_key, 'value-05' as colval union all
    select 'pkey-06' as primary_key, 'value-06' as colval union all
    select 'pkey-07' as primary_key, 'value-07' as colval union all
    select 'pkey-08' as primary_key, 'value-08' as colval union all
    select 'pkey-09' as primary_key, 'value-09' as colval union all
    select 'pkey-10' as primary_key, 'value-10' as colval union all
    select 'pkey-11' as primary_key, 'value-11' as colval union all
    select 'pkey-12' as primary_key, 'value-12' as colval union all
    select 'pkey-13' as primary_key, 'value-13' as colval union all
    select 'pkey-14' as primary_key, 'value-14' as colval union all
    select 'pkey-15' as primary_key, 'value-15' as colval union all
    select 'pkey-16' as primary_key, 'value-16' as colval union all
    select 'pkey-17' as primary_key, 'value-17' as colval union all
    select 'pkey-18' as primary_key, 'value-18' as colval union all
    select 'pkey-19' as primary_key, 'value-19' as colval union all
    select 'pkey-20' as primary_key, 'value-20' as colval
)
select
    t1.primary_key,
    t1.colval
from testtab t1
order by random()
limit 4;
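
The same technique can be tried against a real table rather than a long CTE. The following is only a sketch: the temporary table name testtab_demo is hypothetical, and generate_series with lpad is used here just to build the same 20 rows more compactly.

```sql
-- Hypothetical temporary table holding the same 20 test rows.
create temporary table testtab_demo as
select 'pkey-'  || lpad(g::text, 2, '0') as primary_key,
       'value-' || lpad(g::text, 2, '0') as colval
from generate_series(1, 20) as g;

-- Shuffle the rows by a per-row random value and keep the first 4.
select primary_key, colval
from testtab_demo
order by random()
limit 4;
```

This reads and sorts every row, which is fine for small tables. For large tables, PostgreSQL 9.5 and later also offer the TABLESAMPLE clause (SYSTEM and BERNOULLI methods), although it samples an approximate percentage of rows rather than an exact count.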

2 Comments

Thank you very much for your time and your help. Just to make sure I understand the concept correctly: in the solution you suggest, if the table has N rows, there will be N calls to random(), one per row. The N generated values are then sorted in the default ascending order, and because they were assigned to the rows at random, the sort does not leave the rows in contiguous blocks. Finally, LIMIT 4 retrieves the rows with the four smallest random values, which gives 4 randomly selected rows. Do I understand correctly?
Thank you very much. I didn't know this method. Very interesting and quite useful!
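
The reading of order by random() given in the first comment can be made visible by exposing the random value as a column. This is a sketch under assumptions: generate_series(1, 20) stands in for the 20-row table, and the aliases row_no and sort_key are made up for illustration.

```sql
select g as row_no,
       random() as sort_key   -- one random() call per row
from generate_series(1, 20) as g
order by sort_key             -- ascending sort on the random keys
limit 4;                      -- keep the 4 rows with the smallest keys
```

Each run lists four rows with sort_key in ascending order, while the row_no values are generally not contiguous.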
