Get a random row from table foo.

create table foo as 
select generate_series(1,1000000) as val;

'val' is a serial field (not the primary key). The field doesn't have gaps.

The following query may return 0, 1, 2, 3, 4, ... rows, and val has a different value in each returned row. Why?

select * from foo where val = 
    (floor(random() * (select max(val) from foo))+1)::int;

If I change the query slightly to

select * from foo where val = 
    (select (floor(random() * (select max(val) from foo))+1)::int as v);

the result is as expected: a single random row from the table.

  • Please add some sample data (as formatted text!) and the expected output based on that sample data (and possibly include the create table statement for the foo table). Commented Jan 23, 2015 at 9:43
  • The create method is incorrect. Use it like this: create table bar as select generate_series(1,1000000) Commented Jan 23, 2015 at 9:51
  • I'm using PostgreSQL 9.3 64-bit on Windows 7, and I can confirm I get the same behaviour. If you do an "explain" on the two queries, you get different query plans. Why the first query behaves as it does is beyond me... Commented Jan 23, 2015 at 9:58
  • To WIngedPanther: I corrected the query. Commented Jan 23, 2015 at 10:22
  • Basically, the problem is in the last filter. In the second query the filter looks like "Filter: (val = $1)" and it always removes the same number of rows (1 less than the total row count in the table). In the first query the filter looks like " Filter: (val = (floor((random() * ($0)::double precision)))::integer)" and the number of removed rows differs from run to run. Commented Jan 23, 2015 at 10:29
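
The plan comparison described in the comments above can be reproduced with something like the following. This is a minimal sketch that assumes the table foo from the question; the exact plan text depends on the PostgreSQL version, so no output is shown here.

```sql
-- Assumes table foo from the question (val = 1..1000000, no gaps).

-- First query: check whether the Filter line of the scan on foo
-- contains the random() call itself.
explain
select * from foo where val =
    (floor(random() * (select max(val) from foo)) + 1)::int;

-- Second query: check whether the Filter line compares val with a
-- parameter such as $1 instead of calling random() directly.
explain
select * from foo where val =
    (select (floor(random() * (select max(val) from foo)) + 1)::int as v);
```

The thing to compare is the Filter line of the scan on foo in each plan.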

1 Answer

PostgreSQL's random() function is volatile, which means it may return a different value every time it is evaluated. Your first query compares a different random number with each row of the table; your second computes a single random value and compares every row with that.
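
One way to see the difference is to count the matching rows instead of selecting them. This is a rough sketch that assumes the table foo from the question; the alias matches is just an illustrative name.

```sql
-- Assumes table foo from the question (val = 1..1000000, no gaps).

-- random() appears directly in the WHERE clause, so it is evaluated
-- again for every row scanned. The count changes from run to run.
select count(*) as matches
from foo
where val = (floor(random() * (select max(val) from foo)) + 1)::int;

-- random() appears inside an uncorrelated scalar subquery. In the plans
-- reported in the comments this becomes a single parameter, so one
-- value is compared against every row and the count should be 1.
select count(*) as matches
from foo
where val = (select (floor(random() * (select max(val) from foo)) + 1)::int);
```

Running each statement several times should show the first count varying and the second staying the same.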

Suppose you want a shuffled deck of cards:

select * from deck_of_cards order by random();

Or maybe Yahtzee is more your thing:

select (floor(random()*6)+1)::int from generate_series(1,6);

2 Comments

Thanks a lot. I got the idea of why it happens, and I understand what happens in the execution plan. Since the function is volatile, in the first case the filter step has to compare the val field with a new random value on every row. With 1,000,000 rows, each row has a 1 in 1,000,000 chance that the generated value matches its val (given that val has no gaps). That is why this query returns a varying number of rows, from 0 to about 5, but statistically around 1 match.
I wrapped random() in a function declared immutable, and the planner then chose the expected "good" behavior.
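
The "statistically around 1" estimate can be made a little more precise. What follows is a back-of-the-envelope sketch, assuming the n = 1,000,000 values of val are distinct and gap-free and that the per-row random() calls are independent; the symbol K (number of rows returned) is introduced here for illustration only.

```latex
% Each row matches independently with probability 1/n.
K \sim \mathrm{Binomial}\!\left(n, \tfrac{1}{n}\right), \qquad
\mathbb{E}[K] = n \cdot \tfrac{1}{n} = 1

% For large n this is approximately Poisson with mean 1.
P(K = k) \approx \frac{e^{-1}}{k!}

% Approximate values:
% k = 0: 0.368   k = 1: 0.368   k = 2: 0.184
% k = 3: 0.061   k = 4: 0.015   k = 5: 0.003
```

Under these assumptions, getting no row at all is about as likely as getting exactly one, and more than five rows should be very rare, which is consistent with the 0 to 5 range observed.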
