In postgres, when I call a function on some data, like so:

select f(col_nums) from tbl_name
where col_str = '12345'

then the function f is applied to each row where col_str = '12345'.

On the other hand, if I call an aggregation function on some data, like so:

select g_agg(col_nums) from tbl_name
where col_str = '12345'

then the function g_agg is called on the entire column but returns a single value.

Q: How can I make a function that is applied to the entire column and returns a column of the same size, while being aware of all the values in the subset?

For example, can I create a function to calculate cumulative sum?

select *, sum_accum(col_nums) as cs from tbl_name
where col_str = '12345'

such that the result of the above query would look like this:

 col_str | more_cols |  col_nums   | cs
---------+-----------+-------------+----
  12345  |    567    |     1       |  1
  12345  |    568    |     2       |  3
  12345  |    569    |     3       |  6
  12345  |    570    |     4       | 10

Is there no choice but to pass a sub-query result to a function and then join with the original table?

2 Answers

Use window functions

A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.

e.g.

select *, sum(col_nums) OVER(PARTITION BY T.COLX, T.COLY) as cs 
from tbl_name T
where col_str = '12345'

Note that it is the addition of an over clause that changes an aggregate from its traditional use into a window function:

the OVER clause causes it to be treated as a window function and computed across an appropriate set of rows

The over clause can contain a partition by (analogous to group by), which controls the window the calculation is performed in, and an optional order by. With an order by, an aggregate such as sum becomes a running calculation; without one, it is computed over the whole partition.

select *
   -- running sum using an order by
 , sum(col_nums) OVER(PARTITION BY T.COLX ORDER BY T.COLY) as cs 

   -- without an order by, count covers the whole partition
 , count(*) OVER(PARTITION BY T.COLX) as cs_count
from tbl_name T
where col_str = '12345'
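
The reason an order by turns sum into a running sum is the window frame. When an order by is present and no frame is given, Postgres defaults to range between unbounded preceding and current row, which treats rows with equal order by values as peers. Here is a minimal sketch of the difference. The inline values list is made-up sample data (with a deliberate tie on more_cols), not the asker's table:

```sql
-- Made-up sample rows; the two rows with more_cols = 568 are peers.
select more_cols, col_nums
       -- rows frame: each row adds only itself to the running total
     , sum(col_nums) over (order by more_cols
                           rows between unbounded preceding and current row) as cs_rows
       -- range frame (the default with order by): peers are added together,
       -- so both 568 rows show the same total
     , sum(col_nums) over (order by more_cols
                           range between unbounded preceding and current row) as cs_range
from (values (567, 1), (568, 2), (568, 3), (570, 4)) as v(more_cols, col_nums)
order by more_cols, col_nums;
```

If the order by column can contain duplicates and you want a strict row-by-row accumulation, spelling out the rows frame (and adding a tie-breaker column to the order by) is usually the safer choice.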

1 Comment

Thanks! Exactly what I was looking for.

The function that you want is a cumulative sum. This is handled by window functions:

select t.*, sum(col_nums) over (order by more_cols) as cs
from tbl_name t
where col_str = '12345';

I am guessing that the ordering is defined by the second column (more_cols). The order by can use any column, including col_nums.
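
If you want to try this without touching real data, here is a small self-contained sketch that rebuilds the four sample rows from the question in a temporary table. The column types are an assumption on my part, inferred from the sample output:

```sql
-- Assumed schema, reconstructed from the question's sample output.
create temporary table tbl_name (
    col_str   text,
    more_cols integer,
    col_nums  integer
);

insert into tbl_name (col_str, more_cols, col_nums) values
    ('12345', 567, 1),
    ('12345', 568, 2),
    ('12345', 569, 3),
    ('12345', 570, 4);

select t.*
     , sum(col_nums) over (order by more_cols) as cs     -- running sum: 1, 3, 6, 10
     , sum(col_nums) over ()                   as total  -- same value (10) on every row
from tbl_name t
where col_str = '12345'
order by more_cols;
```

The empty over () is included only for contrast: every row still comes back, but each one sees the sum of the whole filtered set rather than a running total.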

You can do this for all values of col_str at the same time, using the partition by clause:

select t.*, sum(col_nums) over (partition by col_str order by more_cols) as cs
from tbl_name t
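
As for the literal question of creating your own function: in Postgres a user-defined aggregate can also be called with an over clause, so you don't need a sub-query and a join. The sketch below defines a hypothetical sum_accum (the name comes from the question) on top of the built-in int4pl, and assumes col_nums is an integer. The inline values list just stands in for the table:

```sql
-- Hypothetical aggregate named after the question's sum_accum.
-- int4pl is the built-in function behind integer "+"; because it is strict
-- and no initcond is given, the first non-null input seeds the state.
create aggregate sum_accum(integer) (
    sfunc = int4pl,
    stype = integer
);

select v.*
     , sum_accum(col_nums) over (partition by col_str order by more_cols) as cs
from (values ('12345', 567, 1),
             ('12345', 568, 2),
             ('12345', 569, 3),
             ('12345', 570, 4)) as v(col_str, more_cols, col_nums)
order by col_str, more_cols;
```

For a plain cumulative sum the built-in sum is all you need. A custom aggregate like this only earns its keep when the accumulation logic is something the built-ins don't cover.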

1 Comment

Thanks, Gordon. You and @Used_By_Already gave similar answers, but I gotta give it to him since he got here first.
