92

I have a table that looks like this:

id   count
1    100
2    50
3    10

I want to add a new column called cumulative_sum, so the table would look like this:

id   count  cumulative_sum
1    100    100
2    50     150
3    10     160

Is there a MySQL update statement that can do this easily? What's the best way to accomplish this?

9 Answers

119

Using a correlated query:


  SELECT t.id,
         t.count,
         (SELECT SUM(x.count)
            FROM YourTable x   -- x is a second alias for the same table
           WHERE x.id <= t.id) AS cumulative_sum
    FROM YourTable t
ORDER BY t.id
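To see the correlated query run end to end, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name `tab` and column name `cnt` are assumptions made for the example (not names from the question), and SQLite is not MySQL, so treat it as an illustration of the query shape only.

```python
import sqlite3

# Hypothetical in-memory table mirroring the question's data.
# SQLite stands in for MySQL here; `tab` and `cnt` are illustrative names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, cnt INTEGER)")
conn.executemany("INSERT INTO tab (id, cnt) VALUES (?, ?)",
                 [(1, 100), (2, 50), (3, 10)])

# For every outer row t, the inner query re-sums all rows with id <= t.id.
correlated_rows = conn.execute(
    """
    SELECT t.id,
           t.cnt,
           (SELECT SUM(x.cnt) FROM tab x WHERE x.id <= t.id) AS cumulative_sum
      FROM tab t
     ORDER BY t.id
    """
).fetchall()

for row in correlated_rows:
    print(row)
```

Because the inner query is evaluated once per outer row, the work grows roughly with the square of the row count unless the engine optimizes it.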

Using MySQL variables:


  SELECT t.id,
         t.count,
         @running_total := @running_total + t.count AS cumulative_sum
    FROM YourTable t
    JOIN (SELECT @running_total := 0) r
ORDER BY t.id

Note:

  • The JOIN (SELECT @running_total := 0) r is a cross join, and allows for variable declaration without requiring a separate SET command.
  • The table alias, r, is required by MySQL for any subquery/derived table/inline view

Caveats:

  • MySQL specific; not portable to other databases
  • The ORDER BY is important; it ensures the order matches the OP and can have larger implications for more complicated variable usage (i.e., pseudo ROW_NUMBER/RANK functionality, which MySQL lacks before version 8.0)

9 Comments

I would add "ORDER BY t.id ASC" to the main query, just to make sure it'll always work
My first thought also was to add ORDER BY. But it does not matter. Until addition turns into non-associative, at least :)
@OMG Ponies: I think you need to use a SELECT in the JOIN (SELECT @running_total := 0) part of the variables example.
for "using a correlated query" where does your table x come from ?
Unless there is optimization happening internally, the correlated subquery is the equivalent of a triangular join performing in O(N^2) time--which will not scale.
99

If performance is an issue, you could use a MySQL variable:

set @csum := 0;
update YourTable
set cumulative_sum = (@csum := @csum + count)
order by id;

Alternatively, you could remove the cumulative_sum column and calculate it on each query:

set @csum := 0;
select id, count, (@csum := @csum + count) as cumulative_sum
from YourTable
order by id;

This calculates the running sum in a running way :)
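The idea behind the variable approach (carry one total forward and visit each row once) can be sketched outside the database. This is only an illustration in Python using itertools.accumulate, not what MySQL does internally; the row data is copied from the question.

```python
from itertools import accumulate

# Rows as (id, count) pairs, already sorted by id, as in the question.
rows = [(1, 100), (2, 50), (3, 10)]

# accumulate() carries the running total forward, touching each row exactly once.
running_totals = list(accumulate(count for _, count in rows))

# Attach each running total to its row: (id, count, cumulative_sum).
running_rows = [(row_id, count, total)
                for (row_id, count), total in zip(rows, running_totals)]
print(running_rows)
```

A single pass like this is linear in the number of rows, which is why the variable version tends to hold up on large tables where a per-row subquery does not.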

7 Comments

Use a cross join to define the variable without needing to use SET.
My table has 36 million records, so this was really helpful to speed things up!
Note that ordering by cumulative_sum might force full table scan.
This does work and seems quite fast; any suggestions how this can be extended to do a cumulative sum in a group? e.g. group by Name or similar, and then do a cumulative sum only for records with the same name
Prefer answer of OLAP function in MySQL 8.0+, as stated in stackoverflow.com/a/52278657/3090068
53

MySQL 8.0+ and MariaDB 10.2+ support windowed SUM(col) OVER():

SELECT *, SUM(cnt) OVER(ORDER BY id) AS cumulative_sum
FROM tab;

Output:

┌─────┬──────┬────────────────┐
│ id  │ cnt  │ cumulative_sum │
├─────┼──────┼────────────────┤
│  1  │ 100  │            100 │
│  2  │  50  │            150 │
│  3  │  10  │            160 │
└─────┴──────┴────────────────┘
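The same window function extends to a running total per group by adding PARTITION BY. The sketch below runs it through Python's sqlite3 module as a stand-in for MySQL 8.0; it assumes the bundled SQLite is 3.25 or newer (the first release with window functions), and the `name` column and its values are hypothetical additions for the example.

```python
import sqlite3

# Hypothetical data: the question's table plus an assumed grouping column `name`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, name TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO tab (id, name, cnt) VALUES (?, ?, ?)",
    [(1, "a", 100), (2, "b", 50), (3, "a", 10), (4, "b", 5)],
)

# PARTITION BY restarts the running total for each distinct name.
grouped_rows = conn.execute(
    """
    SELECT id, name, cnt,
           SUM(cnt) OVER (PARTITION BY name ORDER BY id) AS cumulative_sum
      FROM tab
     ORDER BY id
    """
).fetchall()

for row in grouped_rows:
    print(row)
```

Rows 1 and 3 accumulate within group "a" and rows 2 and 4 within group "b", independently of each other.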


3 Comments

I am looking for a cumulative sum using a window function. Thank you.
@lukasz szozda, how would you insert this data into a database table column so it can be used in other tables? Thanks
@kejo INSERT INTO table_name(id, cnt, cumulative_sum) SELECT ... FROM ... or CREATE TABLE table_name AS SELECT ... FROM ...
3
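-- MySQL rejects a subquery that reads the table being updated (error 1093),
-- so compute the running totals in a derived table and join to it instead.
UPDATE t
JOIN (
  SELECT a.id, SUM(b.count) AS running_total
  FROM t a
  JOIN t b ON b.id <= a.id
  GROUP BY a.id
) s ON s.id = t.id
SET t.cumulative_sum = s.running_total;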

1 Comment

Although the OP did ask for an update, this is denormalized and will probably be inconvenient to maintain correctly.
3
select Id, Count, @total := @total + Count as cumulative_sum
from YourTable, (select @total := 0) as total
order by Id;

2 Comments

Please explain your answer
The answer works and is a one-liner. It also initializes/resets the variable to zero at the beginning of the select.
2

Sample query

SET @runtot:=0;
SELECT
   q1.d,
   q1.c,
   (@runtot := @runtot + q1.c) AS rt
FROM
   (SELECT
       DAYOFYEAR(date) AS d,
       COUNT(*) AS c
    FROM  orders
    WHERE  hasPaid > 0
    GROUP  BY d
    ORDER  BY d) AS q1


2

select id, count, sum(count) over (order by id) as cumulative_sum from tableName;

I have used the sum aggregate function on the count column and then used the over clause. It sums up each one of the rows individually. The first row is just going to be 100. The second row is going to be 100+50. The third row is 100+50+10 and so forth. So basically every row is the sum of it and all the previous rows and the very last one is the sum of all the rows. So the way to look at this is each row is the sum of the amount where the ID is less than or equal to itself.

3 Comments

While this might solve the problem, it's better to explain it a bit so it will benefit others :)
This isn't a correlated subquery, or a subquery for that matter. A correlated subquery follows the pattern SELECT ...., (SELECT .... FROM table2 WHERE table2.id = table1.id ) FROM table1. What you have is a window query.
A more detailed explanation of this windowing technique: dev.mysql.com/doc/refman/8.0/en/window-functions-frames.html
1

You could also create a trigger that will calculate the sum before each insert

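delimiter |

CREATE TRIGGER calCumluativeSum BEFORE INSERT ON someTable
  FOR EACH ROW BEGIN
    -- The new row is not in the table yet, so add NEW.count to the sum of the earlier rows.
    -- Assumes ids are supplied explicitly and inserted in increasing order.
    SET NEW.cumulative_sum = NEW.count + (
      SELECT COALESCE(SUM(x.count), 0)
        FROM someTable x
       WHERE x.id < NEW.id
    );
  END;
|

delimiter ;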

I have not tested this


0
  select t1.id, t1.count, SUM(t2.count) cumulative_sum
    from table t1 
        join table t2 on t1.id >= t2.id
    group by t1.id, t1.count

Step by step:

1 - Given the following table:

select *
from table t1 
order by t1.id;

id  | count
 1  |  11
 2  |  12   
 3  |  13

2 - Get information by groups

select *
from table t1 
    join table t2 on t1.id >= t2.id
order by t1.id, t2.id;

id  | count | id | count
 1  | 11    | 1  |  11

 2  | 12    | 1  |  11
 2  | 12    | 2  |  12

 3  | 13    | 1  |  11
 3  | 13    | 2  |  12
 3  | 13    | 3  |  13

3 - Sum all counts by t1.id group

select t1.id, t1.count, SUM(t2.count) cumulative_sum
from table t1 
    join table t2 on t1.id >= t2.id
group by t1.id, t1.count;


id  | count | cumulative_sum
 1  |  11   |    11
 2  |  12   |    23
 3  |  13   |    36
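The three steps can be checked with a small script. This is a sketch that uses Python's sqlite3 module as a stand-in for MySQL, with the hypothetical names `tab` and `cnt` replacing the placeholder table and the count column.

```python
import sqlite3

# Hypothetical in-memory copy of the step-by-step table (values 11, 12, 13).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, cnt INTEGER)")
conn.executemany("INSERT INTO tab (id, cnt) VALUES (?, ?)",
                 [(1, 11), (2, 12), (3, 13)])

# Step 2: pair every row with itself and all earlier rows (a "triangular" join).
pairs = conn.execute(
    """
    SELECT t1.id, t2.id
      FROM tab t1
      JOIN tab t2 ON t1.id >= t2.id
     ORDER BY t1.id, t2.id
    """
).fetchall()

# Step 3: collapse each t1 group by summing the paired counts.
join_rows = conn.execute(
    """
    SELECT t1.id, t1.cnt, SUM(t2.cnt) AS cumulative_sum
      FROM tab t1
      JOIN tab t2 ON t1.id >= t2.id
     GROUP BY t1.id, t1.cnt
     ORDER BY t1.id
    """
).fetchall()

print(len(pairs), join_rows)
```

With n rows the join produces n(n+1)/2 intermediate pairs (6 for the 3 rows here), so this form is easy to follow but grows quadratically with table size.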

1 Comment

Added some step by step to understand the final query
