I have a table storing pipes together with their installation year and city. I need to calculate the total pipe length, and its percentage of the city total, by city and installation year.

Here is the result I'm looking for:

  city   |   install_year   |   length   |   percentage
---------+------------------+------------+---------------
  A      |  2014            |  90        |   32.14
  A      |  2013            |  70        |   25.00
  A      |  2012            |  120       |   42.85
  B      |  2010            |  325       |   100.0

I built a test table with this script:

CREATE TABLE pipes (gid serial NOT NULL, city TEXT, install_year INTEGER, length INTEGER) ;

INSERT INTO pipes (city, install_year, length) VALUES ('A',2014,10), ('A',2014,20), ('A',2014,60), ('A',2013,70), ('A',2012,120), ('B',2010,325) ; 

To write this query, I use a window function to calculate the total pipe length for each city, as follows:

SELECT
  city,
  install_year,
  sum(length) AS length,
  (sum(length)*100 / sum(length) OVER (PARTITION BY city)) AS percentage

FROM pipes 

GROUP BY city, install_year 

ORDER BY city, install_year DESC ;

I get an error message asking me to add column 'length' to the GROUP BY clause, which does not give the same result at all (and I do not want to group by length, it would be pointless).

Does anyone have an idea how to do this differently? I'm afraid I will have to use a temporary table with a WITH mytable AS (...) SELECT ....

  • sum(length)*100 / sum(length) is always 100 so that's probably not what you want. Commented May 6, 2015 at 14:58

2 Answers

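Window functions are evaluated after GROUP BY, so they only see the grouped rows. The raw length column is no longer available at that point, which is why your query raises the error. I recommend always using the WINDOW clause as a reminder of this.

You were close: you only need to sum the per-year sums to get the total by city.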

SELECT
  city,
  install_year,
  sum(length) AS length,
  sum(sum(length)) OVER w AS total_by_city,
  (sum(length) * 100) / (sum(sum(length)) OVER w) AS percentage
FROM pipes
GROUP BY city, install_year
WINDOW w AS (PARTITION BY city )
ORDER BY city, install_year DESC;
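
The evaluation order described above (aggregate first, then apply the window) can also be spelled out with a CTE. That is the form the question hoped to avoid, but it may make the two stages easier to see. This is only a sketch: the names per_year and year_length are made up for illustration, and round(..., 2) assumes two decimal places are wanted, as in the expected output.

```sql
-- Step 1: one row per (city, install_year) with its total length.
WITH per_year AS (
  SELECT
    city,
    install_year,
    sum(length) AS year_length
  FROM pipes
  GROUP BY city, install_year
)
-- Step 2: the window function now sees only the already-grouped rows.
SELECT
  city,
  install_year,
  year_length AS length,
  -- 100.0 keeps the division in numeric rather than integer arithmetic
  round(year_length * 100.0 / sum(year_length) OVER (PARTITION BY city), 2) AS percentage
FROM per_year
ORDER BY city, install_year DESC;
```

With the test data this should give 32.14, 25.00 and 42.86 for city A and 100.00 for city B. Note that rounding yields 42.86 where the question's table shows the truncated value 42.85.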

2 Comments

Thanks a lot! I had found another solution in the meantime (see next answer). Can you tell me your opinion about this?
Clever solution, but I think it's a bit less efficient, because using GROUP BY reduces the number of rows that the window functions need to process. It depends entirely on your real-world row count, though; a GROUP BY could hurt performance too. I also find your solution a bit harder to read: the DISTINCT doesn't help in understanding what the query does at first glance, whereas in my solution only the percentage needs some extra thought and the rest of the query is quite simple.
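
Which variant is cheaper is probably easiest to settle by measuring on the real table. One way to do that in PostgreSQL, sketched here against the test table, is to prefix each query with EXPLAIN (ANALYZE) and compare the reported plans and timings. On six rows the numbers will not be meaningful.

```sql
-- EXPLAIN (ANALYZE) actually executes the query and reports the plan
-- together with the measured time of each step.
EXPLAIN (ANALYZE)
SELECT
  city,
  install_year,
  sum(length) AS length,
  (sum(length) * 100) / (sum(sum(length)) OVER w) AS percentage
FROM pipes
GROUP BY city, install_year
WINDOW w AS (PARTITION BY city)
ORDER BY city, install_year DESC;
```

The same prefix can be put in front of the SELECT DISTINCT variant from the next answer to compare the two plans side by side.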

I had found this:

SELECT DISTINCT
  city,
  install_year,
  sum(length) OVER (PARTITION BY city, install_year) AS length,
  -- 100.0 forces numeric division; bigint / bigint would truncate to a whole number
  sum(length) OVER (PARTITION BY city, install_year) * 100.0 / sum(length) OVER (PARTITION BY city) AS percentage

FROM pipes

ORDER BY city, install_year DESC ;
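
A note on why the DISTINCT is needed here: a window function does not collapse rows, so without it the query returns one row per pipe, each carrying the totals of its partitions. A small sketch on the test table makes this visible. The aliases year_total and city_total are illustrative names only.

```sql
-- No GROUP BY and no DISTINCT: every pipe keeps its own row.
SELECT
  city,
  install_year,
  length,
  sum(length) OVER (PARTITION BY city, install_year) AS year_total,
  sum(length) OVER (PARTITION BY city) AS city_total
FROM pipes
ORDER BY city, install_year DESC;
```

This returns six rows. The three 2014 pipes of city A each show year_total = 90 and city_total = 280. Once the raw length column is dropped from the select list, those three rows become identical, and SELECT DISTINCT collapses them into one.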
