PostgreSQL: Display and count distinct occurrences of values across multiple columns

Question

Using PostgreSQL 9.4.1, I am trying to count how often each value occurs across three different columns. The table is named documents, and the columns are type, type1 and type2. A simplified example (apologies for the formatting, I can't get a proper table format):

CREATE TABLE documents
AS
  SELECT *
  FROM ( VALUES 
    ('USA','China','Africa'),
    ('China','USA','Chemicals'), 
    ('Chemicals','Africa','USA')
  ) AS t(type,type1,type2);

Below is \d+ of the table:

     Column     |  Type  |                       Modifiers                        
----------------+--------+--------------------------------------------------------
 id             | bigint | not null default nextval('documents_id_seq'::regclass)
 title          | text   | 
 description    | text   | 
 source         | text   | 
 url            | text   | 
 emaillink      | text   | 
 emailurl       | text   | 
 type           | text   | 
 language       | text   | 
 author         | text   | 
 publisheddate  | date   | default ('now'::text)::date
 comments       | text   | 
 classification | text   | 
 submittedby    | text   | 
 localurl       | text   | 
 type1          | text   | 
 type2          | text   | 
Indexes:
    "documents_pkey" PRIMARY KEY, btree (id)

I would like a query that returns:

Africa - 2   
Chemicals - 2  
China - 2   
USA - 3

This query is likely to be run frequently, so I'd like to avoid expensive queries if at all possible.

FuzzyTree · Accepted Answer · 2015-05-07 12:06:39Z

1

You can use UNION ALL to unpivot the columns into rows, then GROUP BY to count the occurrences of each type:

select type, count(*) from (
    select type from documents
    union all select type1 from documents
    union all select type2 from documents
) t1 group by type
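To try the unpivot-and-count idea without a PostgreSQL server, here is a minimal sketch that runs the same UNION ALL / GROUP BY shape against the question's sample rows. It assumes Python's built-in sqlite3 module as a stand-in database (SQLite is not PostgreSQL, though this particular query uses only syntax common to both). The function name count_values, the occurrences alias, the NULL filter and the ORDER BY clause are illustrative additions, not part of the answer.

```python
import sqlite3

# Sample rows taken from the question's CREATE TABLE ... VALUES example.
SAMPLE_ROWS = [
    ("USA", "China", "Africa"),
    ("China", "USA", "Chemicals"),
    ("Chemicals", "Africa", "USA"),
]


def count_values(rows):
    """Count how often each value appears across type, type1 and type2."""
    # Assumption: an in-memory SQLite database stands in for PostgreSQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (type TEXT, type1 TEXT, type2 TEXT)")
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", rows)
    query = """
        SELECT type, COUNT(*) AS occurrences
        FROM (
            SELECT type FROM documents
            UNION ALL SELECT type1 FROM documents
            UNION ALL SELECT type2 FROM documents
        ) AS t1
        WHERE type IS NOT NULL   -- illustrative: skip empty columns
        GROUP BY type
        ORDER BY type            -- illustrative: alphabetical output
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for value, occurrences in count_values(SAMPLE_ROWS):
        print(f"{value} - {occurrences}")
```

Ordering by occurrences DESC instead of type would list the most frequent values first.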

edited May 7, 2015 at 12:06

answered May 7, 2015 at 3:00


3 Comments

Brooks Over a year ago

Thanks so much! I only have about 1000 records, so I couldn't easily discern a difference in speed, but went with the full join anyways.

Brooks Over a year ago

How would I sort the full join?

Brooks Over a year ago

FYI, the full join doesn't actually merge the three columns for some reason. The output appears to output each column individually. I saw duplicates.

bdunn · Accepted Answer · 2015-05-07 03:02:04Z

1

Try this:

SELECT WORD, COUNT(1) AS OCCURRENCES
FROM (
    SELECT Type AS WORD FROM TableName
    UNION ALL
    SELECT Type1 FROM TableName
    UNION ALL
    SELECT Type2 FROM TableName) AS t
GROUP BY WORD;

answered May 7, 2015 at 3:02


1 Comment

Brooks Over a year ago

If I could select both as answers, I would, you were only what...2 minutes behind FuzzyTree?

Evan Carroll · Accepted Answer · 2017-03-10 07:31:38Z

0

Alternatively, you can use ARRAY[]/unnest():

SELECT x, count(x)
FROM (
  SELECT ARRAY[type,type1,type2] AS array
  FROM documents
) AS t
CROSS JOIN LATERAL unnest(t.array)
  AS x
GROUP BY x;

     x     | count 
-----------+-------
 USA       |     3
 China     |     2
 Chemicals |     2
 Africa    |     2
(4 rows)
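For readers who find the ARRAY[]/unnest() step hard to picture, the following is a plain-Python analogue rather than PostgreSQL code: each row's three values are treated as one list, the lists are flattened into a single stream (the role unnest() plays), and the stream is counted per value (the role of GROUP BY). The function name count_unnested and the sort order are illustrative assumptions. Unlike count(x) in the query above, which would report a NULL group with a count of 0, this sketch drops None values entirely.

```python
from collections import Counter
from itertools import chain

# Sample rows taken from the question's example data.
SAMPLE_ROWS = [
    ("USA", "China", "Africa"),
    ("China", "USA", "Chemicals"),
    ("Chemicals", "Africa", "USA"),
]


def count_unnested(rows):
    """Flatten every row into single values, then count each value."""
    # chain.from_iterable plays the role of unnest(): one value per output item.
    flattened = chain.from_iterable(rows)
    # Counter plays the role of GROUP BY x with a count; None values are skipped.
    counts = Counter(value for value in flattened if value is not None)
    # Most frequent first, ties broken alphabetically (illustrative choice).
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for value, occurrences in count_unnested(SAMPLE_ROWS):
        print(f"{value} - {occurrences}")
```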

answered Mar 10, 2017 at 7:31

