Use values from one column as column names and sum data from another table

Question

I'm going to do my best to describe this. So I have 3 tables as follows:

project

id	title	department_id
1001	Project One	50
1002	Project two	50

metric

id	name
101	Number of exhibits
102	Number of stakeholders

project_metric

id	metric_id	project_id	value
1	101	1001	10
2	101	1002	20
3	102	1001	5
4	102	1002	33

What I'm trying to do is write a query that will produce the below result.

project_title	Number of exhibits	Number of stakeholders
Project One	10	5
Project Two	20	33

I'm not going to paste any of my attempts because they're pathetic.

A couple of notes:

PostgreSQL version is 9.5 on this db.
(IMPORTANT) This query will be implemented inside a long query that displays some 65 columns of information about a Project, but for brevity obviously I've omitted some project attributes. I'll post a partial snippet of the current query.

SELECT DISTINCT
    p.title AS "Project Title", d.name AS "Center/Department",
// This is a query that currently exists in the long query I mentioned.
    (SELECT DISTINCT string_agg(DISTINCT re.name, ', ') AS "Research Exchange"
     FROM research_exchange re
         JOIN project_research_exchange pre ON pre.research_exchange_id = re.id
     WHERE pre.project_id = p.id)
FROM project p
    LEFT JOIN department d ON p.department_id = d.id
GROUP BY p.title
ORDER BY p.title

Postgres 9.5 is too old. Consider upgrading to a current version. https://www.postgresql.org/support/versioning/ — Erwin Brandstetter, Mar 12 '22 at 23:16
I am thinking to do that, but it runs on an old grails and jdk8 so I'm not that confident to do that yet. I dumped a copy and testing it locally just to see how that goes. — Mondi Koci, Mar 13 '22 at 15:15

Erwin Brandstetter · Accepted Answer · 2022-03-12T21:31:32.980

For many result columns I would suggest a crosstab query. If you are unfamiliar, read this first:

PostgreSQL Crosstab Query

Your test case seems misleading as it would not require aggregation. Assuming multiple rows per project and metric in reality, you need to aggregate sums first.

SELECT p.title AS project_title, ct.*
FROM   crosstab(
  'SELECT project_id, metric_id, sum(value) AS sum_val
   FROM   project_metric
   GROUP  BY 1, 2
   ORDER  BY 1, 2'
 , $$SELECT unnest('{101, 102}'::int[])$$    -- more ... ①, ③
   ) AS ct(project_id int
         , "Number of exhibits" bigint
         , "Number of stakeholders" bigint)  -- more ... ②
JOIN   project p ON p.id = ct.project_id;

db<>fiddle here

Aggregate and cross-tabulate the core table before joining. That's substantially cheaper.

What's more, no need to join to table metric at all if we translate metric_id to name as output column in the crosstab() function call directly. Just make sure to keep the array in the second function parameter ① in sync with the column definition list ②.

If you have many result columns ("columns of information"), or their number is changing dynamically, consider building the statement dynamically. See:

About dollar-quoting ③:

Insert text with single quotes in PostgreSQL

Very helpful, appreciated. I have been testing crosstab for the past two days, what I didn't think of was calling project_id and use it for the JOIN. — Mondi Koci, Mar 12 '22 at 22:42

score 0 · Answer 2 · answered Mar 08 '22 at 21:48

The display of multiple rows to columns can be accomplished but utilising a SQL Pivot. This can be accomplished using CrossTab in PostgrSQL.

Using your example tables above

CREATE TABLE Project
(
    ID INT,
    Title VARCHAR(20),
    Department_ID INT
 );
CREATE TABLE Metric
(
    ID INT,
    Name  VARCHAR(50)
 );
CREATE TABLE Project_Metric
(
    ID INT,
    Metric_ID INT,
    Project_ID INT,
    Value INT
);
INSERT INTO Project(ID,Title,Department_ID) VALUES (1001,'Project One',50);
INSERT INTO Project(ID,Title,Department_ID) VALUES (1002,'Project Two',50);
INSERT INTO Metric(ID,Name) VALUES(101,'Number of Exhibits');
INSERT INTO Metric(ID,Name) VALUES(102,'Number of Stakeholders');
INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,101,1001,10);
INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,101,1002,20);
INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,102,1001,5);
INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,102,1002,33);

Using Crosstab

SELECT * FROM crosstab('Select 
  PR.Title,
  ME.Name,
  PM.Value
 FROM
   Project_Metric PM
 JOIN
   Project PR 
  ON 
    PR.ID = PM.Project_ID
  JOIN
    Metric ME
  ON
   ME.ID = PM.Metric_ID') AS cr (Project_Title VARCHAR(20),NumberOfExhibits INT,NumberOfStakeHolders INT)
ORDER BY Project_Title;

This gives the required results as above.

You should avoid using SQL key words such as Value as column names.

Thanks, it did work. It's amazing how it saved me some 200 lines of repetitive subqueries. — Mondi Koci, Mar 12 '22 at 04:47
Oops. Quick update, this actually doesn't sum up, the values rather it displays individual values. which then repeats the same project title as many times. — Mondi Koci, Mar 12 '22 at 06:55

Use values from one column as column names and sum data from another table

2 Answers2