3

I'm going to do my best to describe this. So I have 3 tables as follows:

project

id title department_id
1001 Project One 50
1002 Project two 50

metric

id name
101 Number of exhibits
102 Number of stakeholders

project_metric

id metric_id project_id value
1 101 1001 10
2 101 1002 20
3 102 1001 5
4 102 1002 33

What I'm trying to do is write a query that will produce the below result.

project_title Number of exhibits Number of stakeholders
Project One 10 5
Project Two 20 33

I'm not going to paste any of my attempts because they're pathetic.

A couple of notes:

  1. PostgreSQL version is 9.5 on this db.
  2. (IMPORTANT) This query will be implemented inside a long query that displays some 65 columns of information about a Project, but for brevity obviously I've omitted some project attributes. I'll post a partial snippet of the current query.
SELECT DISTINCT
    p.title AS "Project Title", d.name AS "Center/Department",

// This is a query that currently exists in the long query I mentioned. (SELECT DISTINCT string_agg(DISTINCT re.name, ', ') AS "Research Exchange" FROM research_exchange re JOIN project_research_exchange pre ON pre.research_exchange_id = re.id WHERE pre.project_id = p.id)

FROM project p LEFT JOIN department d ON p.department_id = d.id GROUP BY p.title ORDER BY p.title

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
Mondi Koci
  • 33
  • 1
  • 8
  • Postgres 9.5 is too old. Consider upgrading to a current version. https://www.postgresql.org/support/versioning/ – Erwin Brandstetter Mar 12 '22 at 23:16
  • I am thinking to do that, but it runs on an old grails and jdk8 so I'm not that confident to do that yet. I dumped a copy and testing it locally just to see how that goes. – Mondi Koci Mar 13 '22 at 15:15

2 Answers2

1

For many result columns I would suggest a crosstab query. If you are unfamiliar, read this first:

Your test case seems misleading as it would not require aggregation. Assuming multiple rows per project and metric in reality, you need to aggregate sums first.

SELECT p.title AS project_title, ct.*
FROM   crosstab(
  'SELECT project_id, metric_id, sum(value) AS sum_val
   FROM   project_metric
   GROUP  BY 1, 2
   ORDER  BY 1, 2'
 , $$SELECT unnest('{101, 102}'::int[])$$    -- more ... ①, ③
   ) AS ct(project_id int
         , "Number of exhibits" bigint
         , "Number of stakeholders" bigint)  -- more ... ②
JOIN   project p ON p.id = ct.project_id;

db<>fiddle here

Aggregate and cross-tabulate the core table before joining. That's substantially cheaper.

What's more, no need to join to table metric at all if we translate metric_id to name as output column in the crosstab() function call directly. Just make sure to keep the array in the second function parameter ① in sync with the column definition list ②.

If you have many result columns ("columns of information"), or their number is changing dynamically, consider building the statement dynamically. See:

About dollar-quoting ③:

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • Very helpful, appreciated. I have been testing crosstab for the past two days, what I didn't think of was calling project_id and use it for the JOIN. – Mondi Koci Mar 12 '22 at 22:42
0

The display of multiple rows to columns can be accomplished but utilising a SQL Pivot. This can be accomplished using CrossTab in PostgrSQL.

Using your example tables above

CREATE TABLE Project
(
    ID INT,
    Title VARCHAR(20),
    Department_ID INT
 );

CREATE TABLE Metric ( ID INT, Name VARCHAR(50) );

CREATE TABLE Project_Metric ( ID INT, Metric_ID INT, Project_ID INT, Value INT

);

INSERT INTO Project(ID,Title,Department_ID) VALUES (1001,'Project One',50); INSERT INTO Project(ID,Title,Department_ID) VALUES (1002,'Project Two',50);

INSERT INTO Metric(ID,Name) VALUES(101,'Number of Exhibits'); INSERT INTO Metric(ID,Name) VALUES(102,'Number of Stakeholders');

INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,101,1001,10); INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,101,1002,20); INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,102,1001,5); INSERT INTO Project_Metric(ID,Metric_ID,Project_ID,Value) VALUES(1,102,1002,33);

Using Crosstab

SELECT * FROM crosstab('Select 
  PR.Title,
  ME.Name,
  PM.Value
 FROM
   Project_Metric PM
 JOIN
   Project PR 
  ON 
    PR.ID = PM.Project_ID
  JOIN
    Metric ME
  ON
   ME.ID = PM.Metric_ID') AS cr (Project_Title VARCHAR(20),NumberOfExhibits INT,NumberOfStakeHolders INT)
ORDER BY Project_Title;

This gives the required results as above.

You should avoid using SQL key words such as Value as column names.

armitage
  • 1,402
  • 2
  • 14
  • 20
  • Thanks, it did work. It's amazing how it saved me some 200 lines of repetitive subqueries. – Mondi Koci Mar 12 '22 at 04:47
  • Oops. Quick update, this actually doesn't sum up, the values rather it displays individual values. which then repeats the same project title as many times. – Mondi Koci Mar 12 '22 at 06:55