This question is the PostgreSQL version of a question about MySQL 5.6 here. Originally, it was one question for both RDBMS's but it was suggested to me that, given the different capabilities of the two systems, I should split the question - in particular I think that CTEs (WITH clause) should make the query far more elegant and readable!
Suppose I have a list of tumours (this data is simulated from real data):
CREATE table illness (nature_of_illness VARCHAR(25), created_at DATETIME);
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Cervix', '2018-01-03 15:45:40');
INSERT INTO illness VALUES ('Lung', '2018-01-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2018-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2018-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2018-02-03 17:50:32');
INSERT INTO illness VALUES ('Cervix', '2018-02-03 17:50:32');
-- 2017, with 1 Cervix and Lung each for the month of Jan - tie!
INSERT INTO illness VALUES ('Cervix', '2017-01-03 15:45:40');
INSERT INTO illness VALUES ('Lung', '2017-01-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2017-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2017-02-03 17:50:32');
INSERT INTO illness VALUES ('Lung', '2017-02-03 17:50:32');
INSERT INTO illness VALUES ('Cervix', '2017-02-03 17:50:32');
You want to find out which particular tumour was most common in a given month - so far so good!
Now, you will notice that for month 1 of 2017, there is a tie - so it makes no sense whatsoever to randomly pick one and give that as the answer - so ties have to be included - this makes the problem much more challenging.
I have a solution but it's quite complex - I'd like to know if my solution is optimal or not. The PostgreSQL fiddle is here! The query in the fiddle is very cumbersome - I'll have a look at using CTE's.
My first answer (which works with both PostgreSQL and MySQL) is included in the fiddle but I won't post it here as I believe that it will be superceded by PostgreSQL's superior capabilities and it would be just a copy of my answer to the MySQL question!
optimal? For performance, short code, readability, elegance, portability? What is your first priority? How many rows, and how many distinct kinds of tumour? And is there a tabletumourwith 1 row per existing tumour? And what's your version of Postgres? – Erwin Brandstetter May 07 '18 at 16:15a given month, but query in the fiddle returns all months. Big difference. Which is it? – Erwin Brandstetter May 07 '18 at 16:21