2

I have a complex SELECT query table that identifies relationship between event_id and attribute. Here is a simplification with a VALUES expression:

SELECT event_id, attribute
FROM (
  VALUES
    (1, '2D'),
    (1, 'IMAX'),
    (2, 'IMAX'),
    (3, '3D')
) event_attribute (event_id, attribute)

I want to include an extra (event_id, '2D') record for every event_id that is not already associated with a 3D or 2D attribute. How to conditionally append rows to a result set?

In case of the above table, the expected result would be:

(1, '2D'),
(1, 'IMAX'),
(2, 'IMAX'),
(2, '2D'),
(3, '3D')

There is also a table event with one row for each relevant id.

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
Gajus
  • 1,185
  • 15
  • 28
  • Complex query. I don't mind disclosing the full version of the query. However, long queries scare of people. I think it makes sense to extract the essence of the question. Here is the query that I ended up with https://gist.github.com/gajus/a803c6ade49971ccc117da5dec4151c7. – Gajus May 13 '18 at 13:01
  • Sure. It makes a difference for the best solution, though. Is there a separate table with all relevant, unique event_id, like event? – Erwin Brandstetter May 13 '18 at 13:06
  • Yes, event table with id exists. – Gajus May 13 '18 at 13:08
  • 1
    And I assume you mean "for every event_id that is not already associated with 3D or 2D attribute"? Your expected result suggests as much. – Erwin Brandstetter May 13 '18 at 13:09
  • 1
    That is correct. Oversight. I have updated the question. – Gajus May 13 '18 at 13:11
  • Damned, now my answers are wrong. Should have seen that one coming. – Gerard H. Pille May 13 '18 at 14:45

4 Answers4

5

Assuming (like was added later) a separate table event with all relevant unique id - this helps performance a bit:

WITH cte(event_id, attribute) AS (
   -- big SELECT query goes here instead of the VALUES expression
   VALUES
    (1, '2D'),
    (1, 'IMAX'),
    (2, 'IMAX'),
    (3, '3D')
   )
TABLE cte
UNION ALL
SELECT e.id, '2D'
FROM   event e
LEFT   JOIN cte ON cte.event_id = e.id
               AND cte.attribute IN ('2D','3D')
WHERE  cte.event_id IS NULL;

Related:

If your query only returns a subset of all event_id, you can't use the table event to simplify like this. Alternative without table event:

WITH cte AS (
   -- big SELECT query goes here instead of the VALUES expression
   VALUES
    (1, '2D'),
    (1, 'IMAX'),
    (2, 'IMAX'),
    (3, '3D')
   )
TABLE cte
UNION ALL
SELECT event_id, '2D'
FROM   cte
GROUP  BY 1
HAVING count(*) FILTER (WHERE attribute IN ('2D', '3D')) = 0;

This is somewhat similar to what you answered yourself, just shorter and a bit faster. In particular the aggregate FILTER clause should be instrumental. Related:

Since there are no indexes on the derived table from the CTE, the second query may be faster to begin with.

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • I don't recognize the syntax of TABLE cte. What is that doing? – Noumenon May 13 '18 at 19:34
  • 1
    @Noumenon: Short for SELECT * FROM cte. I added a link to explain above: https://stackoverflow.com/questions/30275979/is-there-a-shortcut-for-select-from-in-psql/30276023#30276023 – Erwin Brandstetter May 14 '18 at 00:47
2

cfr. http://sqlfiddle.com/#!17/82868/6

create table t as SELECT
  event_attribute.event_id,
  event_attribute.attribute
FROM (
  VALUES
    (1, '2D'),
    (1, 'IMAX'),
    (2, 'IMAX'),
    (3, '3D')
) event_attribute (event_id, attribute);


select E, '2D' a from (
  select distinct event_id E from t
   where not exists (
     select null from t t1
      where t1.event_id = t.event_id and t1.attribute = '3D')) X;

Probably faster for larger datasets:

select E, '2D' a from (
  select distinct event_id E from t
  except
  select distinct event_id E from t
   where attribute = '3D') T2;

see http://sqlfiddle.com/#!17/82868/14

As, per the edited question, it's better to avoid duplicate '2D' entries, "attribute = '3D'" should be replaced by "attribute in ('3D','2D')".

Gerard H. Pille
  • 3,255
  • 1
  • 9
  • 13
2

For the record, this is what I have come up with:

WITH
  event_attribute AS
  (
    SELECT
      event_attribute.event_id,
      event_attribute.attribute
    FROM (
      VALUES
        (1, '2D'),
        (1, 'IMAX'),
        (2, 'IMAX'),
        (3, '3D')
    ) event_attribute (event_id, attribute)
  ),
  append_attribute_event AS
  (
    SELECT event_id
    FROM event_attribute
    GROUP BY event_id
    HAVING
      MAX(CASE "attribute" WHEN '2D' THEN 1 ELSE 0 END) = 0 AND
      MAX(CASE "attribute" WHEN '3D' THEN 1 ELSE 0 END) = 0
  )
SELECT * FROM event_attribute
UNION ALL
SELECT event_id, '2D' "attribute"
FROM append_attribute_event

However, I am pretty sure Gerard's suggestion is better.

Gajus
  • 1,185
  • 15
  • 28
1

A straight forward solution would be:

select event_id, attribute
from event_attribute
union all
select event_id, '2D' 
from event_attribute t1 
where not exists (
    select 1 from event_attribute t2 
    where t1.event_id = t2.event_id 
      and t2.attribute in ('2D','3D')
);
Lennart - Slava Ukraini
  • 23,240
  • 3
  • 32
  • 69