Query result precedence if filter columns have values

Question

I have a table "customer_config" with these columns:

company (varchar)
warehouse (numeric)
section (numeric)
config_keyword (varchar)
config_value (varchar)

The two config_* columns can apply to an entire company (warehouse and section are null), an entire warehouse within a company (section is null), or a section within a warehouse.

So we could have a default row for the company, and then one or more rows that override a configuration value for a specific warehouse or warehouse & section.

I want to return only the most specific row for a given company, warehouse, and section. Something like this pseudocode:

results = select * from customer_config where (all match)
if results empty
    results = select * from customer_config where (company_code and warehouse match)
if results empty
    results = select * from customer_config where (company_code matches)

The most specific row shall take precedence.

Update

There can be multiple entries for the same config_keyword on the same level.
Is it also possible to return multiple rows for a single keyword?

A "list of values" would be just another string, and the given solutions work for that all the same. It's unclear what more you need. (Probably not a simple "list of values"?) Please ask your new question as new question. Don't change the nature of a question after answers have been given. A generous bounty does not change that principle. — Erwin Brandstetter, Mar 31 '22 at 03:49
@ErwinBrandstetter Apologies for not being clear. By a list of values I mean that a single keyword could have multiple rows, each with a unique value — secondbreakfast, Mar 31 '22 at 05:24

Erwin Brandstetter · Accepted Answer · 2022-04-04T14:13:40.577

"Only the most specific row"

While looking for a single result row (like your original question indicated):

This is a bit more verbose, but very clear and probably fastest if supported with an index on (company_code, warehouse, section) - like you should probably have anyway (depends on undisclosed info).

SELECT * FROM customer_config
WHERE  (company_code, warehouse, section) = ($1, $2, $3)
UNION ALL
SELECT * FROM customer_config
WHERE  (company_code, warehouse) = ($1, $2)
UNION ALL
SELECT * FROM customer_config
WHERE  company_code = $1
LIMIT  1;

Postgres will stop executing as soon as a row has been found. Test with EXPLAIN ANALYZE, you'll see "never executed" for remaining sub-SELECTs.

Note that LIMIT 1 applies to the whole query, not the last SELECT. (You'd need parentheses to change that.) Similar:

How to force order of WHERE clauses in PostgreSQL?

"All rows for the most specific match"

If multiple rows can exist on each level.

Can be solved with pure SQL of course. For instance with a chain of CTEs. But this custom PL/pgSQL function should be more efficient:

CREATE OR REPLACE FUNCTION trade_volume (_company_code varchar, _warehouse numeric, _section numeric)
  RETURNS SETOF customer_config
  LANGUAGE plpgsql STABLE PARALLEL SAFE AS
$func$
BEGIN
   RETURN QUERY
   SELECT * FROM customer_config
   WHERE (company_code, warehouse, section) = ($1, $2, $3);
IF FOUND THEN RETURN; END IF;
RETURN QUERY
   SELECT * FROM customer_config
   WHERE (company_code, warehouse) = ($1, $2);
IF FOUND THEN RETURN; END IF;
RETURN QUERY
   SELECT * FROM customer_config
   WHERE  company_code = $1;
END
$func$;

Call:

SELECT * FROM trade_volume ('my_comany_code', 123456, 123);

Be sure to have the index mentioned above.

If the first query returns any rows, the function is done. The rest is not even planned. Etc.

Allow plpgsql's RETURN QUERY to execute its query using parallelism (Tom Lane)

Thanks, this looks almost exactly like what I need here, but I left out an important detail. It is possible for a keyword to have multiple records with different values. So some might be single key/value, but others could be a key/list of values. I modified your solution to union instead of union all as that seemed to give me unwanted duplicates, but the issue now is that it's evaluating the config_value as part of the union. — secondbreakfast, Mar 26 '22 at 07:23
@secondbreakfast: I added a solution for your extended question. In any case, UNION instead of UNION ALL would not make any sense for my query. — Erwin Brandstetter, Apr 01 '22 at 18:01
Awesome, thanks! This looks similar to an idea I had of just doing it programmatically in the server code, but having it within the database like this with a single db call would be more efficient. — secondbreakfast, Apr 04 '22 at 12:59
I know I'm asking a lot here....but due to constraints I am unable to create and rely on a function. Trying to wrap my head around chaining CTEs for a solution like you said...but I just can't figure it out. If someone could just point me in the direction of a blog post or something that gives a relevant example....I'd really appreciate it — secondbreakfast, Apr 04 '22 at 16:27
wait nevermind, I think I got it! Used multiple CTEs combined with "union all" and "not exists" for each more specific CTE — secondbreakfast, Apr 04 '22 at 16:40

score 3 · Answer 2 · answered Mar 25 '22 at 00:55

This should work:

SELECT DISTINCT ON (config_keyword)
       config_keyword, config_value
FROM customer_config
WHERE company_code = $1
  AND coalesce(warehouse, $2) = $2
  AND coalesce(section, $3) = $3
ORDER BY config_keyword,
         section IS NULL,
         warehouse IS NULL;

DISTINCT ON will return the first row for each config_keyword, and results where section is not NULL will sort before results where section is NULL, because FALSE < TRUE. The same for warehouse.

Eduard Okhvat · Answer 3 · 2022-04-03T02:51:36.203

0

I don't use PostgreSQL, but in oracle it can be done with analitic functions to return multiple rows for a single keyword:

select * 
  from (
     select rank() over (order by section, warehouse, company_code ) rn, 
            c.* 
       from customer_config c
      where company_code = $1
        and coalesce(warehouse, $2) = $2
        and coalesce(section, $3) = $3
       )
 where rn = 1;

edited Apr 03 '22 at 02:51

answered Apr 01 '22 at 03:39

Eduard Okhvat

606
4
14

Query result precedence if filter columns have values

Update

3 Answers3

"Only the most specific row"

"All rows for the most specific match"