3

I have a table "customer_config" with these columns:

company (varchar)
warehouse (numeric)
section (numeric)
config_keyword (varchar)
config_value (varchar)

The two config_* columns can apply to an entire company (warehouse and section are null), an entire warehouse within a company (section is null), or a section within a warehouse.

So we could have a default row for the company, and then one or more rows that override a configuration value for a specific warehouse or warehouse & section.

I want to return only the most specific row for a given company, warehouse, and section. Something like this pseudocode:

results = select * from customer_config where (all match)
if results empty
    results = select * from customer_config where (company_code and warehouse match)
if results empty
    results = select * from customer_config where (company_code matches)

The most specific row shall take precedence.

Update

There can be multiple entries for the same config_keyword on the same level.
Is it also possible to return multiple rows for a single keyword?

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • A "list of values" would be just another string, and the given solutions work for that all the same. It's unclear what more you need. (Probably not a simple "list of values"?) Please ask your new question as new question. Don't change the nature of a question after answers have been given. A generous bounty does not change that principle. – Erwin Brandstetter Mar 31 '22 at 03:49
  • @ErwinBrandstetter Apologies for not being clear. By a list of values I mean that a single keyword could have multiple rows, each with a unique value – secondbreakfast Mar 31 '22 at 05:24

3 Answers3

5

"Only the most specific row"

While looking for a single result row (like your original question indicated):

This is a bit more verbose, but very clear and probably fastest if supported with an index on (company_code, warehouse, section) - like you should probably have anyway (depends on undisclosed info).

SELECT * FROM customer_config
WHERE  (company_code, warehouse, section) = ($1, $2, $3)

UNION ALL SELECT * FROM customer_config WHERE (company_code, warehouse) = ($1, $2)

UNION ALL SELECT * FROM customer_config WHERE company_code = $1 LIMIT 1;

Postgres will stop executing as soon as a row has been found. Test with EXPLAIN ANALYZE, you'll see "never executed" for remaining sub-SELECTs.

Note that LIMIT 1 applies to the whole query, not the last SELECT. (You'd need parentheses to change that.) Similar:

"All rows for the most specific match"

If multiple rows can exist on each level.

Can be solved with pure SQL of course. For instance with a chain of CTEs. But this custom PL/pgSQL function should be more efficient:

CREATE OR REPLACE FUNCTION trade_volume (_company_code varchar, _warehouse numeric, _section numeric)
  RETURNS SETOF customer_config
  LANGUAGE plpgsql STABLE PARALLEL SAFE AS
$func$
BEGIN
   RETURN QUERY
   SELECT * FROM customer_config
   WHERE (company_code, warehouse, section) = ($1, $2, $3);

IF FOUND THEN RETURN; END IF;

RETURN QUERY SELECT * FROM customer_config WHERE (company_code, warehouse) = ($1, $2);

IF FOUND THEN RETURN; END IF;

RETURN QUERY SELECT * FROM customer_config WHERE company_code = $1; END $func$;

Call:

SELECT * FROM trade_volume ('my_comany_code', 123456, 123);

Be sure to have the index mentioned above.

If the first query returns any rows, the function is done. The rest is not even planned. Etc.

Related:

I made the function PARALLEL SAFE to allow parallelism in Postgres 14 or later. (Only relevant for big tables.) Quoting the release notes:

Allow plpgsql's RETURN QUERY to execute its query using parallelism (Tom Lane)

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • Thanks, this looks almost exactly like what I need here, but I left out an important detail. It is possible for a keyword to have multiple records with different values. So some might be single key/value, but others could be a key/list of values. I modified your solution to union instead of union all as that seemed to give me unwanted duplicates, but the issue now is that it's evaluating the config_value as part of the union. – secondbreakfast Mar 26 '22 at 07:23
  • @secondbreakfast: I added a solution for your extended question. In any case, UNION instead of UNION ALL would not make any sense for my query. – Erwin Brandstetter Apr 01 '22 at 18:01
  • Awesome, thanks! This looks similar to an idea I had of just doing it programmatically in the server code, but having it within the database like this with a single db call would be more efficient. – secondbreakfast Apr 04 '22 at 12:59
  • I know I'm asking a lot here....but due to constraints I am unable to create and rely on a function. Trying to wrap my head around chaining CTEs for a solution like you said...but I just can't figure it out. If someone could just point me in the direction of a blog post or something that gives a relevant example....I'd really appreciate it – secondbreakfast Apr 04 '22 at 16:27
  • wait nevermind, I think I got it! Used multiple CTEs combined with "union all" and "not exists" for each more specific CTE – secondbreakfast Apr 04 '22 at 16:40
3

This should work:

SELECT DISTINCT ON (config_keyword)
       config_keyword, config_value
FROM customer_config
WHERE company_code = $1
  AND coalesce(warehouse, $2) = $2
  AND coalesce(section, $3) = $3
ORDER BY config_keyword,
         section IS NULL,
         warehouse IS NULL;

DISTINCT ON will return the first row for each config_keyword, and results where section is not NULL will sort before results where section is NULL, because FALSE < TRUE. The same for warehouse.

Laurenz Albe
  • 51,298
  • 4
  • 39
  • 69
0

I don't use PostgreSQL, but in oracle it can be done with analitic functions to return multiple rows for a single keyword:

select * 
  from (
     select rank() over (order by section, warehouse, company_code ) rn, 
            c.* 
       from customer_config c
      where company_code = $1
        and coalesce(warehouse, $2) = $2
        and coalesce(section, $3) = $3
       )
 where rn = 1;
Eduard Okhvat
  • 606
  • 4
  • 14