19

Is there any way to have a Postgres LIKE query on a ARRAY field?

Currently I want something like that:

SELECT * FROM list WHERE lower(array_field) LIKE '1234%'

Currently lower is not needed that much. However it should find ONE matching field inside the ARRAY. Is that even possible?

Currently I use a materialized view to generate the "list" table with a JOIN and a ARRAY_AGG(), since I JOIN a table where more values could be on the right table. Which would duplicate fields on the left table, which is not what I want.

Edit this is how I create the view (really sluggish and ugly):

CREATE MATERIALIZED VIEW article_list_new AS
SELECT a.id, 
       a.oa_nr, 
       a.date_deleted, 
       a.lock, 
       a.sds_nr, 
       a.kd_art_nr, 
       a.kd_art_index, 
       a.kd_art_extend, 
       a.surface, 
       a.execution, 
       a.surface_area, 
       a.cu_thickness, 
       a.endintensity, 
       a.drilling, 
       array_agg(o.id::text) AS offer_list 
FROM article_list a LEFT JOIN task_offer o ON o.article = a.oa_nr 
GROUP BY .....;  

I also need to return the IDs of the task_offer table.

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
Christian Schmitt
  • 421
  • 2
  • 5
  • 12

2 Answers2

17

You can use unnest() like dezso commented, for instance with a LATERAL join:

SELECT l.*
FROM   list l, unnest(array_field) a  -- implicit lateral
WHERE  lower(a) LIKE '1234%';

You don't need any of this for the presented case. No materialized view at all. This query on the underlying tables is faster, because it can use an index:

SELECT *  -- or selected columns
FROM   article_list a
JOIN   LATERAL  (                      -- only matching IDs
   SELECT array_agg(id) AS offer_list
   FROM   task_offer o
   WHERE  o.article = a.oa_nr            -- LATERAL reference  
   AND    id::text ILIKE '1234%' COLLATE "C"  -- or just LIKE
   ) o ON offer_list IS NOT NULL;

Returns a single row from article_list with an array offer_list of matching IDs in task_offer (if any) - an array of the original data type.

Add an index using COLLATE "C", so it can be used for left-anchored LIKE patterns:

CREATE INDEX task_offer_foo_idx ON task_offer (article, (id::text) COLLATE "C");

In older versions of Postgres, the operator class text_pattern_ops served the same purpose. See:

Or use a trigram index for infix matches (not left-anchored). See:

Or maybe use a plain join for selective patterns:

SELECT a.*, o.offer_list -- or selected columns
FROM   article_list a
JOIN   (                          -- only matching IDs
   SELECT article, array_agg(id) AS offer_list
   FROM   task_offer
   WHERE  id::text ILIKE '1234%' COLLATE "C"  -- or just LIKE
   GROUP  BY 1
   ) o ON o.article = a.oa_nr;

Needs an index on article_list.oa_nr, too. (Which you probably have.) Like:

CREATE INDEX article_list_oa_nr_idx ON article_list (oa_nr);
Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
  • I added some data oh and I would need to return the IDs of the Joined table aswell – Christian Schmitt Oct 12 '15 at 14:29
  • 1
    Only IDs matching the pattern or all linked IDs? Please update the question to clarify. – Erwin Brandstetter Oct 12 '15 at 20:42
  • I added an answer for just matching IDs. – Erwin Brandstetter Oct 13 '15 at 04:15
  • wow thanks. however offer_list could also be empty, is that a problem at all, I mean how to write the ON part if offer_list is empty nvm seems to be the same as before..?? Also the index does not work since task_offer does not contains oa_nr, i think you meant article, right? – Christian Schmitt Oct 13 '15 at 07:35
  • Also I think 10ms (raising if my table raises is really slow..., considering that this is yet not a problem but currently one table contains 30k rows and the other 10k rows and we produce ~5k rows on the left and ~5k on the right each year we will be really really slow at a certain point, however I could use the query to generate a materialized view..) – Christian Schmitt Oct 13 '15 at 07:48
  • Oh I just found out if I remove the offer_list IS NOT NULL it won't match correctly so I need to set it if I have a LIKE against the offer id. – Christian Schmitt Oct 13 '15 at 08:35
  • Erwin Brandstetter is there a way to speed up LIKE queries against unnest with an index? – Christian Schmitt Oct 13 '15 at 10:18
  • @ChristianSchmitt: article was a typo, yes. No, you cannot index unnest() directly. Set-returning functions are not allowed as index columns. The best strategy overall depends on all the details that are not in your question. And 30k or even 300k rows should not be a problem at all. Be sure to understand query time and IO time. – Erwin Brandstetter Oct 13 '15 at 14:01
  • hm.. running with 100ms++ on a local machine with only 30k rows seems to be realy slow for me. And what i've seen is that ON offer_list IS NOT NULL; slows down the query by a magnitutde of 100x, however I don't now how to fix that. I mean the first entries that I run with LIKE are fast, however when searching for "older data" it gets really really slow. like Execution time: 153.829 ms looks like io time. – Christian Schmitt Oct 13 '15 at 14:21
  • @ChristianSchmitt: If you are looking for performance optimization, I suggest you start a *new question* where you provide the necessary details. Consider instructions at the tag info for [postgresql-performance]. Depending on data distribution etc. other query styles can be faster ... – Erwin Brandstetter Oct 13 '15 at 22:44
4

Have you considered the parray_gin extension, also located here?

I haven't used it in production yet, but from what I have seen of it it does what you want against the array. You can create the relevant index on the materialized view.

However, I would only do that if you are going to have the materialized view anyway. Otherwise I think @ErwinBrandstetter's answer is better than creating a materialized view only to hold the index.

jjanes
  • 39,726
  • 3
  • 37
  • 48