
I have a column of type JSON. I would like to run a query that selects all records with a distinct value for a particular field in the JSON document. For example, given these three documents:

{
  id: 1,
  s: "foo"
},
{
  id:2,
  s: "bar"
},
{
  id:3,
  s: "foo"
},

The query must check the "s" key for distinct values and return one document per value, here the documents with id 1 and 2.

Erwin Brandstetter
  • Please provide a proper example with valid syntax, your version of Postgres, the table definition and what you have tried - even if it's not working. – Erwin Brandstetter May 27 '15 at 16:47
  • @ErwinBrandstetter sorry about that, you are right. Thing is I don't actually have anything ready, I'm evaluating switching from MongoDB to PostgreSQL and I need to make sure that I can translate a distinct query in MongoDB to PostgreSQL. – Trasplazio Garzuglio May 27 '15 at 16:55
  • Here's an additional postgresql JSON question you might find interesting: https://dba.stackexchange.com/q/281480/45101 – blong Dec 14 '20 at 02:26

1 Answer


Assuming a JSON array in a Postgres 9.4 jsonb column, this would do the job:

SELECT DISTINCT ON (doc->'s') doc
FROM  (
   SELECT '[
    {
      "id":1,
      "s":"foo"
    },
    {
      "id":2,
      "s":"bar"
    },
    {
      "id":3,
      "s":"foo"
    }]'::jsonb AS j
   ) t
   , jsonb_array_elements(t.j) WITH ORDINALITY t1(doc, rn)
ORDER  BY doc->'s', rn;

Or, unless s is a nested object, it's probably cheaper to deduplicate on the text value instead of the jsonb (sub-)value. Just use the operator ->> instead of -> in both DISTINCT ON and ORDER BY in this case. The result is the same:

 doc
----------------------
'{"s": "bar", "id": 2}'
'{"s": "foo", "id": 1}'
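
For illustration, the ->> variant of the query above might look like this. It is only a sketch using the same sample data, condensed onto one line; the sole change is the operator in DISTINCT ON and ORDER BY:

```sql
-- Deduplicate on the text value of "s" (->>) rather than the jsonb value (->).
-- The DISTINCT ON expression must match the leading ORDER BY expression.
SELECT DISTINCT ON (doc->>'s') doc
FROM  (
   SELECT '[{"id":1,"s":"foo"},{"id":2,"s":"bar"},{"id":3,"s":"foo"}]'::jsonb AS j
   ) t
   , jsonb_array_elements(t.j) WITH ORDINALITY t1(doc, rn)
ORDER  BY doc->>'s', rn;  -- rn keeps the first array element per value
```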

Replace the subquery t with your actual table.
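
As a rough sketch of that substitution, assume a hypothetical table tbl with a primary key tbl_id and a jsonb column j holding the array. These names are placeholders and are not taken from the question:

```sql
-- Hypothetical table for illustration only; adapt names to your schema.
CREATE TEMP TABLE tbl (
   tbl_id serial PRIMARY KEY
 , j      jsonb NOT NULL
);

INSERT INTO tbl (j) VALUES
   ('[{"id":1,"s":"foo"},{"id":2,"s":"bar"},{"id":3,"s":"foo"}]');

-- Same query as above, reading from the table instead of the subquery.
SELECT DISTINCT ON (doc->>'s') doc
FROM   tbl t
     , jsonb_array_elements(t.j) WITH ORDINALITY t1(doc, rn)
ORDER  BY doc->>'s', t.tbl_id, rn;
```

As written, this deduplicates across all rows of the table. To get distinct values per row instead, t.tbl_id would have to lead both the DISTINCT ON list and the ORDER BY list.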

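Key elements are jsonb_array_elements() (or json_array_elements() for the json type) in an implicit LATERAL join (the comma in the FROM list) with WITH ORDINALITY, followed by the Postgres-specific DISTINCT ON. WITH ORDINALITY numbers the array elements, so ordering by rn keeps the first document for each distinct value.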


Erwin Brandstetter