Returning an array of JSON objects in PostgreSQL

Published on July 24, 2017 under Blog

For my internship with Microsoft and UCL Institute of Child Health I had to work with PostgreSQL. My back-end was running on Node.js, so naturally I was interested in getting JSON formatted data out of Postgres whenever possible. Alternative solution was to handle all the logic using JavaScript and make multiple SQL queries, but that would harm performance.

Note: Example below was tested with PostgreSQL 9.6, if you have a different version your experience might differ.

Example database

Imagine a database where you store some records. Each record has an ID, and there are several fields associated with this record. Each field has multiple options, and each option has its own ID, name and order. Essentially, there are two main tables in our database - the fields tables:

| record_id | field_id |
|-----------|----------|
| abc       | xyz      |
| abc       | qwe      |

And the options table:

| field_id | option_id | name   | order |
|----------|-----------|--------|-------|
| xyz      | 123       | Age    | 1     |
| xyz      | 456       | Height | 2     |

As seen above, we have a single record abc, which has fields xyz and qwe. Field xyz has two options, 123 (Age) and 456 (Height) while qwe has no options at all. Option Age should appear before option Height due to the values in the order column.

Fetching an array of JSON objects

For the sake of this tutorial, let's assume we know the ID of the record (e.g. abc), and we want to fetch all fields associated with it. We also want to fetch all options for these fields in the same query, without having any duplicate data. Below you can see the query that does exactly what we want, and underneath it is an explanation of the signficant parts.

SELECT
  f.*,
  CASE WHEN count(o) = 0 THEN ARRAY[]::json[] ELSE array_agg(o.option) END AS options
FROM field f
  LEFT OUTER JOIN
  (
    SELECT o1.field_id, json_build_object('id', o1.option_id, 'name', o1.name) as option
    FROM options o1
    ORDER BY o1.order
  ) o
    ON f.field_id = o.field_id
WHERE f.record_id = 'abc'
GROUP BY f.field_id
  1. First, we have to create subquery which will convert the rows from our options table into their JSON representation. This happens on lines 7-9. We're using Postgres function json_build_object which allows you to build JSON objects using the pattern json_build_object('key1', value1, 'key2', value2, ...) and order everything using the order column.
  2. We perform a left outer join of the fields and options tables, as see on line 5. This will make sure all rows from the fields table end up in our final result, regardless of whether they have any options or not.
  3. As we don't want any duplicates in our data, we group everything by field_id on line 13.
  4. On line 3, we generate our array from the JSON objects we created in our subquery. array_agg function grabs all options that were grouped and places them into an array, essentially creating an array of JSON objects we were after. To avoid any issues, we use CASE WHEN ... THEN ... ELSE ... END construct to return an empty array (ARRAY[]::json[]) if there are no options. If we wouldn't do this, we could either get a NULL object instead of an empty array or some Postgres error.

If you run this query through some JavaScript Postgres framework like pg-promise your output will be an object like the one shown below. Note how the options property on each row contains an array of JSON objects we specified.

[
  {
    "record_id": "abc",
    "field_id": "xyz",
    "options": [
      {
        "id": "123",
        "name": "Age"
      },
      {
        "id": "456",
        "name": "Height"
      }
    ]
  },
  {
    "record_id": "abc",
    "field_id": "qwe",
    "options": []
  }
]

End of Article

Timur Kuzhagaliyev Author

I'm a computer science graduate from UCL & Caltech, working as a systems engineer at Jump Trading. Before Jump, I was doing computer vision for Video Quality Analysis team at Amazon.