PostgreSQL: Efficiently split JSON array into rows

aidan

I have a table (Table A) that includes a text column that contains JSON encoded data.

The JSON data is always an array with between one and a few thousand plain object.

I have another table (Table B) with a few columns, including a column with a datatype of 'JSON'

I want to select all the rows from table A, split the json array into its elements and insert each element into table B

Bonus objective: Each object (almost) always has a key, x. I want to pull the value of x out into column, and delete x from the original object (if it exists).

E.g.: Table A

| id | json_array (text)              |
+----+--------------------------------+
|  1 | '[{"x": 1}, {"y": 8}]'         |
|  2 | '[{"x": 2, "y": 3}, {"x": 1}]' |
|  3 | '[{"x": 8, "z": 2}, {"z": 3}]' |
|  4 | '[{"x": 5, "y": 2, "z": 3}]'   |

...would become: Table B

| id | a_id | x    | json (json)        |
+----+------+------+--------------------+
|  0 |    1 |    1 | '{}'               |
|  1 |    1 | NULL | '{"y": 8}'         |
|  2 |    2 |    2 | '{"y": 3}'         |
|  3 |    2 |    1 | '{}'               |
|  4 |    3 |    8 | '{"y": 2}'         |
|  5 |    3 | NULL | '{"z": 3}'         |
|  6 |    4 |    5 | '{"y": 2, "z": 3}' |

This initially has to work on a few million rows, and would then need to be run at regular intervals, so making it efficient would be a priority.

Is it possible to do this without using a loop and PL/PgSQL? I haven't been making much progress.

Patrick

The json data type is not particularly suitable (or intended) for modification at the database level. Extracting "x" objects from the JSON object is therefore cumbersome, although it can be done.

You should create your table B (with hopefully a more creative column name than "json"; I am using item here) and make the id column a serial that starts at 0. A pure json solution then looks like this:

INSERT INTO b (a_id, x, item)
  SELECT sub.a_id, sub.x,
         ('{' ||
         string_agg(
             CASE WHEN i.k IS NULL THEN '' ELSE '"' || i.k || '":' || i.v END,
             ', ') ||
         '}')::json
  FROM (
    SELECT a.id AS a_id, (j.items->>'x')::integer AS x, j.items
    FROM a, json_array_elements(json_array) j(items) ) sub
  LEFT JOIN json_each(sub.items) i(k,v) ON i.k <> 'x'
  GROUP BY sub.a_id, sub.x
  ORDER BY sub.a_id;

In the sub-query this extracts the a_id and x values, well as the JSON object. In the outer query the JSON object is broken into its individual pieces and the objects with key x thrown out (the LEFT JOIN ON i.k <> 'x'). In the select list the pieces are put back together again with string concatenation and grouped into compound objects.

This necessarily has to be like this because json has no built-in manipulation functions of any consequence. This works on PG versions 9.3+, i.e. since time immemorial insofar as JSON support is concerned.

If you are using PG9.5+, the solution is much simpler through a cast to jsonb:

INSERT INTO b (a_id, x, item)
  SELECT a.id, (j.items->>'x')::integer, j.items #- '{x}'
  FROM a, jsonb_array_elements(json_array::jsonb) j(items);

The #- operator on the jsonb data type does all the dirty work here. Obviously, there is a lot of work going on behind the scenes, converting json to jsonb, so if you find that you need to manipulate your JSON objects more frequently then you are better off using the jsonb type to begin with. In your case I suggest you do some benchmarking with EXPLAIN ANALYZE SELECT ... (you can safely forget about the INSERT while testing) on perhaps 10,000 rows to see which works best for your setup.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

PostgreSQL: Efficiently split JSON array into rows

From Dev

Postgresql: Split columns into rows

From Dev

Postgresql split column into rows

From Dev

Split array by rows

From Dev

How to extract values from array json column into multiple rows in Postgresql?

From Dev

How to efficiently search a JSON array

From Dev

How to efficiently search a JSON array

From Dev

Split array rows with spaces in Javascript

From Dev

Split up object array into rows

From Dev

How to split row into many rows in postgresql

From Dev

Postgresql select rows(a result) as array

From Dev

Select rows where nested json array field includes any of values from a provided array in PostgreSQL?

From Dev

Find cumsum of subarrays split by indices for numpy array efficiently

From Dev

Pyspark: Split multiple array columns into rows

From Dev

How to split json array into strings?

From Dev

how to split JSON array in php

From Dev

How to split a JSON object into an array

From Dev

Split a json array to multiple files

From Dev

Split JSON into multiple variables/ an array

From Dev

SQLAlchemy, PostgreSQL: Query for rows with a specific item in an array

From Java

Insert array as new rows in postgresql without a loop

From Dev

PostgreSQL Select rows based on combination of array values

From Dev

PostgreSQL Select rows based on combination of array values

From Dev

PostgreSQL Function: Return Multiple Rows as JSON

From Dev

Postgresql merge rows with same key (hstore or json)

From Dev

split json object from json array in java

From Dev

Split json data from json array in jquery

From Dev

Postgresql from Json Object to array

From Dev

Using indexes in json array in PostgreSQL

Related Related

HotTag

Archive