FLATTEN

import FunctionDescription from '@site/src/components/FunctionDescription';

Transforms nested JSON data into a tabular format, where each element or field is represented as a separate row.

SQL Syntax

[LATERAL] FLATTEN ( INPUT => <expr> [ , PATH => <expr> ]
                                    [ , OUTER => TRUE | FALSE ]
                                    [ , RECURSIVE => TRUE | FALSE ]
                                    [ , MODE => 'OBJECT' | 'ARRAY' | 'BOTH' ] )
Parameter / KeywordDescriptionDefault
INPUTSpecifies the JSON or array data to flatten.-
PATHSpecifies the path to the array or object within the input data to flatten.-
OUTERIf set to TRUE, rows with zero results will still be included in the output, but the values in the KEY, INDEX, and VALUE columns of those rows will be set to NULL.FALSE
RECURSIVEIf set to TRUE, the function will continue to flatten nested elements.FALSE
MODEControls whether to flatten only objects ('OBJECT'), only arrays ('ARRAY'), or both ('BOTH').'BOTH'
LATERALLATERAL is an optional keyword used to reference columns defined to the left of the LATERAL keyword within the FROM clause. LATERAL enables cross-referencing between the preceding table expressions and the function.-

Output

The following table describes the output columns of the FLATTEN function:

ColumnDescription
SEQA unique sequence number associated with the input.
KEYKey to the expanded value. If the flattened element does not contain a key, it's set to NULL.
PATHPath to the flattened element.
INDEXIf the element is an array, this column contains its index; otherwise, it's set to NULL.
VALUEValue of the flattened element.
THISThis column identifies the element currently being flattened.

SQL Examples

SQL Examples 1: Demonstrating PATH, OUTER, RECURSIVE, and MODE Parameters

This example demonstrates the behavior of the FLATTEN function with respect to the PATH, OUTER, RECURSIVE, and MODE parameters.

SELECT
  *
FROM
  FLATTEN (
    INPUT => PARSE_JSON (
      '{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
    )
  );

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   seq          key              path              index                     value                                                                this                                                 
├────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────┤
      1  address           address                       NULL  {"city":"New York","state":"NY"}  {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} 
      1  languages         languages                     NULL  ["English","Spanish","French"]    {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} 
      1  name              name                          NULL  "John"                            {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} 
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

-- PATH helps in selecting elements at a specific path from the original JSON data.
SELECT
  *
FROM
  FLATTEN (
    INPUT => PARSE_JSON (
      '{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
    ),
    PATH => 'languages'
  );

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   seq          key              path              index             value                     this              
├────────┼──────────────────┼──────────────────┼──────────────────┼───────────────────┼────────────────────────────────┤
      1  NULL              languages[0]                     0  "English"          ["English","Spanish","French"] 
      1  NULL              languages[1]                     1  "Spanish"          ["English","Spanish","French"] 
      1  NULL              languages[2]                     2  "French"           ["English","Spanish","French"] 
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

-- RECURSIVE enables recursive flattening of nested structures.
SELECT
  *
FROM
  FLATTEN (
    INPUT => PARSE_JSON (
      '{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
    ),
    RECURSIVE => TRUE
  );

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   seq          key              path              index                     value                                                                this                                                 
├────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────┤
      1  address           address                       NULL  {"city":"New York","state":"NY"}  {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} 
      1  city              address.city                  NULL  "New York"                        {"city":"New York","state":"NY"}                                                                      
      1  state             address.state                 NULL  "NY"                              {"city":"New York","state":"NY"}                                                                      
      1  languages         languages                     NULL  ["English","Spanish","French"]    {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} 
      1  NULL              languages[0]                     0  "English"                         ["English","Spanish","French"]                                                                        
      1  NULL              languages[1]                     1  "Spanish"                         ["English","Spanish","French"]                                                                        
      1  NULL              languages[2]                     2  "French"                          ["English","Spanish","French"]                                                                        
      1  name              name                          NULL  "John"                            {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} 
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


-- MODE specifies whether only objects ('OBJECT'), only arrays ('ARRAY'), or both ('BOTH') should be flattened.
-- In this example, MODE => 'ARRAY' is used, which means that only arrays within the JSON data will be flattened.
SELECT
  *
FROM
  FLATTEN (
    INPUT => PARSE_JSON (
      '{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
    ),
    MODE => 'ARRAY'
  );

---


-- OUTER determines the inclusion of zero-row expansions in the output.
-- In this first example, OUTER => TRUE is used with an empty JSON array, which results in zero-row expansions. 
-- Rows are included in the output even when there are no values to flatten.
SELECT
  *
FROM
  FLATTEN (INPUT => PARSE_JSON ('[]'), OUTER => TRUE);

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   seq          key              path              index             value               this       
├────────┼──────────────────┼──────────────────┼──────────────────┼───────────────────┼───────────────────┤
      1  NULL              NULL                          NULL  NULL               NULL              
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘

-- In this second example, OUTER is omitted, and the output shows how rows with zero results are not included when OUTER is not specified.
SELECT
  *
FROM
  FLATTEN (INPUT => PARSE_JSON ('[]'));

SQL Examples 2: Demonstrating LATERAL FLATTEN

This example demonstrates the behavior of the FLATTEN function when used in conjunction with the LATERAL keyword.

-- Create a table for Tim Hortons transactions with multiple items
CREATE TABLE tim_hortons_transactions (
    transaction_id INT,
    customer_id INT,
    items VARIANT
);

-- Insert data for Tim Hortons transactions with multiple items
INSERT INTO tim_hortons_transactions (transaction_id, customer_id, items)
VALUES
    (101, 1, parse_json('[{"item":"coffee", "price":2.50}, {"item":"donut", "price":1.20}]')),
    (102, 2, parse_json('[{"item":"bagel", "price":1.80}, {"item":"muffin", "price":2.00}]')),
    (103, 3, parse_json('[{"item":"timbit_assortment", "price":5.00}]'));

-- Show Tim Hortons transactions with multiple items using LATERAL FLATTEN
SELECT
    t.transaction_id,
    t.customer_id,
    f.value:item::STRING AS purchased_item,
    f.value:price::FLOAT AS price
FROM
    tim_hortons_transactions t,
    LATERAL FLATTEN(input => t.items) f;

┌───────────────────────────────────────────────────────────────────────────┐
  transaction_id    customer_id      purchased_item         price       
├─────────────────┼─────────────────┼───────────────────┼───────────────────┤
             101                1  coffee                           2.5 
             101                1  donut                            1.2 
             102                2  bagel                            1.8 
             102                2  muffin                             2 
             103                3  timbit_assortment                  5 
└───────────────────────────────────────────────────────────────────────────┘

-- Find maximum, minimum, and average prices of the purchased items
SELECT
    MAX(f.value:price::FLOAT) AS max_price,
    MIN(f.value:price::FLOAT) AS min_price,
    AVG(f.value:price::FLOAT) AS avg_price
FROM
    tim_hortons_transactions t,
    LATERAL FLATTEN(input => t.items) f;

┌───────────────────────────────────────────────────────────┐
     max_price          min_price          avg_price     
├───────────────────┼───────────────────┼───────────────────┤
                 5                1.2                2.5 
└───────────────────────────────────────────────────────────┘
Last modified June 11, 2024 at 9:00 PM EST: clean up cautions and notes (d4a1b9a)