Building a Table with Dynamic Columns from a Key-Value Array in Snowflake: A Step-by-Step Guide

Building a Table with Dynamic Columns from a Key-Value Array in Snowflake

In this article, we will explore how to build a table with dynamic columns based on a key-value array in Snowflake. We’ll start by creating a sample table, parsing the JSON data, and then pivoting the results to create the desired output.

Understanding the Problem

The problem statement involves creating a table with dynamic columns from a key-value array in Snowflake. The input data is stored in a table named “tab” and contains two JSON arrays: one for each row. We want to build a table where the column names are derived from the keys of these JSON arrays, and the values are the corresponding values from those arrays.

Sample Data

To demonstrate this concept, let’s first create a sample table with the desired data:

CREATE OR REPLACE TABLE tab
AS
SELECT 1 ID, PARSE_JSON('[{"key":"Apple", "value":2}, {"key":"Orange", "value":3}]') AS DATA
UNION 
SELECT 2, PARSE_JSON('[{"key":"Apple", "value":5}, {"key":"Orange", "value":4}, {"key":"Cookie", "value":4}]');

This creates a table with two rows and two JSON arrays: one for each row. The PARSE_JSON function is used to convert the JSON strings into tables that we can query.

Step 1: Parse the JSON Data

To parse the JSON data, we’ll use the FLATTEN function in combination with the LATERAL keyword:

SELECT id, s.VALUE:key::TEXT AS key, s.VALUE:value::TEXT AS value
FROM tab,
LATERAL FLATTEN(input=>tab.DATA) s;

The FLATTEN function flattens the JSON array into individual rows. The LATERAL keyword specifies that we want to include the original table row in the result set.

Step 2: Pivot the Results

To pivot the results and create the desired output, we’ll use a Common Table Expression (CTE) to extract the key-value pairs from each JSON array:

WITH cte AS (
    SELECT id, s.VALUE:key::TEXT AS key, s.VALUE:value::TEXT AS value
    FROM tab,
    LATERAL FLATTEN(input=>tab.DATA) s
)
SELECT *
FROM cte
PIVOT (
    MAX(value)
    FOR KEY IN ('Apple', 'Orange', 'Cookie')
) AS p;

The PIVOT function pivots the results based on the specified columns. In this case, we’re using three pivot columns: ‘Apple’, ‘Orange’, and ‘Cookie’. The MAX aggregation function is used to calculate the maximum value for each key-value pair.

How It Works

Here’s a step-by-step breakdown of how the query works:

  1. The first section parses the JSON data using the FLATTEN and LATERAL functions.
  2. The second section uses a CTE to extract the key-value pairs from each JSON array.
  3. The third section pivots the results using the PIVOT function.

Output

The final output will be a table with three columns: ‘Id’, ‘Apple’, and ‘Cookie’. The values in these columns are the corresponding maximum values for each key-value pair:

ID  APPLE   COOKIE
1   2       3
2   5       4

Conclusion

In this article, we demonstrated how to build a table with dynamic columns based on a key-value array in Snowflake. We used the FLATTEN and LATERAL functions to parse the JSON data and then pivoted the results using the PIVOT function. This approach allows us to create tables with flexible column structures, making it ideal for data analysis and reporting use cases.

Further Reading

For more information on Snowflake’s JSON functions and data manipulation capabilities, refer to the official Snowflake documentation:

We hope this article has been informative and helpful. If you have any questions or need further clarification on any of the concepts discussed, feel free to ask!


Last modified on 2023-10-25