Extract Nested Data
    • Dark
      Light

    Extract Nested Data

    • Dark
      Light

    Article Summary

    This article is specific to the following platforms - Snowflake - Redshift - BigQuery - Delta Lake.


    Extract Nested Data

    The Extract Nested Data component flattens nested data into rows. This is generally done by taking nested data in the form of key:value pairs (such as a JSON dictionary) and using those keys as column names.

    The flow into this component should include a single variant-type column that is to be unpacked.

    In Amazon Redshift, the flow into this component should include a single, SUPER type column to be unpacked.

    In Google BigQuery, when flattening data, the number of rows output will be the product of the number of items in every array field selected, per input row. If any are empty, no output rows will be returned.


    Properties

    Snowflake Properties

    PropertySettingDescription
    NameStringA human-readable name for the component.
    Include Input ColumnsSelectChoose whether to include input columns.
    ColumnsTree StructureDefine the variant's structure here. Unique endpoints in this structure will be unpacked into columns. The variant from the input can usually be automatically detected, and its structure determined, using the Autofill button. Elements of the structure can be manually edited by right-clicking them on the tree and selecting "Delete" or "Edit" as appropriate.
    Outer JoinTrue/FalseIf False (the default), any input rows that can't be expanded—either because they can't be accessed in the path or because they have zero fields or entries—are completely omitted from the output. If True, a single row is generated for expansions with zero rows.
    Input AliasStringIf two properties have identical names, one will be given this prefix to differentiate them. More than two identically named properties will result in an error. This does not need changing in the vast majority of use cases.
    Array PrefixStringIf two array structures have identical names, one will be given this prefix to differentiate them. More than two identically named structures will result in an error. This does not need changing in the vast majority of use cases.
    Casting MethodSelectSelect a casting method. Options include Fail on invalid data (default), Replace all unparseable values with null, or Replace unparseable dates and timestamps with null.
    Case Column Alias NamesSelectSet the casing for alias columns names. Settings include Lower, No, or Upper. The default is No.

    Redshift Properties

    PropertySettingDescription
    NameStringA human-readable name for the component.
    ColumnsSelectSelect the columns to be written to the table.
    Column AliasStringAn alias for the input component reference (for example, Table Input or Fixed Flow). The default alias is i.
    Include Input ColumnsSelectChoose whether or not to include input columns. The default setting is No.
    Array PrefixStringA prefix for the input's array. This property will increment inputs with one-based indexing (for example, f1, f2, f3, ...) where differentiation is required. The default value is f.

    BigQuery Properties

    PropertySettingDescription
    NameStringA human-readable name for the component.
    ColumnsSelectSelect the columns to be written to the table.
    Outer JoinTrue/FalseIf False (the default), any input rows that can't be expanded—either because they can't be accessed in the path or because they have zero fields or entries—are completely omitted from the output. If True, a single row is generated for expansions with zero rows.
    Column AliasesStringGive an alternative name for the columns you select.
    Include Input ColumnsSelectChoose whether or not to include input columns.
    Input Column PrefixStringGive a prefix name for the input columns.

    Delta Lake Properties

    PropertySettingDescription
    NameStringA human-readable name for the component.
    ColumnsSelectSelect the columns to be written to the table.
    Outer JoinTrue/FalseIf False (the default), any input rows that can't be expanded—either because they can't be accessed in the path or because they have zero fields or entries—are completely omitted from the output. If True, a single row is generated for expansions with zero rows.
    Column AliasesSource ColumnSelect the source column that you wish to provide an alternative name (an alias) for.
    Target ColumnProvide an alternative name for the column.
    This property is optional and both columns can be left empty if the user does not wish to provide alternative names for columns.
    Include Input ColumnsSelectChoose whether or not to include input columns in the data extraction.
    Input Column PrefixStringGive a prefix name for the input columns. The default is input_

    What's Next