Overview
Array expansion lets you take a column that contains an array of values and expand each element into its own row during a sync. This is useful when your source data stores multiple identifiers or values in a single array column, but your destination expects one record per value.
For example, if a user has multiple device IDs stored in an array column, array expansion creates a separate sync row for each device ID, allowing you to sync each one individually to your destination.
Array expansion is currently supported with the lightning sync engine on Postgres, Snowflake, BigQuery, and Databricks sources.
How it works
When you map an array-type column as the identifier in your sync configuration, Hightouch automatically enables array expansion. Each element of the array becomes a separate row, and a generated column is used as the identifier value sent to the destination.
Example
Consider this source table with an array column device_ids:
| user_id | name | device_ids |
|---|---|---|
| 1 | Alice | ["ios_a1", "android_b2"] |
| 2 | Bob | ["ios_c3"] |
| 3 | Charlie | [] |
| 4 | Dana | ["web_d4", "web_d4"] |
When you map device_ids as the identifier for your sync, Hightouch expands each row so that every array element becomes its own record:
| _ht_device_ids_element | user_id | name |
|---|---|---|
ios_a1 | 1 | Alice |
android_b2 | 1 | Alice |
ios_c3 | 2 | Bob |
web_d4 | 4 | Dana |
The generated column _ht_device_ids_element contains the individual element value and is used as the identifier sent to the destination.
Empty arrays
Rows where the array column is empty ([]) or NULL produce zero output rows. In the example above, Charlie's row is omitted entirely because device_ids is an empty array.
Duplicate elements
If an array contains duplicate values (like Dana's two "web_d4" entries), the duplicates produce a primary key collision. The sync completes with a "Report" status and the duplicate rows are flagged as rejected. Only one row per unique element value is synced.
Configuration
Array expansion is configured automatically when you map an array-type column as the identifier in the "How should rows be matched?" section of your sync configuration.
When an array column is selected:
- Hightouch sets the
arrayExpansionSourceColumnin your sync config - The identifier mapping is automatically translated to use the generated element column (
_ht_{column}_element) - An info banner confirms that array expansion is active
To disable array expansion, change the identifier mapping to a non-array column.
Supported column types
Array expansion works with columns of the following types:
- Native arrays: Postgres
text[], SnowflakeARRAY, BigQueryARRAY<STRING>, DatabricksARRAY<STRING> - JSON arrays: Postgres
jsonb, SnowflakeVARIANT, BigQueryJSON - JSON string arrays:
VARCHARcolumns containing JSON-encoded arrays like'["a","b"]'
Hightouch automatically detects the column type and applies the appropriate expansion strategy.
Limitations
- Only one array column can be expanded per sync
- Array expansion requires the lightning sync engine (in-warehouse planner)
- Only supported on Postgres, Snowflake, BigQuery, and Databricks sources
- Duplicate values within an array cause row-level rejections
- Array expansion is not available on syncs that use Match Booster, as Match Booster applies its own identity resolution logic that is incompatible with row expansion