Supported syncing
| Type | Description | Supported Sync Modes |
|---|---|---|
| Any data set | Sync data from any source to a Snowflake Iceberg table | Upsert, Insert, Mirror |
For more information about sync modes, refer to the sync modes docs.
This destination is for Snowflake-managed Iceberg tables. If you want to sync to standard Snowflake tables, use the Snowflake destination instead.
Prerequisites
Before setting up the Snowflake Iceberg destination, you must have an external volume configured in Snowflake. Iceberg tables require an external volume that specifies where the table data and metadata are stored.
Snowflake credential setup
To allow Hightouch access to Snowflake for Iceberg tables, it's best to create a user specifically provisioned with access to the required tables, schemas, and external volumes. You can also use a personal Snowflake login for your credentials, as long as it has the correct permissions, specifically:
- Add, update, and delete (if applicable) rows in your sync's Iceberg table
- View the
INFORMATION_SCHEMA.COLUMNStable which is used for gathering metadata to set up the sync USAGEprivilege on the external volume used for Iceberg tables
Service account setup
You can use the following SQL template to create a service account with the necessary roles and permissions:
-- Edit the following variables
set ht_username='HIGHTOUCH_USER';
set ht_default_warehouse='<warehouse>';
set ht_database='<database>';
set ht_default_namespace='<database.schema>';
set ht_default_role='HIGHTOUCH_ICEBERG_ROLE';
set ht_external_volume='<external_volume_name>';
set ht_comment='Used for Hightouch Iceberg integrations';
-- Set role for grants
USE ROLE ACCOUNTADMIN;
-- Create a role for Hightouch
CREATE ROLE IF NOT EXISTS identifier($ht_default_role)
COMMENT = $ht_comment;
-- Create Hightouch's user
CREATE USER IF NOT EXISTS identifier($ht_username)
TYPE=service
rsa_public_key='<public_key>'
DEFAULT_WAREHOUSE=$ht_default_warehouse
DEFAULT_NAMESPACE=$ht_default_namespace
DEFAULT_ROLE=$ht_default_role
COMMENT=$ht_comment;
-- Grant permissions to the role
GRANT ROLE identifier($ht_default_role) TO ROLE SYSADMIN;
GRANT USAGE ON WAREHOUSE identifier($ht_default_warehouse) TO ROLE identifier($ht_default_role);
GRANT ROLE identifier($ht_default_role) TO USER identifier($ht_username);
GRANT USAGE ON DATABASE identifier($ht_database) TO ROLE identifier($ht_default_role);
GRANT USAGE ON ALL SCHEMAS IN DATABASE identifier($ht_database) TO ROLE identifier($ht_default_role);
GRANT SELECT ON ALL TABLES IN DATABASE identifier($ht_database) TO ROLE identifier($ht_default_role);
GRANT SELECT ON FUTURE TABLES IN DATABASE identifier($ht_database) TO ROLE identifier($ht_default_role);
GRANT SELECT ON ALL VIEWS IN DATABASE identifier($ht_database) TO ROLE identifier($ht_default_role);
GRANT SELECT ON FUTURE VIEWS IN DATABASE identifier($ht_database) TO ROLE identifier($ht_default_role);
-- Iceberg-specific permissions
GRANT USAGE ON EXTERNAL VOLUME identifier($ht_external_volume) TO ROLE identifier($ht_default_role);
GRANT CREATE ICEBERG TABLE ON SCHEMA <database.schema> TO ROLE identifier($ht_default_role);
Once you've created a Snowflake service account, you're ready to set up the connection in Hightouch.
Connect to Snowflake Iceberg
Go to the Destinations overview page and click the Add destination button. Select Snowflake Iceberg and click Continue. If you're not using a tunnel, you can then authenticate Hightouch to Snowflake by entering the following fields:
- Account identifier - You can find this identifier at the beginning of your Snowflake URL, for example, (
https://ACCOUNT_IDENTIFIER.snowflakecomputing.com). - Warehouse - This specifies the warehouse to use when Hightouch executes queries in Snowflake.
- Database - This specifies the database to use when Hightouch executes queries in Snowflake.
Account identifier format may differ based on Snowflake account age. For example, older Snowflake accounts often have identifiers that look like ACCOUNT_LOCATOR.CLOUD_REGION_ID.CLOUD, whereas newer Snowflake accounts have identifiers that look like ORGNAME-ACCOUNT_NAME.
For more details, visit Snowflake's account identifier docs.
Iceberg configuration
- External volume - The Snowflake external volume that provides storage for Iceberg tables. This must be configured in your Snowflake account before setting up the destination.
- Base location (optional) - A prefix path within the external volume where Iceberg tables will be stored.
Credentials
- Username - This can be your personal Snowflake login or a dedicated user for Hightouch. Note: Use the "Login name", which may differ from the Snowflake username.
- Role (optional) - Use this field to specify the role Hightouch should use when executing queries in Snowflake. If left blank, Hightouch uses the user's default role.
You have two options for finalizing your credentials:
- RSA key pair (recommended)
- Password
Snowflake is deprecating password authentication in favor of RSA key pair authentication or multi-factor authentication. Hightouch will continue to support password authentication for existing customers until they have migrated.
To ensure your credentials are correct, click Test connection. This confirms if Hightouch can connect to your database and verify access to the external volume.
RSA authentication
Generate a private key and public key by running the following commands in your terminal:
$ openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8
$ openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
Save the password you used for the key, you will need to enter it when creating the Snowflake Iceberg destination in Hightouch.
Then in Snowflake, execute an ALTER USER command to assign the public key to your Hightouch Snowflake user:
alter user ht_user set rsa_public_key='MIIBIjANBgkqh...';
Finally, in Hightouch under the Authentication Method section, drag and drop or upload the private key you just generated into the Private key file field. Enter the password for your private key in the box below here, unless you used an unencrypted key.
SSH tunneling
Hightouch can connect directly to Snowflake over the public internet or via an SSH tunnel. Since data is encrypted in transit via TLS, a direct connection is suitable for most use cases. You may need to set up a tunnel if your Snowflake instance is on a private network or virtual private cloud (VPC).
Hightouch supports both standard and reverse SSH tunnels. To learn more about SSH tunneling, refer to Hightouch's tunneling documentation.
Sync configuration
Once you've set up your Snowflake Iceberg destination and have a model to pull data from, you can set up your sync configuration to begin syncing data. Go to the Syncs overview page and click the Add sync button to begin. Then, select the relevant model and the Snowflake Iceberg destination you want to sync to.
Table selection
Select the schema and Iceberg table you want to sync to. You can either select an existing Iceberg table or enable Automatically create and update Iceberg table to have Hightouch create a new Iceberg table using your configured external volume. If auto-create is enabled and the table already exists, Hightouch alters its definition to add new columns as needed.
Sync mode
Hightouch supports Upsert mode using the MERGE INTO statement, with the option to delete removed rows, Insert mode using the COPY INTO statement, and Mirror mode which will truncate the table and then insert all rows from the model.
Mirror mode will remove any existing data in the table before inserting all rows from the model
Record matching
To match rows from your model to rows in the Iceberg table, Hightouch requires you to select a unique identifier in the table you are syncing to.
The Snowflake column you select must be set as a UNIQUE, PRIMARY KEY column within Snowflake.
You can see columns that fit this criteria as available options in records matching section.
If there are no fields in the dropdown, you must add a unique type column to your Iceberg table. Then, click the refresh icon to access the newly created column.
In Insert mode, Snowflake automatically generates an identifier for every new record synced, so there is no need to match an existing record.
Column types
Iceberg tables have some data type limitations compared to standard Snowflake tables:
| Supported Types | Notes |
|---|---|
| Text (VARCHAR) | Standard text/string data |
| Number | INTEGER, FLOAT, NUMBER |
| Boolean | TRUE/FALSE values |
| Date | DATE type |
| Timestamp (NTZ) | Timestamp without timezone |
| Timestamp (LTZ) | Timestamp with local timezone (recommended for Iceberg) |
| Time | TIME type |
Semi-structured data types are not supported by Iceberg tables. This includes VARIANT, OBJECT, and ARRAY types. If your source data contains these types, you must transform them before syncing.
TIMESTAMP_LTZ is the recommended timestamp type for Iceberg tables. It internally stores UTC time with a specified precision. All operations are performed in the current session's time zone, controlled by the TIMEZONE session parameter. Note that TIMESTAMP_TZ is not supported by Iceberg tables.
If you see type errors, it may be because your model is producing the wrong format. If so, use typecasting or liquid templating to resolve the issue.
Field mapping
You can sync columns from your model to the columns in your Iceberg table. When auto-create is enabled, you can create new columns by mapping to column names that don't exist yet.
Batch size
You can tune the number of rows Hightouch upserts or inserts per query based on your needs and database threshold. The default is 50,000 rows per batch.
If you want to increase the sync's speed, you can increase the batch size. Keep in mind that in upsert mode, Snowflake fails the entire batch of rows if it detects any erroneous row. If you suspect that you will have many bad rows, don't use a high batch size. To avoid locks, ensure you account for your database's capacity when increasing the batch size.
Delete behavior
The delete behavior you select dictates what to do when a row no longer appears in your model's query results. You have the following options:
| Behavior | Description |
|---|---|
| Do nothing | Keep the row in the Iceberg table |
| Delete row | Remove the row from the Iceberg table entirely |
Tips and troubleshooting
External volume errors
If you see errors related to the external volume during connection testing, verify:
- The external volume exists in your Snowflake account
- Your user has
USAGEprivilege on the external volume - The external volume name is spelled correctly in the destination configuration
Common errors
To date, our customers haven't experienced any errors while using this destination. If you run into any issues, please don't hesitate to . We're here to help.
Live debugger
Hightouch provides complete visibility into the API calls made during each of your sync runs. We recommend reading our article on debugging tips and tricks to learn more.
Sync alerts
Hightouch can alert you of sync issues via Slack, PagerDuty, SMS, or email. For details, please visit our article on alerting.