SchemaSmith Documentation

Data Delivery - PostgreSQL (Enterprise)

Package and deliver Postgres data updates as controlled artifacts.

Data Delivery

Overview

In the Community edition, data delivery can be achieved via scripts. SchemaQuench supports those scripts and the TableData quench slot and behavior. The Enterprise Edition expands on this capability by offering the ability to export your data to JSON files and configure tables for automated data delivery. This feature uses the JSON files to generate MERGE scripts dynamically, which are applied immediately prior to the TableData scripts.

If you continue using the TableData scripts, we will loop to resolve dependency order. However, this can be expensive, especially with many tables or complex dependency chains. You can optionally assign names to your scripts, so they will sort to control their execution order and reduce or eliminate the need for looping, though this approach will increase the maintenance burden.

Configuration

With data delivery, we traverse the foreign keys to optimize the delivery order. We also use looping to resolve dependencies that cannot be accurately determined up front, such as circular references.

To facilitate this method, you can use DataTongs to capture the necessary JSON files. The DVDRental product in demo repository is configured to use data delivery for both seed data and the full set of test data.

At the table level, you can simplify setup and reduce script management by configuring the following seven properties in your table's JSON file:


    {
        "ContentFile": "Table Contents/public.actor.tabledata",
        "MergeType": "Insert/Update",
        "MergeDisableTriggers": false,
        "MergeDisableRules": false,
        "MergeUpdateDescendents": false,
        "MatchColumns": "",
        "MergeFilter": ""
    }
                                

A complete table JSON with these properties configured can be found at the bottom of this file.

Property Required Default Description
ContentFile Yes Relative path from the template root to the .tabledata JSON file created by DataTongs.
MergeType Yes Controls which MERGE clauses are generated: Insert, Insert/Update, or Insert/Update/Delete.
MergeDisableTriggers No false When true, wraps the merge with ALTER TABLE ... DISABLE TRIGGER ALL / ENABLE TRIGGER ALL.
MergeDisableRules No false When true, wraps the merge with ALTER TABLE ... DISABLE RULE ALL / ENABLE RULE ALL. PostgreSQL-specific.
MergeUpdateDescendents No false Controls the ONLY modifier for table inheritance. When false, uses the ONLY keyword so operations affect only the specified table, not inherited children.
MatchColumns No Auto-detected Comma-separated column names used to match source and target rows. Overrides auto-detection from PK or unique index.
MergeFilter No SQL predicate applied to the DELETE clause only, scoping which rows are eligible for deletion.

Important

  • The ContentFile path is relative to the template for which the table is defined.
  • The MergeFilter scopes the DELETE operation when included.
  • The supported MergeType values include Insert, Insert/Update, and Insert/Update/Delete.

Merge Strategies

Data delivery generates PostgreSQL MERGE statements (available in PostgreSQL 15+) to synchronize content file data with the target table. The MergeType property determines which clauses are included in the generated script.

MergeType Values

MergeType MERGE Clauses Generated Minimum PostgreSQL Version
Insert WHEN NOT MATCHED THEN INSERT 15+
Insert/Update WHEN NOT MATCHED THEN INSERT + WHEN MATCHED THEN UPDATE 15+
Insert/Update/Delete WHEN NOT MATCHED THEN INSERT + WHEN MATCHED THEN UPDATE + WHEN NOT MATCHED BY SOURCE THEN DELETE 17+

Generated MERGE Script Structure

The delivery engine wraps the MERGE statement inside a DO $$ PL/pgSQL block with the content file data supplied as a JSON input parameter. A simplified example:

DO $$
DECLARE
    _source jsonb := '[{"actor_id":1,"first_name":"Penelope","last_name":"Guiness"}]'::jsonb;
BEGIN
    MERGE INTO ONLY public.actor AS target
    USING (
        SELECT
            (elem->>'actor_id')::integer AS actor_id,
            (elem->>'first_name')::character varying AS first_name,
            (elem->>'last_name')::character varying AS last_name
        FROM jsonb_array_elements(_source) AS elem
    ) AS source
    ON target.actor_id = source.actor_id
    WHEN MATCHED THEN UPDATE SET
        first_name = source.first_name,
        last_name = source.last_name
    WHEN NOT MATCHED THEN INSERT (actor_id, first_name, last_name)
        OVERRIDING SYSTEM VALUE
        VALUES (source.actor_id, source.first_name, source.last_name);

    -- Reset sequence to max value after delivery
    PERFORM SETVAL(
        pg_get_serial_sequence('public.actor', 'actor_id'),
        (SELECT MAX(actor_id) FROM public.actor)
    );
END $$;

Additional Merge Options

MergeFilter
Only applies to the DELETE clause (WHEN NOT MATCHED BY SOURCE AND <filter> THEN DELETE). Restricts which rows are eligible for deletion, leaving rows outside the filter untouched. Has no effect when MergeType is Insert or Insert/Update.
MergeDisableTriggers
When set to true, the merge script is wrapped with ALTER TABLE ... DISABLE TRIGGER ALL before execution and ALTER TABLE ... ENABLE TRIGGER ALL after. This prevents triggers from firing during data synchronization.
MergeDisableRules
When set to true, the merge script is wrapped with ALTER TABLE ... DISABLE RULE ALL before execution and ALTER TABLE ... ENABLE RULE ALL after. This is a PostgreSQL-specific option that prevents rules from rewriting DML operations in ways that would interfere with the MERGE statement.
MergeUpdateDescendents
Controls the ONLY modifier for PostgreSQL table inheritance. When false (default), MERGE INTO ONLY schema.table ensures operations affect only the specified table, not child tables that inherit from it. Set to true to allow operations to cascade to inherited children.
Identity/Serial Columns
For tables with identity or serial columns, the generated INSERT clause includes OVERRIDING SYSTEM VALUE to allow explicit values. After delivery, a SETVAL call resets the underlying sequence to MAX(column) so subsequent inserts receive the correct next value.

Delivery Algorithm

Data delivery uses a two-pass algorithm to handle foreign key dependencies, including cases where circular references or nullable foreign keys prevent a single-pass topological sort from succeeding.

Foreign Key Classification

Before delivery begins, every foreign key on every table configured for data delivery is classified:

Classification Condition Handling
Required At least one FK column is NOT NULL The referenced table must be delivered first; enforced by topological sort.
Deferred All FK columns are nullable FK columns are inserted as NULL in Pass 1 and backfilled in Pass 2.
Self-referencing FK references the same table Ignored during ordering; handled naturally by the MERGE statement.
Out-of-set FK references a table not in the delivery set Ignored during ordering; the referenced data is assumed to already exist.

Pass 1: Topological Delivery

Tables are delivered in topological order based on their required dependency edges. The algorithm iterates until all deliverable tables are processed:

  • No deferred columns: Tables whose required dependencies have all been delivered receive the full merge (using the configured MergeType).
  • Has deferred columns: Tables with nullable FK columns pointing to not-yet-delivered tables receive an INSERT-only merge. The deferred FK columns are set to NULL. These tables are queued for Pass 2.
  • Failed deliveries: If a table fails during delivery (e.g., due to a constraint violation from an unresolved dependency), it is retried in the next iteration.

Circular Dependencies

If tables remain after topological iteration completes (all have unmet required dependencies), they form a hard circular dependency. These tables are delivered as-is with full merge scripts. Any FK constraint violations will surface as errors and can be resolved with migration scripts.

Pass 2: Deferred Column Update

After all tables have been inserted, tables that were deferred in Pass 1 receive a second full merge using their configured MergeType. This fills in the actual FK values now that all referenced rows exist in the database.

WhatIf Mode

When WhatIf mode is enabled, the delivery engine logs every step it would execute (table order, merge type, deferred columns, pass assignments) without running any SQL against the database. Use this to verify delivery plans before applying them to production.

Checkpoint and Resume

SchemaQuench tracks "Data Delivery" as a single checkpoint step per template. If delivery fails partway through (e.g., on the fifth of ten tables), the entire data delivery step re-executes on the next run. The merge-based algorithm is designed to be safely re-runnable, so tables that were already delivered are re-merged without data loss or duplication.

Content File Format

Content files use the .tabledata extension and contain a JSON array of row objects. Each object represents one row, with property names matching the column names of the source table.

[
{"actor_id": 1, "first_name": "Penelope", "last_name": "Guiness", "last_update": "2013-05-26T14:47:57.62"},
{"actor_id": 2, "first_name": "Nick", "last_name": "Wahlberg", "last_update": "2013-05-26T14:47:57.62"}
]

How Content Files Are Created

Content files are created by DataTongs, which queries the source database and extracts table data as JSON. DataTongs can also automatically configure the data delivery properties in your table JSON files when the ConfigureDataDelivery setting is enabled.

Column metadata is not stored in the content file. At delivery time, SchemaQuench queries the target database's information_schema to resolve column types, nullability, and identity status. This means the content file is decoupled from the target schema -- columns can be added or removed from the table definition without regenerating the content file, as long as the match columns remain present.

Excluded Columns

Scenario Behavior
Generated columns Excluded from the content file. PostgreSQL computes these values automatically.
Dropped columns Excluded from the content file. Columns that no longer exist on the target table are skipped during extraction.

Special Type Handling

Array Columns
Array values are serialized using ARRAY_TO_STRING with *,* as the delimiter and *NULL_VALUE_REPRESENTATION* for null elements within the array. At delivery time, values are deserialized with STRING_TO_ARRAY using the same conventions. This encoding preserves array elements that contain commas and distinguishes null elements from empty strings.

Match Column Detection

Match columns determine how source rows are paired with target rows in the MERGE ... ON clause. When MatchColumns is not specified, the engine auto-detects them using the following priority:

  1. Primary Key — Used if the table has one.
  2. Unique Index — If no PK exists, the unique index with the fewest columns is selected. Indexes where all columns are NOT NULL are preferred over those with nullable columns.

In the content file header, match columns may be prefixed with an asterisk (*) to indicate that the column is nullable, which affects how the ON clause handles NULL comparisons.

Platform Notes

PostgreSQL data delivery leverages platform-specific features that differ from the SQL Server and MySQL implementations. The following notes highlight version requirements and behaviors unique to PostgreSQL.

Version Requirements

Feature Minimum Version Notes
MERGE statement PostgreSQL 15 Baseline requirement for all data delivery. Supports WHEN MATCHED and WHEN NOT MATCHED clauses.
WHEN NOT MATCHED BY SOURCE PostgreSQL 17 Required only when MergeType is Insert/Update/Delete. Earlier versions support Insert and Insert/Update only.

PostgreSQL-Specific Behaviors

Behavior Details
Identity / Serial Columns OVERRIDING SYSTEM VALUE is added to the INSERT clause for identity columns, allowing explicit values to be inserted. After delivery, SETVAL resets the associated sequence to MAX(column) so the next auto-generated value is correct.
Table Inheritance The ONLY modifier is controlled by the MergeUpdateDescendents property. When false (default), MERGE INTO ONLY schema.table ensures operations apply only to the specified table, not child tables that inherit from it. Set to true to include inherited children.
Rule System The MergeDisableRules property is unique to PostgreSQL. It disables all rules on the target table during delivery, which is important when rules rewrite DML operations in ways that would interfere with the MERGE statement.
Array Type Handling PostgreSQL array columns are serialized with ARRAY_TO_STRING using *,* as the delimiter and *NULL_VALUE_REPRESENTATION* for null elements. During delivery, values are deserialized with STRING_TO_ARRAY and cast to the appropriate array type.

Additional Resources