DataTongs Walkthrough — PostgreSQL

Extract reference data from a live PostgreSQL database, wire it into an existing schema package, and close the SchemaSmith loop — the last leg of the three-tool workflow — in about 10 minutes.

DataTongs: extracting reference data from a live PostgreSQL database into a schema package

DataTongs extracts your data, writes it alongside the table JSON, and SchemaQuench delivers it in foreign-key order.

Prerequisites

DataTongs is the third tool in the SchemaSmith cycle — it comes after you've cast a schema with SchemaTongs and deployed it with SchemaQuench. This walkthrough assumes both have been run.

The SchemaTongs walkthrough has been completed

You need the ./my-northwind/ schema package produced by the SchemaTongs walkthrough — that package contains the structure of Northwind's tables, including the three tables this walkthrough wires reference data into. If you haven't run the SchemaTongs walkthrough, start there first.

The SchemaQuench walkthrough has been completed

The PostgreSQL demo from the SchemaQuench walkthrough leaves a Docker PostgreSQL instance running on localhost:5432 with a fully populated northwind database. DataTongs reads reference data out of that live database — if the demo isn't running, restart it from Demos/PostgreSQL/ before continuing.

SchemaSmith tools on your PATH

Same install as the first two walkthroughs. If you haven't added DataTongs to your PATH, do it now: it ships in the same GitHub release as SchemaTongs and SchemaQuench.

Step 1: Configure DataTongs

Pick three small reference tables to practice on. public.categories, public.region, and public.territories are ideal — Categories and Region are stand-alone lookups, and Territories has a foreign key to Region. Wiring all three in the same run means you can watch the declarative delivery resolve the FK graph on the next quench.
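If you want to confirm these really are small tables before extracting, a quick look at live row counts works. This is a sketch against PostgreSQL's standard statistics view — run it with any client (psql, for example) connected to the demo's northwind database:

```sql
-- Approximate live row counts per user table, smallest first.
-- pg_stat_user_tables is a standard PostgreSQL statistics view.
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup;
```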

Create a DataTongs configuration file called datatongs-extract.json in the same directory where ./my-northwind/ lives:

{
  "Source": {
    "Platform": "PostgreSQL",
    "Server": "localhost",
    "Port": "5432",
    "User": "testuser",
    "Password": "your-password-here",
    "Database": "northwind"
  },
  "ContentPath": "./my-northwind/Templates/Northwind/data",
  "ScriptPath":  "./my-northwind/Templates/Northwind/Table Data",
  "Tables": [
    { "Name": "public.categories" },
    { "Name": "public.region" },
    { "Name": "public.territories" }
  ],
  "ShouldCast": {
    "ConfigureDataDelivery": true,
    "MergeUpdate": true,
    "MergeDelete": false,
    "DisableTriggers": true
  }
}

A few things to notice about this configuration:

  • ContentPath and ScriptPath live inside the template. Putting .tabledata files under data/ and the generated merge scripts under Table Data/ is the layout SchemaQuench expects — the Table Data folder maps to the TableData quench slot automatically.
  • ConfigureDataDelivery: true tells DataTongs to walk upward from ContentPath, find the template's Template.json, and write a DataDelivery block into each table's JSON file. This is the step that makes the reference data declarative instead of script-only.
  • MergeDelete: false matches the shipped Northwind demo — the delivery inserts and updates rows but never deletes from the target. If a user added a row to public.categories locally, the next delivery leaves it alone.
  • No KeyColumns overrides. All three tables have primary keys, so DataTongs auto-detects the match columns.
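
If you'd like to verify that each table carries a primary key for DataTongs to auto-detect, a catalog query does it. This is a sketch against PostgreSQL's pg_constraint catalog; the constraint names it returns come from the demo restore and may differ elsewhere:

```sql
-- List the primary-key constraint on each of the three reference tables.
SELECT conrelid::regclass            AS table_name,
       conname                       AS pk_name,
       pg_get_constraintdef(oid)     AS definition
FROM pg_constraint
WHERE contype = 'p'
  AND conrelid IN ('public.categories'::regclass,
                   'public.region'::regclass,
                   'public.territories'::regclass);
```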

Step 2: Run DataTongs

From the directory containing datatongs-extract.json:

DataTongs --ConfigFile:datatongs-extract.json

DataTongs connects to the Docker PostgreSQL instance, reads each of the three tables in deterministic key order, serializes the rows to JSON, writes a merge script and a .tabledata file for each table, and — because ConfigureDataDelivery is on — opens each table's JSON inside ./my-northwind/Templates/Northwind/Tables/ and writes a DataDelivery block into it. Progress is logged to DataTongs - Progress.log in the working directory.

Step 3: Inspect What Changed

Open ./my-northwind/ in your editor. Three things are different.

New .tabledata files appeared

Under ./my-northwind/Templates/Northwind/data/ you now have:

data/
  public.categories.tabledata
  public.region.tabledata
  public.territories.tabledata

Each file is raw JSON — the rows DataTongs read from the live database, serialized one record per element. These are the source of truth for reference data from here on; they commit to source control alongside the table structure.

New merge scripts appeared

Under ./my-northwind/Templates/Northwind/Table Data/ you now have:

Table Data/
  Populate public.categories.sql
  Populate public.region.sql
  Populate public.territories.sql

Each is a self-contained MERGE statement with the row data embedded as a jsonb literal parsed through jsonb_to_recordset. You can run these directly against any compatible database (PostgreSQL 15 or higher, for native MERGE) and the reference data arrives. SchemaQuench can also deliver the data declaratively via the DataDelivery blocks below — the scripts are a fallback and a human-readable audit trail.
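The exact script DataTongs emits may differ, but the shape is roughly this — an illustrative one-row sketch for public.region, not the generated file, using standard PostgreSQL 15+ MERGE syntax:

```sql
-- Illustrative shape of a generated merge script; the real file embeds
-- every extracted row in the jsonb literal.
MERGE INTO public.region AS target
USING (
  SELECT * FROM jsonb_to_recordset(
    '[{"region_id": 1, "region_description": "Eastern"}]'::jsonb
  ) AS r(region_id int2, region_description varchar(60))
) AS source
ON target.region_id = source.region_id
WHEN MATCHED THEN
  UPDATE SET region_description = source.region_description
WHEN NOT MATCHED THEN
  INSERT (region_id, region_description)
  VALUES (source.region_id, source.region_description);
-- MergeDelete: false in the config, so there is no DELETE branch:
-- rows added locally to the target are left alone.
```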

DataDelivery blocks were written into the table JSONs

Open ./my-northwind/Templates/Northwind/Tables/public.region.json. Before DataTongs ran, this file had structure only — the columns and indexes that SchemaTongs extracted (defaults like ShouldApplyExpression, Storage, and empty ForeignKeys / CheckConstraints arrays omitted here for readability):

{
  "Name": "region",
  "Columns": [
    { "Name": "region_id",          "DataType": "int2" },
    { "Name": "region_description", "DataType": "varchar(60)" }
  ],
  "Indexes": [
    { "Name": "pk_region", "PrimaryKey": true, "Unique": true, "IndexColumns": "region_id" }
  ]
}

After DataTongs ran with ConfigureDataDelivery enabled, the same file carries a new DataDelivery block:

{
  "Name": "region",
  "Columns": [
    { "Name": "region_id",          "DataType": "int2" },
    { "Name": "region_description", "DataType": "varchar(60)" }
  ],
  "Indexes": [
    { "Name": "pk_region", "PrimaryKey": true, "Unique": true, "IndexColumns": "region_id" }
  ],
  "DataDelivery": {
    "ContentFile": "data/public.region.tabledata",
    "MergeType": "Insert/Update",
    "MatchColumns": "region_id"
  }
}

SchemaTongs gave you the structure. DataTongs added the delivery wiring. See Schema Packages — DataDelivery for the full property reference.

The same block landed on categories and territories

Open public.categories.json and public.territories.json — each has a matching DataDelivery block pointing at its own .tabledata file. Territories' block also carries the relationship: SchemaQuench will see the foreign key from territories.region_id to region.region_id, sort region before territories, and deliver the data in foreign-key order automatically.
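
The foreign key that drives that ordering is visible in the live database too. A quick sketch against PostgreSQL's pg_constraint catalog (constraint names are from the demo restore):

```sql
-- Show foreign keys on territories; SchemaQuench delivers region first
-- because territories references it.
SELECT conname,
       conrelid::regclass  AS child_table,
       confrelid::regclass AS parent_table
FROM pg_constraint
WHERE contype = 'f'
  AND conrelid = 'public.territories'::regclass;
```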

The DataTongs Cycle

You just ran the DataTongs leg of the SchemaSmith workflow:

  1. Configure. You wrote a small datatongs-extract.json pointing DataTongs at three reference tables, with ConfigureDataDelivery: true to wire the results into the schema package.
  2. Cast. DataTongs connected to PostgreSQL, read each table in deterministic key order, and wrote .tabledata files, merge scripts, and DataDelivery blocks — all in one pass.
  3. Inspect. You opened the package and saw the three tables carry delivery wiring they didn't have before. Reference data is now part of the schema package — declarative, diff-friendly, review-ready.

No hand-rolled merge scripts. No manual FK load-order sorting. No staging tables. You pointed DataTongs at a live database and got a schema package with reference data wired in. Every time.

Last reviewed May 2026 by the SchemaSmith Team.

What's Next

You've closed the loop. SchemaTongs casts structure, DataTongs casts reference data, SchemaQuench delivers both. The three tables you just wired are now declaratively managed — their reference data ships alongside their structure. Any time you quench a package with DataDelivery blocks, SchemaQuench discovers them, sorts the tables in foreign-key order, and merges the data in two passes so the FK graph resolves automatically. You saw that mechanism at work when you ran the SchemaQuench walkthrough against ./demo/Northwind/ — the shipped demo has DataDelivery blocks on every populated table for the same reason.

Re-run DataTongs against your source of truth whenever reference data changes. Commit the updated .tabledata files and table JSONs. Re-run SchemaQuench to deliver. That's the declarative data loop.
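
After a quench, a quick spot-check against the target database confirms the delivered rows — a minimal sketch; compare the counts to your .tabledata files:

```sql
-- Row counts for the three declaratively delivered tables.
SELECT 'public.categories'  AS tbl, count(*) FROM public.categories
UNION ALL
SELECT 'public.region',            count(*) FROM public.region
UNION ALL
SELECT 'public.territories',       count(*) FROM public.territories;
```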

Continuity note. This walkthrough extended ./my-northwind/ — the clean extraction sandbox from the SchemaTongs walkthrough — with delivery wiring for three small reference tables. The shipped ./demo/Northwind/ package stays untouched as your always-working demo; it already has DataDelivery blocks on every populated table, which is why the SchemaQuench walkthrough's quench delivered data without any extra setup.