DataTongs Walkthrough — PostgreSQL
Extract reference data from a live PostgreSQL database, wire it into an existing schema package, and close the SchemaSmith loop — the last leg of the three-tool workflow in 10 minutes.
DataTongs extracts your data, writes it alongside the table JSON, and SchemaQuench delivers it in foreign-key order.
Prerequisites
DataTongs is the third tool in the SchemaSmith cycle — it comes after you've cast a schema with SchemaTongs and deployed it with SchemaQuench. This walkthrough assumes both have been run.
The SchemaTongs walkthrough has been completed
You need the ./my-northwind/ schema package produced by the SchemaTongs walkthrough — that package contains the structure of Northwind's tables, including the three tables this walkthrough wires reference data into. If you haven't run the SchemaTongs walkthrough, start there first.
The SchemaQuench walkthrough has been completed
The PostgreSQL demo from the SchemaQuench walkthrough leaves a Docker PostgreSQL instance running on localhost:5432 with a fully populated northwind database. DataTongs reads reference data out of that live database — if the demo isn't running, restart it from Demos/PostgreSQL/ before continuing.
SchemaSmith tools on your PATH
Same install as the first two walkthroughs. If you haven't added DataTongs to your PATH, do it now: it ships in the same GitHub release as SchemaTongs and SchemaQuench.
Step 1: Configure DataTongs
Pick three small reference tables to practice on. public.categories, public.region, and public.territories are ideal — Categories and Region are stand-alone lookups, and Territories has a foreign key to Region. Wiring all three in the same run means you can watch the declarative delivery resolve the FK graph on the next quench.
Create a DataTongs configuration file called datatongs-extract.json in the same directory where ./my-northwind/ lives:
{
"Source": {
"Platform": "PostgreSQL",
"Server": "localhost",
"Port": "5432",
"User": "testuser",
"Password": "your-password-here",
"Database": "northwind"
},
"ContentPath": "./my-northwind/Templates/Northwind/data",
"ScriptPath": "./my-northwind/Templates/Northwind/Table Data",
"Tables": [
{ "Name": "public.categories" },
{ "Name": "public.region" },
{ "Name": "public.territories" }
],
"ShouldCast": {
"ConfigureDataDelivery": true,
"MergeUpdate": true,
"MergeDelete": false,
"DisableTriggers": true
}
}
A few things to notice about this configuration:
ContentPathandScriptPathlive inside the template. Putting.tabledatafiles underdata/and the generated merge scripts underTable Data/is the layout SchemaQuench expects — theTable Datafolder maps to theTableDataquench slot automatically.ConfigureDataDelivery: truetells DataTongs to walk upward fromContentPath, find the template'sTemplate.json, and write aDataDeliveryblock into each table's JSON file. This is the step that makes the reference data declarative instead of script-only.MergeDelete: falsematches the shipped Northwind demo — the delivery inserts and updates rows but never deletes from the target. If a user added a row topublic.categorieslocally, the next delivery leaves it alone.- No
KeyColumnsoverrides. All three tables have primary keys, so DataTongs auto-detects the match columns.
Step 2: Run DataTongs
From the directory containing datatongs-extract.json:
DataTongs --ConfigFile:datatongs-extract.json
DataTongs connects to the Docker PostgreSQL instance, reads each of the three tables in deterministic key order, serializes the rows to JSON, writes a merge script and a .tabledata file for each table, and — because ConfigureDataDelivery is on — opens each table's JSON inside ./my-northwind/Templates/Northwind/Tables/ and writes a DataDelivery block into it. Progress is logged to DataTongs - Progress.log in the working directory.
Step 3: Inspect What Changed
Open ./my-northwind/ in your editor. Three things are different.
New .tabledata files appeared
Under ./my-northwind/Templates/Northwind/data/ you now have:
data/
public.categories.tabledata
public.region.tabledata
public.territories.tabledata
Each file is raw JSON — the rows DataTongs read from the live database, serialized one record per element. These are the source of truth for reference data from here on; they commit to source control alongside the table structure.
New merge scripts appeared
Under ./my-northwind/Templates/Northwind/Table Data/ you now have:
Table Data/
Populate public.categories.sql
Populate public.region.sql
Populate public.territories.sql
Each is a self-contained MERGE statement with the row data embedded as a jsonb literal parsed through jsonb_to_recordset. You can run these directly against any compatible database (PostgreSQL 15 or higher, for native MERGE) and the reference data arrives. SchemaQuench can also deliver the data declaratively via the DataDelivery blocks below — the scripts are a fallback and a human-readable audit trail.
DataDelivery blocks were written into the table JSONs
Open ./my-northwind/Templates/Northwind/Tables/public.region.json. Before DataTongs ran, this file had structure only — the columns and indexes that SchemaTongs extracted (defaults like ShouldApplyExpression, Storage, and empty ForeignKeys / CheckConstraints arrays omitted here for readability):
{
"Name": "region",
"Columns": [
{ "Name": "region_id", "DataType": "int2" },
{ "Name": "region_description", "DataType": "varchar(60)" }
],
"Indexes": [
{ "Name": "pk_region", "PrimaryKey": true, "Unique": true, "IndexColumns": "region_id" }
]
}
After DataTongs ran with --ConfigureDataDelivery, the same file carries a new DataDelivery block:
{
"Name": "region",
"Columns": [
{ "Name": "region_id", "DataType": "int2" },
{ "Name": "region_description", "DataType": "varchar(60)" }
],
"Indexes": [
{ "Name": "pk_region", "PrimaryKey": true, "Unique": true, "IndexColumns": "region_id" }
],
"DataDelivery": {
"ContentFile": "data/public.region.tabledata",
"MergeType": "Insert/Update",
"MatchColumns": "region_id"
}
}
SchemaTongs gave you the structure. DataTongs added the delivery wiring. See Schema Packages — DataDelivery for the full property reference.
The same block landed on categories and territories
Open public.categories.json and public.territories.json — each has a matching DataDelivery block pointing at its own .tabledata file. Territories' block also carries the relationship: SchemaQuench will see the foreign key from territories.region_id to region.region_id, sort region before territories, and deliver the data in foreign-key order automatically.