SQL Server Reference

DataTongs for SQL Server

The full DataTongs reference for SQL Server — source connection, the Tables array, ShouldCast flags, generated MERGE scripts, and the ConfigureDataDelivery pipeline that wires reference data straight into your schema package.

DataTongs — extracting reference data from a live database into version-controlled files

Lookup tables, seed data, the whole library of reference rows every environment needs — they're a schema-management problem as much as the tables themselves.

Overview

DataTongs grips reference data from a live database and produces self-contained synchronization scripts that bring it into any target. Point it at a source, list the tables you care about, and it produces one SQL file per table — ready to drop into a schema package's Table Data folder or run directly against any compatible instance.

DataTongs supports SQL Server, PostgreSQL, and MySQL. On SQL Server, the generated scripts use MERGE with OPENJSON so the source data travels inside the script as an embedded JSON literal — no staging tables, no bulk-load dependencies, no external files at run time. The configuration shape and the workflow are identical across all three platforms.

Installation

GitHub Releases — Download the latest release from the SchemaSmith releases page. Unzip and place the executable on your PATH. DataTongs ships in the same distribution as SchemaTongs and SchemaQuench.

Running DataTongs

From the directory containing your DataTongs.settings.json:

DataTongs

With a custom config file:

DataTongs --ConfigFile:C:\configs\extract-reference-data.json

With a full SQL Server connection string override:

DataTongs --ConnectionString:"Data Source=myserver;Initial Catalog=AppMain;User ID=sa;Password=secret;TrustServerCertificate=True;"

DataTongs connects to the source database, reads every table listed in the Tables array, serializes each row to JSON in deterministic key order, and writes a sync script plus a sibling .tabledata file for each table. Progress is logged to DataTongs - Progress.log and any errors to DataTongs - Errors.log in the working directory.

Configuration reference

DataTongs reads configuration from multiple sources, merged in precedence order (highest priority last):

  1. Settings file — DataTongs.settings.json in the working directory (or the file specified by --ConfigFile)
  2. User secrets (debug builds only)
  3. Environment variables with the SmithySettings_ prefix
  4. Command-line switches (highest precedence)

For the full explanation of configuration loading, environment variable mapping, and CLI switches, see Configuration.
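The layering behaves like a last-wins dictionary merge. A minimal Python sketch of the idea (the layer contents and keys here are illustrative examples, not DataTongs internals):

```python
# Illustrative sketch of last-wins configuration layering.
# Keys and values are examples only, not actual DataTongs internals.
def merge_config(*layers: dict) -> dict:
    """Merge configuration layers; later layers override earlier ones."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)
    return merged

settings_file = {"Source:Server": "db.internal", "Source:Port": "1433"}
environment   = {"Source:Server": "db.staging"}   # SmithySettings_-prefixed vars
cli_switches  = {"Source:Port": "14333"}          # highest precedence

config = merge_config(settings_file, environment, cli_switches)
# Server comes from the environment layer, Port from the CLI switch.
```

The real loader merges nested sections rather than flat keys, but the precedence rule is the same: whatever arrives last wins.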

Complete DataTongs.settings.json

{
  "Source": {
    "Server": "",
    "Port": "",
    "User": "",
    "Password": "",
    "Database": "",
    "Platform": "SqlServer",
    "ConnectionProperties": {
      "TrustServerCertificate": "True"
    }
  },
  "ContentPath": "",
  "ScriptPath": "",
  "Tables": [
    { "Name": "dbo.Country",  "KeyColumns": "CountryCode" },
    { "Name": "dbo.Currency", "KeyColumns": "CurrencyCode", "Filter": "IsActive = 1" }
  ],
  "ShouldCast": {
    "OutputScripts": true,
    "OutputContentFiles": false,
    "DisableTriggers": false,
    "MergeUpdate": true,
    "MergeDelete": false,
    "MergeType": "Insert/Update",
    "ConfigureDataDelivery": false,
    "TokenizeScripts": false
  }
}

Source connection

Key | Type | Default | Description
Source:Platform | string | (required) | Set to SqlServer to select the SQL Server script generator.
Source:Server | string | (required) | Database server hostname or IP.
Source:Port | string | 1433 | TCP port.
Source:User | string | (empty) | Login username. SQL Server allows blank for Windows authentication.
Source:Password | string | (empty) | Login password.
Source:Database | string | (required) | Source database to extract data from.
Source:ConnectionProperties | object | {} | Arbitrary key-value pairs appended to the connection string. Platform-specific keys — see Configuration.

The --ConnectionString switch bypasses all Source settings and passes the provided value directly to the SQL Server driver. Source:Platform is still required so the right script generator claims the connection.

Output paths

Key | Type | Default | Description
ScriptPath | string | "." | Directory where the generated Populate <schema>.<table>.sql merge scripts are written. Created automatically if it doesn't exist.
ContentPath | string | "." | Directory where the sibling .tabledata files (raw JSON row data) are written. When --ConfigureDataDelivery is on, this path is also used to locate the enclosing template — DataTongs walks upward from ContentPath looking for a Template.json.

The typical placement inside a schema package is ScriptPath pointing at Templates/<Name>/Table Data and ContentPath pointing at Templates/<Name>/data. That layout makes the script files visible to SchemaQuench's TableData quench slot and the .tabledata files reachable from each table's DataDelivery block.

Tables array

Field | Type | Default | Description
Name | string | (required) | Table name in schema.table format. If no schema prefix is given, dbo is assumed.
KeyColumns | string | (auto-detected) | Comma-separated column names for the row-matching key. When blank, auto-detected from the table's primary key or best unique index. Prefix a column with * for NULL-safe comparison on nullable keys.
SelectColumns | string | (all columns) | Comma-separated column names to extract. When blank, every non-excluded column is extracted. Use to narrow extraction to a specific subset.
Filter | string | (empty) | SQL WHERE clause (without the WHERE keyword) to filter which rows are extracted. Also applied to the delete clause when MergeDelete is enabled.
MergeType | string | (derived) | Per-table override for the DataDelivery:MergeType default. Values: None, Insert, Insert/Update, Insert/Update/Delete. When blank, derived from global ShouldCast:MergeUpdate + ShouldCast:MergeDelete.

See Table configuration below for key-column auto-detection, nullable-key handling, and filter semantics. The Tables array itself can't be configured via environment variables — table definitions must live in the JSON configuration file.

Table configuration

Table name

Specify tables in schema.table format. If you omit the schema, dbo is assumed:

{ "Name": "dbo.Country" }
{ "Name": "Products" }
{ "Name": "HumanResources.Department" }

DataTongs validates that each table exists in the source database before attempting extraction. Tables that don't exist are skipped with an error message.

Key columns

Key columns define the row-matching predicate — the ON clause of the generated MERGE.

Auto-detection. When KeyColumns is blank, DataTongs queries the table's indexes and selects the primary key if one exists. If there's no primary key, it falls back to the first available unique index. Nullable columns in the detected key are automatically prefixed with * for NULL-safe comparison. This handles the vast majority of tables without any manual configuration.

Manual override. When you specify KeyColumns, DataTongs uses your list instead of auto-detection. Separate multiple columns with commas:

{ "Name": "dbo.OrderLine", "KeyColumns": "OrderID, LineNumber" }

Nullable key columns. If a key column allows NULLs, prefix it with *. This generates NULL-safe matching:

{ "Name": "dbo.Mapping", "KeyColumns": "SourceCode, *TargetCode" }

The generated ON clause becomes:

ON Source.[SourceCode] = Target.[SourceCode]
   AND (Source.[TargetCode] = Target.[TargetCode]
        OR (Source.[TargetCode] IS NULL AND Target.[TargetCode] IS NULL))

When auto-detection discovers a nullable column in a unique index, it automatically applies the * behavior.
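The ON-clause construction described above is simple enough to sketch. This is an illustrative reimplementation in Python, not DataTongs source code:

```python
def build_on_clause(key_columns: str) -> str:
    """Build a MERGE ON clause from a comma-separated key-column list.

    A leading * marks a nullable column and requests NULL-safe matching,
    mirroring the KeyColumns convention described above. Illustrative only.
    """
    predicates = []
    for raw in key_columns.split(","):
        col = raw.strip()
        if col.startswith("*"):
            col = col[1:]
            predicates.append(
                f"(Source.[{col}] = Target.[{col}]"
                f" OR (Source.[{col}] IS NULL AND Target.[{col}] IS NULL))"
            )
        else:
            predicates.append(f"Source.[{col}] = Target.[{col}]")
    return "ON " + "\n   AND ".join(predicates)

print(build_on_clause("SourceCode, *TargetCode"))
```

Fed "SourceCode, *TargetCode", the sketch produces the same shape as the generated clause shown above: a plain equality for SourceCode and a NULL-safe disjunction for TargetCode.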

No key available. If a table has no primary key, no unique index, and no KeyColumns configured, DataTongs skips the table with an error message.

Filter

The Filter field accepts a SQL WHERE clause (without the WHERE keyword). It controls two things:

  1. Which rows are extracted from the source database.
  2. Which target rows are eligible for deletion when MergeDelete is enabled.

{ "Name": "dbo.FeatureFlags", "KeyColumns": "FlagName", "Filter": "IsActive = 1" }

This extracts only active flags and, if MergeDelete is on, only deletes active flags that no longer exist in the source. Inactive flags in the target are untouched.

ShouldCast flags

Control how DataTongs generates merge scripts and whether it wires the results into a schema package.

Flag | Default | Effect
OutputScripts | true | Generate merge scripts. When false, only .tabledata files are written.
OutputContentFiles | false | Write raw data to sibling .tabledata content files. Automatically enabled when ConfigureDataDelivery is on.
DisableTriggers | false | Wrap the generated script with ALTER TABLE ... DISABLE TRIGGER ALL / ENABLE TRIGGER ALL. Use when the target table has audit or notification triggers whose side effects are undesirable during a bulk data refresh.
MergeUpdate | true | Include the WHEN MATCHED update branch. When disabled, existing rows are left untouched regardless of whether their data differs.
MergeDelete | false | Include the WHEN NOT MATCHED BY SOURCE delete branch. When a Filter is configured, the delete clause respects the filter so that rows outside the filter are never removed.
MergeType | Insert/Update | Default DataDelivery:MergeType for tables that don't set it explicitly. Values: None, Insert, Insert/Update, Insert/Update/Delete. Used when writing DataDelivery blocks via --ConfigureDataDelivery.
ConfigureDataDelivery | false | After extraction, write a DataDelivery block into each matching table's JSON file inside the enclosing schema package. See --ConfigureDataDelivery.
TokenizeScripts | false | Replace the source database name with script tokens in the generated merge scripts, matching SchemaTongs' tokenization behavior.

Delivery scenarios

These settings compose into three practical patterns:

Scenario | MergeUpdate | MergeDelete | Effect
Full sync | true | true | Insert missing, update changed, delete removed. Target matches source exactly.
Add and update, no deletes | true | false | Insert missing, update changed, leave extra rows alone. Good when targets carry environment-specific additions.
Seed only | false | false | Insert missing rows only. Existing rows untouched, nothing deleted. Good for seed data without overwriting local customizations.

Generated script anatomy

What DataTongs writes to disk on SQL Server is a single MERGE statement with the source data embedded as a JSON literal and parsed through OPENJSON. The script is self-contained — no staging tables, no external files, no bulk-load dependencies. Run it against any compatible database and the data arrives.

MERGE with OPENJSON

DECLARE @v_json NVARCHAR(MAX) = '[
{"CompanyName":"Speedy Express","Phone":"(503) 555-9831","ShipperID":1},
{"CompanyName":"United Package","Phone":"(503) 555-3199","ShipperID":2},
{"CompanyName":"Federal Shipping","Phone":"(503) 555-9931","ShipperID":3}
]';

ALTER TABLE [dbo].[Shippers] DISABLE TRIGGER ALL;     -- if DisableTriggers
SET IDENTITY_INSERT [dbo].[Shippers] ON;              -- if identity column exists

MERGE INTO [dbo].[Shippers] AS Target
USING (
  SELECT [CompanyName],[Phone],[ShipperID]
    FROM OPENJSON(@v_json)
    WITH (
      [CompanyName] NVARCHAR(40),
      [Phone] NVARCHAR(24),
      [ShipperID] INT
    )
) AS Source
ON Source.[ShipperID] = Target.[ShipperID]

WHEN MATCHED AND (<change detection>) THEN           -- if MergeUpdate
  UPDATE SET
    [CompanyName] = Source.[CompanyName],
    [Phone] = Source.[Phone]

WHEN NOT MATCHED THEN
  INSERT ([CompanyName], [Phone], [ShipperID])
  VALUES (Source.[CompanyName], Source.[Phone], Source.[ShipperID])

WHEN NOT MATCHED BY SOURCE THEN                       -- if MergeDelete
  DELETE
;

SET IDENTITY_INSERT [dbo].[Shippers] OFF;             -- if identity column exists
ALTER TABLE [dbo].[Shippers] ENABLE TRIGGER ALL;      -- if DisableTriggers

The comment markers (-- if DisableTriggers, -- if identity column exists, -- if MergeUpdate, -- if MergeDelete) indicate sections that are conditionally emitted. With all flags on and an identity column on the table, every block lands in the file; flip any of the flags and the relevant block disappears.
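The conditional assembly amounts to string composition driven by the flags. An illustrative Python sketch (not the actual generator, which also handles the merge branches and type-specific casts):

```python
def assemble_script(merge_core: str, *, disable_triggers: bool,
                    has_identity: bool, table: str) -> str:
    """Compose the final script, emitting optional wrapper statements.

    Illustrative only: mirrors how the DisableTriggers flag and identity
    detection add or drop the surrounding blocks shown above.
    """
    parts = []
    if disable_triggers:
        parts.append(f"ALTER TABLE {table} DISABLE TRIGGER ALL;")
    if has_identity:
        parts.append(f"SET IDENTITY_INSERT {table} ON;")
    parts.append(merge_core)
    if has_identity:
        parts.append(f"SET IDENTITY_INSERT {table} OFF;")
    if disable_triggers:
        parts.append(f"ALTER TABLE {table} ENABLE TRIGGER ALL;")
    return "\n".join(parts)
```

With both options off, the output is just the MERGE statement; each enabled option contributes its matched ON/OFF or DISABLE/ENABLE pair around it.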

Identity columns

Identity columns are detected automatically. The script wraps the MERGE with SET IDENTITY_INSERT ... ON / OFF, includes the identity in the INSERT clause so explicit values carry, and excludes it from the UPDATE SET clause so existing identity values are never modified.

Computed columns

Computed columns are auto-excluded from extraction and from every section of the generated script. They can't be assigned, so attempting to insert or update them would fail at run time.

Special type handling

DataTongs detects column types and applies the correct extraction and restoration strategy for each. On SQL Server:

  • Identity columns — detected automatically. The script wraps the merge with SET IDENTITY_INSERT ON / OFF and includes the identity in the INSERT clause but excludes it from UPDATE SET.
  • Computed columns — auto-excluded from extraction and all script sections.
  • Geography — full round-trip via WKT (.ToString()) plus the SRID captured separately.
  • Geometry — extracted as WKT (basic support; restoration may need manual adjustment for complex cases).
  • HierarchyID — canonical string round-trip.
  • XML / NTEXT / TEXT / IMAGE — mapped through the OPENJSON layer with type-appropriate cast for change detection.

Auto-excluded on SQL Server: sql_variant, rowversion / timestamp, and ROWGUIDCOL columns. A warning is logged for each excluded column, so nothing is dropped silently.

When a column is excluded, DataTongs names the table, the column, and the underlying type in the log. If you need data from these columns, handle them with custom scripts outside of DataTongs.

--ConfigureDataDelivery

DataTongs can do more than write scripts — it can wire the extracted data straight into your schema package's declarative delivery pipeline. Pass --ConfigureDataDelivery and DataTongs writes a DataDelivery block into each matching table's JSON file after extracting the data.

DataTongs --ConfigureDataDelivery
DataTongs --ConfigureDataDelivery --TemplatePath:./Templates/Main

The switch can also be enabled via configuration:

{
  "ShouldCast": { "ConfigureDataDelivery": true }
}

What it does

For every table that produces output, DataTongs opens Tables/<schema>.<table>.json inside the template and writes (or overwrites) a DataDelivery block with:

Property | Value
ContentFile | Path to the written .tabledata file, relative to the template root.
MergeType | Either the per-table MergeType from config, or a default derived from ShouldCast:MergeUpdate + ShouldCast:MergeDelete (see below).
MatchColumns | The key columns used during extraction — either the per-table KeyColumns, or the primary key auto-detected from the source database (same logic that drives the generated script).
MergeFilter | The per-table Filter, if one was specified.
MergeDisableTriggers | Mirrors ShouldCast:DisableTriggers.

Tables whose JSON file doesn't exist in the template are skipped with a warning. No file is created from scratch — this tool configures existing tables, it doesn't scaffold new ones (use SchemaTongs for that).

Default MergeType derivation

When a table's config doesn't set MergeType explicitly, the default is derived from the global ShouldCast flags:

MergeUpdate | MergeDelete | Default MergeType
true | true | Insert/Update/Delete
true | false | Insert/Update
false | any | Insert

A per-table MergeType on the Tables[] entry always wins over the derived default.

Template root discovery

--ConfigureDataDelivery needs to know where your template lives. It uses this precedence:

  1. --TemplatePath:<path> CLI switch.
  2. TemplatePath config key.
  3. Auto-discovery: walk upward from ContentPath looking for the nearest directory containing a Template.json.

If nothing is found, DataTongs logs a warning and skips the configuration step — extraction still completes normally. See Schema Packages — DataDelivery for the full property reference and the schema validation that catches misconfigurations at authoring time.

Source query and change detection

How rows are extracted

DataTongs extracts data using a deterministic, JSON-shaped query. On SQL Server, that query uses FOR JSON AUTO. Three principles apply:

  • Read-friendly hints. The extraction query uses WITH (NOLOCK) on SQL Server to avoid blocking production workloads. DataTongs is extracting a snapshot of reference data, not transactional data requiring strict consistency.
  • Deterministic order. Results are ordered by the key columns to produce diff-friendly output. Run DataTongs twice against unchanged data and you get identical files.
  • Empty tables. When the query returns no data, DataTongs skips script generation for that table entirely.
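Deterministic serialization is what keeps the output diff-friendly. The idea can be illustrated with Python's json module (an analogy for the behavior, not the actual serializer):

```python
import json

rows = [
    {"ShipperID": 2, "Phone": "(503) 555-3199", "CompanyName": "United Package"},
    {"ShipperID": 1, "Phone": "(503) 555-9831", "CompanyName": "Speedy Express"},
]

def serialize(rows: list[dict]) -> str:
    """Sort rows by the key column and keys within each row, so repeated
    runs over unchanged data produce byte-identical output."""
    ordered = sorted(rows, key=lambda r: r["ShipperID"])
    return json.dumps(ordered, sort_keys=True, separators=(",", ":"))
```

Because both the row order and the key order are fixed, serializing the same rows in any input order yields the same bytes, and a rerun against unchanged data produces an empty diff.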

Change detection

Nobody wants a script that updates every row just because it can. The update branch of the generated script fires only when the source row actually differs from the target row. DataTongs generates a type-aware comparison for every non-key, non-identity column.

For most columns:

   Target.[Column] <> Source.[Column]
OR (Target.[Column] IS NULL AND Source.[Column] IS NOT NULL)
OR (Target.[Column] IS NOT NULL AND Source.[Column] IS NULL)

This NULL-safe comparison treats two NULLs as equal (no update needed) and a NULL versus a non-NULL as different (update needed).
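The predicate's semantics can be checked against a tiny model where Python's None stands in for SQL NULL (illustrative only):

```python
def needs_update(target, source) -> bool:
    """Model the NULL-safe change-detection semantics described above:
    two NULLs are equal (no update); NULL vs non-NULL differs (update);
    otherwise ordinary value comparison decides."""
    both_null = target is None and source is None
    equal = target is not None and target == source
    return not (equal or both_null)
```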

For special types — geography, XML, large text/binary types — the comparison is wrapped with the appropriate cast or normalization function (e.g., .ToString() for geography on SQL Server, CAST(... AS NVARCHAR(MAX)) for XML and NTEXT).

This approach means running DataTongs twice against unchanged data produces a script that matches every row but updates none — the script becomes a no-op for existing data. Diffs are clean. Reviews are honest.

Last reviewed May 2026 by the SchemaSmith Team.