This project demonstrates a practical pattern for handling satellite payload changes over time and merging versioned satellites back into a stable canonical model.
It uses DuckDB and automate_dv to build a small Data Vault from generated healthcare seed data, then shows how to:
- version a satellite when the payload column list changes
- keep versioned raw satellites as internal models
- expose a canonical Business Vault satellite with the original unversioned model name
- preserve downstream dbt refs while applying v2-over-v1 precedence logic
- Seed data: patients, encounters
- Raw staging models: stg_raw_patients, stg_raw_encounters
- automate_dv stage models: stg_dv_patients, stg_dv_encounters
- Vault models:
  - Hubs: hub_patient, hub_encounter
  - Link: link_encounter_patient
  - Raw Satellites: sat_patient_details_v1, sat_patient_details_v2, sat_encounter_details
  - Business Vault canonical satellite: sat_patient_details (merged from v1/v2)
- Marts: dim_patients, fact_encounters
The canonical Business Vault model sat_patient_details is built by merging sat_patient_details_v1 and sat_patient_details_v2 with explicit precedence rules:
- Join keys: patient_hk, load_datetime, effective_from.
- If both versions have the same key tuple, v2 is authoritative.
- Shared descriptive columns use coalesce(v2_col, v1_col) so v2 values win when present.
- v1 is the fallback when no matching v2 row exists.
- v2-only columns (for example sourced_market) are populated only when present in v2; they are null for v1-only rows.
This keeps downstream refs stable (ref('sat_patient_details')) while allowing raw satellite versions to evolve internally.
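The precedence rules above can be illustrated with a small runnable sketch. It is not the project's model: it uses Python's built-in sqlite3 as a stand-in for DuckDB so it runs without extra dependencies, the `city` column and all sample rows are hypothetical, and the join shape (v2 left-joined to v1, plus unmatched v1 rows) is just one way to express the rules. In the project itself the same SELECT would presumably live in the sat_patient_details dbt model and read from the versioned satellites via ref().

```python
import sqlite3

# In-memory SQLite stands in for DuckDB. Only the join keys and
# sourced_market come from the README; `city` and the rows are made up.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sat_patient_details_v1 (
    patient_hk TEXT, load_datetime TEXT, effective_from TEXT, city TEXT
);
CREATE TABLE sat_patient_details_v2 (
    patient_hk TEXT, load_datetime TEXT, effective_from TEXT, city TEXT,
    sourced_market TEXT
);
INSERT INTO sat_patient_details_v1 VALUES
    ('hk1', '2024-01-01', '2024-01-01', 'Leeds'),
    ('hk2', '2024-01-01', '2024-01-01', 'York'),
    ('hk3', '2024-01-01', '2024-01-01', 'Bath');
INSERT INTO sat_patient_details_v2 VALUES
    ('hk2', '2024-01-01', '2024-01-01', NULL,    'UK'),
    ('hk3', '2024-01-01', '2024-01-01', 'Derby', 'UK'),
    ('hk4', '2024-02-01', '2024-02-01', 'Hull',  'IE');
""")

MERGE_SQL = """
-- Every v2 row; shared columns fall back to the matching v1 row if v2 is NULL.
SELECT v2.patient_hk, v2.load_datetime, v2.effective_from,
       COALESCE(v2.city, v1.city) AS city,
       v2.sourced_market
FROM sat_patient_details_v2 AS v2
LEFT JOIN sat_patient_details_v1 AS v1
  ON  v1.patient_hk     = v2.patient_hk
  AND v1.load_datetime  = v2.load_datetime
  AND v1.effective_from = v2.effective_from
UNION ALL
-- v1 rows with no matching v2 key tuple; v2-only columns stay NULL.
SELECT v1.patient_hk, v1.load_datetime, v1.effective_from,
       v1.city, NULL AS sourced_market
FROM sat_patient_details_v1 AS v1
WHERE NOT EXISTS (
    SELECT 1
    FROM sat_patient_details_v2 AS v2
    WHERE v2.patient_hk     = v1.patient_hk
      AND v2.load_datetime  = v1.load_datetime
      AND v2.effective_from = v1.effective_from
)
ORDER BY 1
"""

merged = con.execute(MERGE_SQL).fetchall()
for row in merged:
    print(row)
```

In this sample, hk1 exists only in v1 and therefore has a null sourced_market, hk2 takes its city from v1 because the v2 value is null, hk3 takes the v2 city, and hk4 exists only in v2.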
Create %USERPROFILE%/.dbt/profiles.yml:
    adv_merge:
      target: dev
      outputs:
        dev:
          type: duckdb
          path: adv_merge.duckdb
          threads: 4

Then install dependencies, load the seeds, and run the models:

    dbt deps
    dbt seed
    dbt run

Or build all with tests:

    dbt build