The open-source Claude plugin for data architects.
Community-built skills that turn Claude into a senior data architect β for modeling, platforms, cloud, AI, and modernization.
π Quick Start Β· π Browse Skills Β· βοΈ Contribute a Skill Β· πΊοΈ Roadmap
data-architecture is a living, community-built Claude plugin that gives Claude the skills of a senior data architect.
Think of it like brain modules for your AI β each skill you install teaches Claude how to:
- Design data models using Data Vault 2.0, Star Schema, 3NF, and AUDM
- Architect modern platforms: Data Mesh, Data Fabric, Lakehouse, Lambda
- Evaluate and choose cloud technologies: Snowflake, Databricks, Azure Synapse
- Build supply chain analytics from real KPI catalogs
- Design AI/ML feature stores and RAG architectures
- Execute data modernization and migration playbooks
Built at Data Architect School (Accenture, 2024) Β· 5-day curriculum Β· community-driven Β· MIT licensed
# 1. Clone
git clone https://github.com/wjlgatech/data-architecture.git
# 2. Pick a skill and read its instructions
cat skills/day1-modeling/SKILL.md
# 3. Paste into Claude's system prompt or use as a Project instructionOnce installed, Claude responds to built-in commands:
/design-model I need to model a pharmaceutical supply chain with 30 KPIs
/choose-architecture We have 5 source systems, daily batch + real-time events
/kpi-catalog Order Fulfillment domain, OTIF and Perfect Order needed
/audit-vault Review my Hub-Link-Satellite design for DV 2.0 compliance
| Day | Module | Commands | Status |
|---|---|---|---|
| 0οΈβ£ | Skill Orchestrator | discover-client, assess-maturity, orchestrate-engagement, translate-for-stakeholder, estimate-effort |
β Active |
| 1οΈβ£ | Intro to Data Architecture & Modeling | design-model, choose-architecture, kpi-catalog, audit-vault, dimension-map |
β Active |
| 2οΈβ£ | Data Management | design-mdm, check-data-quality, governance-check, lifecycle-plan, security-review |
β Active |
| 3οΈβ£ | Cloud Data & Technology | design-cloud-platform, design-data-platform, design-ingestion-pipeline, design-api-layer, multi-region-plan |
β Active |
| 4οΈβ£ | Data Intelligence, Analytics & AI | analyze-big-data, design-nlp-pipeline, build-mlops-pipeline, design-realtime-intelligence, responsible-ai-review |
β Active |
| 5οΈβ£ | Data Strategy & GenAI | design-genai-architecture, data-strategy-alignment, build-data-product, modernization-roadmap, operating-model-design |
β Active |
6 skills Β· 30 commands Β· full 5-day curriculum complete. PRs welcome to extend any module.
data-architecture/
βββ skills/ # π§ Claude skills (one folder = one skill module)
β βββ skill-orchestrator/ # Meta-skill: client intake, maturity, engagement orchestration
β βββ day1-modeling/ # Data modeling: Vault, Star, 3NF, AUDM
β β βββ SKILL.md # Main Claude instructions (paste into system prompt)
β β βββ metadata.json # Skill metadata, version, tags
β β βββ commands/ # Slash command definitions
β β βββ references/ # Deep reference material
β βββ day2-data-management/ # MDM, Data Quality, Governance, Lifecycle, Security
β βββ day3-cloud-data/ # Cloud platforms, Lakehouse, FHIR, multi-region
β βββ day4-analytics/ # Big data, clinical NLP, MLOps, real-time, responsible AI
β βββ day5-strategy/ # GenAI/RAG, data products, modernization, operating model
β βββ index.json # Machine-readable skill registry
β
βββ knowledge-base/ # π Cross-skill shared domain knowledge
β βββ healthcare-standards.md # HL7 FHIR, ICD-10, LOINC, SNOMED
β βββ cloud-platform-patterns.md
β βββ analytics-patterns.md
β βββ genai-data-patterns.md
β
βββ schemas/ # π JSON schemas for CI validation
βββ templates/ # π§© Copy-paste starters for new skills
βββ examples/ # π Real case studies (interactive HTML)
β βββ newlife-pharmacy/ # Pharma supply chain β Day 1
β βββ newlife-hospital/ # Healthcare HIS β Days 2β5
βββ docs/ # π Architecture decisions, specs
βββ tests/ # β
Validation scripts (run by CI)
βββ scripts/ # π οΈ CLI tooling
βββ .github/ # βοΈ Workflows, issue/PR templates
We merge PRs every day. If your skill passes CI, it gets merged.
# 1. Fork + clone
git clone https://github.com/YOUR_USERNAME/data-architecture.git
# 2. Create a branch
git checkout -b skill/your-skill-name
# 3. Copy the template
cp -r templates/skill-template skills/your-skill-name
# 4. Fill in SKILL.md and metadata.json
# 5. Validate locally
npm run validate
# 6. Open a PR β we'll review and merge same dayβ Full guide: CONTRIBUTING.md
β Easy wins: good first issue
![]() Paul Wu ποΈ Founder |
Proof of expertise, not slides. Each solution below is a fully-interactive artifact built end-to-end from a real case study. Click to explore.
NewLife Pharmacy β D2P supply chain, 30 KPIs, Data Vault 2.0 vs. Star Schema decision
One-line verdict: Chose Data Vault 2.0 over Star Schema because multi-vendor invoice discrepancies require storing multiple source "truths" simultaneously β something a Star Schema can't do without picking a winner at ETL time.
| Dimension | Decision |
|---|---|
| Architecture | Data Vault 2.0 β 9 Hubs, 6 Links, 6+ Satellites |
| Analytics Layer | Star Schema Information Marts on top of Business Vault |
| KPIs catalogued | 30 KPIs across Inventory, Order Fulfillment, Transportation, Returns, Warehousing |
| External enrichment | FDA Drug Shortages Β· Weather API Β· FreightWaves Β· IQVIA Β· EPA Emissions |
| Key insight | DV stores supplier A and supplier B versions in separate Satellites; golden record resolved in Business Vault β never at load time |
βΆ Open Interactive Solution β
NewLife Hospital β 300 hospitals, 90+ countries, 200M+ patients, HIPAA + GDPR
One-line verdict: Federated Hub-and-Spoke MDM β the only pattern that gives a global patient identity and data residency compliance simultaneously. Pure centralised violates GDPR. Pure decentralised makes "unified" impossible.
| Dimension | Decision |
|---|---|
| MDM Architecture | Federated Hub-and-Spoke β Global Hub (de-ID MPI + reference) + Regional Nodes (full PHI per jurisdiction) |
| MDM Domains | Party (Patient MPI, Physician) Β· Places (Facilities) Β· Things (Drugs) Β· Reference (ICD-10, LOINC, SNOMED) |
| Regulatory | GDPR Β· HIPAA Β· PIPL Β· DPDP Β· PDPA β attribute-level consent, data residency routing, right-to-erasure workflow |
| Data Quality | Profile β Rules β Cleanse β Monitor; β₯99% patient completeness; 100% drug code accuracy |
| Governance | EGC β DGC β Domain Owners β Stewards Β· RBAC + ABAC Β· Break-glass emergency access with full audit |
| Lifecycle | Hot/Warm/Cold/Archive metadata-driven policy engine β 7 lifecycle stages automated |
| Security | Zero Trust + field-level AES-256 + DevSecOps (Build β Test β Deploy β Operate) |
βΆ Open Interactive Solution β
NewLife Hospital β Multi-region healthcare data platform, FHIR R4 API, Medallion Lakehouse, 90+ countries
One-line verdict: Azure Medallion Lakehouse (Bronze/Silver/Gold) on Delta Lake β the only pattern that handles FHIR R4 streaming ingestion, multi-jurisdictional data residency, and clinical AI feature serving from a single coherent architecture.
| Dimension | Decision |
|---|---|
| Platform | Azure β ADF, Event Hub, Databricks, Delta Lake, Synapse, ADLS Gen2 |
| Architecture | Medallion Lakehouse β Bronze (raw FHIR) β Silver (cleaned) β Gold (marts) |
| APIs | FHIR R4 with SMART on FHIR OAuth 2.0, geo-load balancing, 99.9% SLA |
| Multi-Region | Hub-and-spoke β 5 regional nodes, data residency enforcement per GDPR/PIPL/PDPA |
| Security | Zero Trust, Private Endpoints, Azure Purview RBAC, field-level encryption |
| Clinical AI | Predictive sepsis, NLP discharge summaries, imaging triage β all within Medallion Gold |
βΆ Open Interactive Solution β
NewLife Hospital β Clinical NLP, Medical Imaging AI, Real-time Sepsis Alerting, MLOps, $2Mβ$6M Year-1 ROI
One-line verdict: Lambda architecture for batch + streaming analytics, with a unified MLOps platform (MLflow + Databricks) that governs clinical models from FDA SaMD Class II compliance to bedside alerting in under 60 seconds.
| Dimension | Decision |
|---|---|
| Big Data | Lambda architecture β Spark batch (Databricks) + Kafka/Event Hub streaming |
| Clinical NLP | spaCy + Med7 + BERT-clinical pipeline: 92%+ F1 on entity extraction |
| Imaging AI | CNN + ViT ensemble, 3-stage review workflow, FDA SaMD Class II governance |
| Real-time | NEWS2 sepsis score β Kafka β Feature Store β model inference β alert in <60s |
| MLOps | MLflow + AzureML: Experiment β Train β Validate β Deploy β Monitor β Retrain |
| Responsible AI | Bias audit, GDPR Art. 22 human-in-loop, FDA SaMD classification, explainability |
| ROI | Year 1: $2M invest β $6M return Β· Year 2: $4M β $16M Β· Year 3: $8M β $40M |
βΆ Open Interactive Solution β
NewLife Hospital β RAG pipeline, Data Products, $127M NPV business case, 5-year operating model
One-line verdict: A GenAI Clinical Intelligence Platform built on Retrieval-Augmented Generation, with PHI de-identification gate, vector store serving 200M+ patient records, and a federated data product marketplace β all governed by a CDO-led operating model with a measurable $127M NPV over 5 years.
| Dimension | Decision |
|---|---|
| GenAI Architecture | RAG pipeline β PHI De-ID β Chunking β Embedding β Vector Store β LLM β Audit |
| Vector Store | Azure AI Search (hybrid dense + sparse) β HIPAA-compliant, 200M+ patient records |
| Data Products | Federated marketplace β 12 certified products across Clinical, Ops, Finance, Research |
| Modernization | Legacy EHR β Cloud: Assess (3I) β Lift-and-Shift β Re-platform β Re-architect |
| Operating Model | CDO β Data Domains β Product Owners β Engineers Β· Hub-and-Spoke federated |
| Business Case | $127M NPV, 287% ROI, 18-month payback β board-ready financial model |
βΆ Open Interactive Solution β
| Case Study | Domain | Days | Link |
|---|---|---|---|
| NewLife Pharmacy Supply Chain | Pharmaceutical D2P | Day 1 | View β |
| NewLife Hospital β Data Management | Healthcare MDM + Governance | Day 2 | View β |
| NewLife Hospital β Cloud Platform | Healthcare Lakehouse + FHIR | Day 3 | View β |
| NewLife Hospital β Analytics & AI | Clinical NLP, MLOps, Sepsis AI | Day 4 | View β |
| NewLife Hospital β Strategy & GenAI | RAG, Data Products, $127M NPV | Day 5 | View β |
MIT Β© 2024 wjlgatech and contributors.
