FHIR to XTDB Importer

A batch data pipeline that imports healthcare data from Synthea-generated FHIR JSON bundle files into XTDB, a temporal database with PostgreSQL wire protocol compatibility.

Overview

This project demonstrates how to:

  • Parse and import FHIR R4 healthcare records efficiently
  • Store clinical data in XTDB with temporal tracking
  • Leverage XTDB's bitemporal capabilities for healthcare timeline queries

The importer extracts Patients, Encounters, and Conditions from FHIR bundles and maps their temporal fields to XTDB's _valid_from and _valid_to columns, which enables point-in-time queries. All other resources found in a FHIR bundle are stored as JSON in the Patients table.

Features

  • Batch Import: Recursively processes all JSON files in a directory
  • Bitemporal Support: Maps onset/abatement dates for Conditions and period_start/period_end dates for Encounters to XTDB valid-time boundaries
  • Connection Pooling: HikariCP for efficient database connections
  • Structured Logging: SLF4J with Logback for operational visibility
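
The recursive file discovery behind Batch Import can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the class name JsonFileFinder and the method findJsonFiles are hypothetical, and the case-insensitive ".json" match is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of the repository.
final class JsonFileFinder {

    // Walks the directory tree under root and returns every regular file
    // whose name ends in ".json" (case-insensitive), sorted by path.
    static List<Path> findJsonFiles(Path root) throws IOException {
        // Files.walk holds open directory handles, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".json"))
                    .sorted()
                    .toList();
        }
    }
}
```

Sorting the paths keeps the processing order stable between runs, which makes log output easier to compare.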

Prerequisites

  • Java 21 or later
  • Maven 3.9+
  • XTDB running with the PostgreSQL wire protocol enabled (port 5434)

Quick Start

Build

cd xtdb-fhir
mvn clean package -DskipTests

This produces an executable JAR at target/xtdb-fhir-1.0-SNAPSHOT.jar.

Run

java -jar target/xtdb-fhir-1.0-SNAPSHOT.jar <input-directory>

Example:

java -jar target/xtdb-fhir-1.0-SNAPSHOT.jar ./data/fhir

Sample Output

========================================
Import complete in  1250 ms
Files processed:    50
Patients:           50
Encounters:         152
Conditions:         287
All Other Resources:450
Errors:             0
========================================

Configuration

Edit src/main/resources/application.properties:

db.host=localhost
db.port=5434
db.name=xtdb
db.user=xtdb
db.pool.size=5
db.pool.timeout=10000

Property          Default     Description
db.host           localhost   XTDB server hostname
db.port           5434        PostgreSQL wire protocol port
db.name           xtdb        Database name
db.user           xtdb        Connection username
db.pool.size      5           HikariCP connection pool size
db.pool.timeout   10000       Connection timeout in milliseconds
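
As a rough sketch of how these properties might be read, the record below applies the defaults from the table and derives a JDBC URL. The DbSettings name and its methods are hypothetical, the real DatabaseConfig.java may be structured differently, and the jdbc:postgresql:// URL form assumes the standard PostgreSQL JDBC driver is used to reach XTDB's wire protocol port.

```java
import java.util.Properties;

// Hypothetical settings holder for illustration; not the project's DatabaseConfig.
record DbSettings(String host, int port, String name, String user, int poolSize, long poolTimeoutMs) {

    // Reads each key, falling back to the documented default when it is absent.
    static DbSettings from(Properties p) {
        return new DbSettings(
                p.getProperty("db.host", "localhost").trim(),
                Integer.parseInt(p.getProperty("db.port", "5434").trim()),
                p.getProperty("db.name", "xtdb").trim(),
                p.getProperty("db.user", "xtdb").trim(),
                Integer.parseInt(p.getProperty("db.pool.size", "5").trim()),
                Long.parseLong(p.getProperty("db.pool.timeout", "10000").trim()));
    }

    // Assumption: connection goes through the PostgreSQL JDBC driver.
    String jdbcUrl() {
        return "jdbc:postgresql://" + host + ":" + port + "/" + name;
    }
}
```

With an empty Properties object, DbSettings.from(...).jdbcUrl() yields jdbc:postgresql://localhost:5434/xtdb. The pool size and timeout would then be handed to the connection pool configuration.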

Project Structure

xtdb-fhir/
├── src/
│   ├── main/
│   │   ├── java/com/example/
│   │   │   ├── Application.java           # CLI entry point
│   │   │   ├── config/
│   │   │   │   └── DatabaseConfig.java    # Connection & pooling logic
│   │   │   ├── service/
│   │   │   │   └── FHIRImportService.java # Core import logic
│   │   │   └── util/
│   │   │       └── JsonUtil.java          # JSON navigation utilities
│   │   └── resources/
│   │       ├── application.properties     # Database configuration
│   │       └── logback.xml                # Logging configuration
│   └── test/
│       └── java/com/example/              # Unit and integration tests
├── data/fhir/                             # Static sample FHIR JSON files
└── pom.xml                                # Maven build configuration

Architecture

Data Flow

FHIR JSON Files (Synthea)
        ↓
   JsonUtil.parseFile()
        ↓
┌───────────────────────────┐
│ Extract by resourceType:  │
│ • Patients                │
│ • Encounters              │
│ • Conditions              │
│ • Other (grouped by pt)   │
└───────────────────────────┘
        ↓
   Build XTDB Records
        ↓
   Insert via JDBC
        ↓
   XTDB Database
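
The "Extract by resourceType" step can be illustrated roughly as follows. This sketch assumes each bundle entry's resource has already been parsed into a Map (the project itself uses Jackson for parsing), and the ResourceGrouper class is hypothetical. Everything that is not a Patient, Encounter, or Condition is collected under a single "Other" key. Synthea bundles typically describe one patient each, so grouping per bundle approximates grouping per patient.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not the project's FHIRImportService.
final class ResourceGrouper {

    // Resource types that get their own group.
    private static final Set<String> TYPED = Set.of("Patient", "Encounter", "Condition");

    // Groups resources by their "resourceType" field; unknown or missing
    // types are placed under the "Other" key.
    static Map<String, List<Map<String, Object>>> group(List<Map<String, Object>> resources) {
        Map<String, List<Map<String, Object>>> groups = new LinkedHashMap<>();
        for (Map<String, Object> resource : resources) {
            Object type = resource.get("resourceType");
            String key = (type instanceof String s && TYPED.contains(s)) ? s : "Other";
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(resource);
        }
        return groups;
    }
}
```

A LinkedHashMap keeps the groups in first-seen order, so a bundle that starts with its Patient resource yields "Patient" as the first key.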

Temporal Mapping

FHIR Field          XTDB Column   Description
onsetDateTime       _valid_from   When a condition became active
abatementDateTime   _valid_to     When a condition resolved (NULL if the field is absent or the condition is ongoing)
period_start        _valid_from   When the encounter started (for example, a hospital stay or a doctor's appointment)
period_end          _valid_to     When the encounter ended

This enables queries like: "What conditions were active on March 15, 2020?"
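
The mapping and the example question can be sketched in Java as below. The ValidPeriod record is hypothetical and mirrors XTDB's half-open valid-time interval (start inclusive, end exclusive) in memory. The SQL string shows what an equivalent XTDB query might look like; the table name conditions is an assumption, not something taken from the project.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

// Hypothetical model for illustration; validTo == null means "still ongoing".
record ValidPeriod(Instant validFrom, Instant validTo) {

    // Assumed query shape using XTDB's FOR VALID_TIME AS OF clause;
    // the "conditions" table name is a guess.
    static final String ACTIVE_CONDITIONS_SQL =
            "SELECT * FROM conditions FOR VALID_TIME AS OF TIMESTAMP '2020-03-15T00:00:00Z'";

    // Converts FHIR dateTime strings with a UTC offset (as Synthea emits them)
    // into the instants that would populate _valid_from and _valid_to.
    static ValidPeriod fromFhir(String onsetDateTime, String abatementDateTime) {
        Instant from = OffsetDateTime.parse(onsetDateTime).toInstant();
        Instant to = abatementDateTime == null
                ? null
                : OffsetDateTime.parse(abatementDateTime).toInstant();
        return new ValidPeriod(from, to);
    }

    // True when t falls in [validFrom, validTo), matching valid-time semantics.
    boolean activeAt(Instant t) {
        return !t.isBefore(validFrom) && (validTo == null || t.isBefore(validTo));
    }
}
```

For a condition with onset 2020-01-10 and abatement 2020-04-01, activeAt(Instant.parse("2020-03-15T00:00:00Z")) returns true, which is the same answer the AS OF query would give for that row.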

Testing

# Unit tests (no database required)
mvn test

# Integration tests (requires running XTDB)
mvn verify

Test Coverage

  • Unit Tests: JSON parsing, configuration loading
  • Integration Tests: Patient import, temporal queries, multi-resource bundles

Sample Data

The data/fhir/ directory contains 50+ synthetic patient records generated by Synthea. Each file is a FHIR Bundle containing:

  • Patient demographics
  • Encounter history
  • Medical conditions
  • Observations, medications, and other resources

Technologies

Technology      Version   Purpose
Java            21        Runtime with modern features (records, pattern matching)
Maven           3.9+      Build and dependency management
XTDB            2.x       Temporal database with bitemporal queries
Jackson         2.17      JSON parsing
HikariCP        5.1       Connection pooling
JUnit 5         5.10      Testing framework
SLF4J/Logback   2.0/1.5   Logging

License

MIT License - see LICENSE for details.
