Reference implementation of proposed JSON-LD 1.2 extensions for AI/ML data exchange, security hardening, and validation.
Companion implementation for: "Extending JSON-LD for Modern AI: Addressing Security, Data Modeling, and Implementation Gaps" — FLAIRS-39 (2026)
jsonld-ex extends the existing JSON-LD ecosystem with backward-compatible extensions that address critical gaps in:
- AI/ML Data Modeling: `@confidence`, `@source`, `@vector` container, provenance tracking, multimodal annotations, calibration & aggregation metadata
- Confidence Algebra: full Subjective Logic framework (Jøsang 2016) with opinions, cumulative/averaging fusion, trust discount, deduction, conflict detection, Byzantine-resistant fusion, temporal decay
- Compliance Algebra: GDPR regulatory uncertainty modeling with jurisdictional meet, compliance propagation, consent assessment, temporal triggers, erasure scope
- Similarity Metrics: extensible registry with 7 built-in + 10 example metrics, plus a metric selection advisory system (compare, analyze, recommend, evaluate)
- Data Protection: GDPR/privacy compliance with W3C DPV v2.2 interop, covering consent lifecycle, data subject rights (Art. 15–20), personal data classification
- Security Hardening: `@integrity` context verification, context allowlists, resource limits
- Validation: `@shape` native validation with nested shapes, conditional constraints (`@if`/`@then`/`@else`), severity levels, shape inheritance (`@extends`)
- Inference: confidence propagation through inference chains, multi-source combination (noisy-OR, Dempster–Shafer)
- Graph Operations: confidence-aware merging, semantic diff, conflict resolution
- Temporal Modeling: `@validFrom`, `@validUntil`, `@asOf` for time-aware assertions
- Dataset Metadata: ML dataset cards with Croissant interop (`to_croissant` / `from_croissant`)
- IoT Transport: CBOR-LD binary serialization, MQTT topic/QoS derivation, SSN/SOSA interop
- Context Versioning: context diff, backward compatibility checking
- MCP Server: 53 tools exposing all library capabilities to LLM agents via the Model Context Protocol
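
To make the Confidence Algebra item more concrete, here is a minimal sketch of cumulative fusion for two binomial Subjective Logic opinions, using the textbook formula. The `BinomialOpinion` class and `cumulative_fusion` function are hypothetical names written for this illustration; they are not the library's `Opinion` or `cumulative_fuse`, whose signatures may differ. The sketch also assumes both opinions share the same base rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinomialOpinion:
    """Hypothetical stand-in for a Subjective Logic opinion (not the library class)."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def __post_init__(self) -> None:
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError("belief + disbelief + uncertainty must equal 1")

    def projected_probability(self) -> float:
        # P = b + a * u: the uncertainty mass is split according to the base rate.
        return self.belief + self.base_rate * self.uncertainty


def cumulative_fusion(x: BinomialOpinion, y: BinomialOpinion) -> BinomialOpinion:
    """Fuse two opinions from independent sources (assumes equal base rates)."""
    kappa = x.uncertainty + y.uncertainty - x.uncertainty * y.uncertainty
    if kappa == 0:
        # Both opinions are dogmatic (u = 0); the limit case is out of scope here.
        raise ValueError("cannot fuse two dogmatic opinions in this sketch")
    belief = (x.belief * y.uncertainty + y.belief * x.uncertainty) / kappa
    disbelief = (x.disbelief * y.uncertainty + y.disbelief * x.uncertainty) / kappa
    uncertainty = (x.uncertainty * y.uncertainty) / kappa
    return BinomialOpinion(belief, disbelief, uncertainty, x.base_rate)


model_a = BinomialOpinion(belief=0.7, disbelief=0.1, uncertainty=0.2)
model_b = BinomialOpinion(belief=0.6, disbelief=0.1, uncertainty=0.3)
fused = cumulative_fusion(model_a, model_b)
print(round(fused.belief, 4), round(fused.uncertainty, 4))
```

Unlike a single scalar confidence, the fused opinion keeps uncertainty as a separate mass, and accumulating independent evidence shrinks it below that of either input.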
jsonld-ex does not replace existing standards — it bridges them:
| Standard | Relationship |
|---|---|
| PROV-O | Bidirectional conversion via to_prov_o / from_prov_o (60–75% fewer triples) |
| SHACL | Bidirectional mapping via shape_to_shacl / shacl_to_shape |
| OWL | Bidirectional: shape_to_owl_restrictions / owl_to_shape |
| RDF-Star | Bidirectional: to_rdf_star_ntriples / from_rdf_star_ntriples, plus Turtle export |
| SSN/SOSA | Bidirectional IoT sensor metadata via to_ssn / from_ssn |
| Croissant | ML dataset metadata via to_croissant / from_croissant |
| DPV v2.2 | Data privacy vocabulary via to_dpv / from_dpv |
| CBOR-LD | Binary serialization with context compression |
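
As a rough illustration of what a PROV-O conversion involves, the sketch below expands one inline-annotated value into separate Entity, Activity, and Agent nodes using standard PROV-O terms. It is hand-written for this README and is not the output of `to_prov_o`; the `expand_to_prov` helper, the generated `@id` scheme, and the `ex:confidence` / `ex:method` terms are assumptions made for the example.

```python
def expand_to_prov(subject_id: str, prop: str, annotated: dict) -> list:
    """Expand one annotated value into PROV-style nodes (illustrative only)."""
    entity_id = f"{subject_id}#{prop}-value"        # hypothetical id scheme
    activity_id = f"{subject_id}#{prop}-activity"   # hypothetical id scheme
    agent_id = annotated["@source"]
    return [
        # The original subject now points at an Entity instead of a literal.
        {"@id": subject_id, prop: {"@id": entity_id}},
        {
            "@id": entity_id,
            "@type": "prov:Entity",
            "prov:value": annotated["@value"],
            "ex:confidence": annotated["@confidence"],  # hypothetical term
            "prov:wasGeneratedBy": {"@id": activity_id},
            "prov:wasAttributedTo": {"@id": agent_id},
        },
        {
            "@id": activity_id,
            "@type": "prov:Activity",
            "ex:method": annotated.get("@method"),      # hypothetical term
            "prov:wasAssociatedWith": {"@id": agent_id},
        },
        {"@id": agent_id, "@type": "prov:SoftwareAgent"},
    ]


inline_value = {
    "@value": "Alice",
    "@confidence": 0.95,
    "@source": "https://model.example.org/v2",
    "@method": "NER",
}
prov_nodes = expand_to_prov("ex:alice", "name", inline_value)
print(len(prov_nodes))  # 4 nodes for what was one annotated value
```

The point of the sketch is the shape difference: one inline object becomes several linked nodes, which is why the inline form needs fewer triples.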
┌───────────────────────────────────────────────────────────────────────────┐
│ MCP Server (53 tools, 5 resources, 4 prompts) │
├───────────────────────────────────────────────────────────────────────────┤
│ jsonld-ex Extensions (v0.6.5) │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Confidence Algebra (Subjective Logic) + Compliance Algebra (GDPR) │ │
│ │ Opinions, fusion, trust discount, deduction, Byzantine-resistant │ │
│ │ Jurisdictional meet, consent, propagation, erasure, triggers │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ AI/ML │ │ Security │ │ Validation │ │ Inference │ │
│ │ @confidence │ │ @integrity │ │ @shape │ │ propagation │ │
│ │ @source │ │ allowlist │ │ @if/@then │ │ combination │ │
│ │ @vector │ │ limits │ │ @extends │ │ conflict res. │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘ │
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Data Protection│ │ Similarity │ │ Dataset / │ │ Context │ │
│ │ GDPR, DPV │ │ 7 built-in │ │ Croissant │ │ versioning │ │
│ │ consent, rights│ │ 10 examples │ │ interop │ │ diff, compat │ │
│ │ erasure, audit │ │ advisory sys. │ │ │ │ │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘ │
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Temporal │ │ Merge / Diff │ │ Interop │ │ IoT Transport │ │
│ │ @validFrom │ │ graphs │ │ PROV-O, SHACL │ │ CBOR-LD, MQTT │ │
│ │ @validUntil │ │ conflict │ │ OWL, RDF-Star │ │ SSN/SOSA │ │
│ │ @asOf │ │ resolution │ │ SSN, Croissant │ │ topic, QoS │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘ │
├───────────────────────────────────────────────────────────────────────────┤
│ PyLD (Core JSON-LD 1.1 Processing) │
├───────────────────────────────────────────────────────────────────────────┤
│ JSON-LD 1.1 Specification │
└───────────────────────────────────────────────────────────────────────────┘
# Core (all features except IoT transport)
pip install jsonld-ex
# With IoT transport (CBOR-LD + MQTT helpers)
pip install "jsonld-ex[iot]"

from jsonld_ex import annotate, get_confidence
doc = {
"@context": "http://schema.org/",
"@type": "Person",
"name": annotate(
"John Smith",
confidence=0.95,
source="https://ml-model.example.org/ner-v2",
extracted_at="2026-01-15T10:30:00Z",
method="NER",
),
}
get_confidence(doc["name"])  # 0.95

from jsonld_ex import propagate_confidence, combine_sources
# Source (0.9 conf) → Rule (0.8 conf) → Conclusion
result = propagate_confidence([0.9, 0.8], method="dampened")
result.score # 0.849 (less aggressive than naive 0.72)
# Two sources independently say the same thing
combined = combine_sources([0.8, 0.7], method="noisy_or")
combined.score  # 0.94

from jsonld_ex import merge_graphs
graph_a = {"@context": "http://schema.org/", "@graph": [
{"@id": "ex:alice", "@type": "Person",
"name": {"@value": "Alice", "@confidence": 0.8, "@source": "model-A"}}
]}
graph_b = {"@context": "http://schema.org/", "@graph": [
{"@id": "ex:alice", "@type": "Person",
"name": {"@value": "Alice", "@confidence": 0.7, "@source": "model-B"}}
]}
merged, report = merge_graphs([graph_a, graph_b])
# Agreement → confidence boosted via noisy-OR: 0.94
# report.properties_agreed == 1, report.properties_conflicted == 0

from jsonld_ex import add_temporal, query_at_time
nodes = [
{"@id": "ex:alice", "jobTitle": add_temporal(
{"@value": "Engineer", "@confidence": 0.9},
valid_from="2020-01-01", valid_until="2023-12-31",
)},
{"@id": "ex:alice", "jobTitle": add_temporal(
{"@value": "Manager", "@confidence": 0.85},
valid_from="2024-01-01",
)},
]
query_at_time(nodes, "2022-06-15") # → Engineer
query_at_time(nodes, "2025-01-01")  # → Manager

from jsonld_ex import to_cbor, from_cbor, payload_stats
doc = {"@context": "http://schema.org/", "@type": "SensorReading",
"value": {"@value": 42.5, "@confidence": 0.9}}
stats = payload_stats(doc)
# stats.cbor_ratio ≈ 0.65 (35% smaller than JSON)
# stats.gzip_cbor_ratio ≈ 0.45 (55% smaller than JSON)
payload = to_cbor(doc) # bytes for wire transmission
restored = from_cbor(payload)  # back to dict

from jsonld_ex import to_prov_o, from_prov_o
doc = {
"@context": "http://schema.org/",
"@type": "Person",
"name": {"@value": "Alice", "@confidence": 0.95,
"@source": "https://model.example.org/v2",
"@method": "NER"},
}
prov_doc, report = to_prov_o(doc)
# Full PROV-O graph with Entity, Activity, Agent nodes
# report.compression_ratio shows jsonld-ex is 3-5x more compact
round_tripped = from_prov_o(prov_doc)
# Back to inline annotations (lossless round-trip)

| Module | Key Exports | Description |
|---|---|---|
| `ai_ml` | `annotate`, `get_confidence`, `get_provenance`, `filter_by_confidence` | Core annotation with 23 provenance fields |
| `confidence_algebra` | `Opinion`, `cumulative_fuse`, `averaging_fuse`, `trust_discount`, `deduce`, `robust_fuse` | Subjective Logic framework (Jøsang 2016) |
| `compliance_algebra` | `ComplianceOpinion`, `jurisdictional_meet`, `compliance_propagation`, `consent_validity`, `erasure_scope_opinion` | GDPR regulatory uncertainty modeling |
| `similarity` | `similarity`, `compare_metrics`, `analyze_vectors`, `recommend_metric`, `evaluate_metrics`, `MetricProperties` | 7 built-in + extensible metrics, advisory system |
| `data_protection` | `annotate_protection`, `create_consent_record`, `is_consent_active`, `filter_by_jurisdiction` | GDPR/privacy compliance metadata |
| `data_rights` | `request_erasure`, `execute_erasure`, `export_portable`, `right_of_access_report` | Data subject rights (GDPR Art. 15–20) |
| `dpv_interop` | `to_dpv`, `from_dpv`, `compare_with_dpv` | W3C Data Privacy Vocabulary v2.2 |
| `validation` | `validate_node`, `validate_document` | `@shape` validation with `@if`/`@then`, `@extends` |
| `security` | `compute_integrity`, `verify_integrity`, `is_context_allowed` | `@integrity` and allowlists |
| `owl_interop` | `to_prov_o`, `from_prov_o`, `shape_to_shacl`, `shacl_to_shape`, `to_ssn`, `from_ssn` | Bidirectional: PROV-O, SHACL, OWL, RDF-Star, SSN/SOSA |
| `dataset` | `create_dataset_metadata`, `to_croissant`, `from_croissant` | ML dataset cards, Croissant interop |
| `inference` | `propagate_confidence`, `combine_sources`, `resolve_conflict` | Confidence propagation and combination |
| `confidence_bridge` | `combine_opinions_from_scalars`, `propagate_opinions_from_scalars` | Scalar-to-opinion bridge |
| `confidence_decay` | `decay_opinion`, `exponential_decay`, `linear_decay`, `step_decay` | Temporal decay of evidence |
| `merge` | `merge_graphs`, `diff_graphs` | Graph merging and diff |
| `temporal` | `add_temporal`, `query_at_time`, `temporal_diff` | Time-aware assertions |
| `vector` | `validate_vector`, `cosine_similarity`, `vector_term_definition` | `@vector` container support |
| `batch` | `annotate_batch`, `validate_batch`, `filter_by_confidence_batch` | Batch operations |
| `context` | `context_diff`, `check_compatibility` | Context versioning and migration |
| `cbor_ld` | `to_cbor`, `from_cbor`, `payload_stats` | Binary serialization (requires `cbor2`) |
| `mqtt` | `to_mqtt_payload`, `from_mqtt_payload`, `derive_mqtt_topic`, `derive_mqtt_qos` | IoT transport (requires `cbor2`) |
| `mcp` | MCP server (53 tools, 5 resources, 4 prompts) | LLM agent integration (requires `mcp`) |
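
The two reference points quoted in the quick-start examples (the naive chain score of 0.72 and the noisy-OR score of 0.94) can be reproduced with a few lines of plain Python. This is a sketch of the underlying arithmetic only: `chain_product` and `noisy_or` are hypothetical helper names, not the `inference` module's API, and the library's `"dampened"` method is not reproduced here.

```python
import math
from typing import Sequence


def _check(confidences: Sequence[float]) -> None:
    # Confidences are treated as probabilities in [0, 1].
    if not confidences:
        raise ValueError("at least one confidence value is required")
    if any(not 0.0 <= c <= 1.0 for c in confidences):
        raise ValueError("confidence values must lie in [0, 1]")


def chain_product(confidences: Sequence[float]) -> float:
    """Naive propagation: every step in the chain must hold, so scores multiply."""
    _check(confidences)
    return math.prod(confidences)


def noisy_or(confidences: Sequence[float]) -> float:
    """Independent agreeing sources: the claim fails only if every source is wrong."""
    _check(confidences)
    return 1.0 - math.prod(1.0 - c for c in confidences)


print(round(chain_product([0.9, 0.8]), 2))  # 0.72
print(round(noisy_or([0.8, 0.7]), 2))       # 0.94
```

Chaining can only lower confidence, while independent agreement can only raise it, which is why the two operations are kept separate.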
Detailed documentation, usage examples, and API reference for each language implementation:
| Package | Path | Status |
|---|---|---|
| Python | packages/python/README.md | ✅ Published on PyPI: 23 modules, 53 MCP tools, 2025+ tests |
| JavaScript/TypeScript | packages/js/README.md | 🚧 Early development (v0.1.0): 4 core modules (ai-ml, security, validation, vector) |
Formal specifications for each extension are in /spec:
- AI/ML Extensions — Confidence, provenance, vector embeddings
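
For readers who want to see the annotated-value shape without installing the package, the following sketch filters a node's properties by the `@confidence` key used in the examples earlier in this README. It works on plain dictionaries; `confidence_of` and `keep_confident` are hypothetical helpers and should not be read as the behavior of the library's `get_confidence` or `filter_by_confidence`.

```python
from typing import Any, Optional


def confidence_of(value: Any) -> Optional[float]:
    """Return the @confidence of an annotated value, or None if it has none."""
    if isinstance(value, dict) and "@confidence" in value:
        return value["@confidence"]
    return None


def keep_confident(node: dict, threshold: float) -> dict:
    """Keep unannotated entries and annotated entries at or above the threshold."""
    kept = {}
    for key, value in node.items():
        score = confidence_of(value)
        if score is None or score >= threshold:
            kept[key] = value
    return kept


node = {
    "@id": "ex:alice",
    "name": {"@value": "Alice", "@confidence": 0.95},
    "jobTitle": {"@value": "Engineer", "@confidence": 0.4},
}
filtered = keep_confident(node, threshold=0.5)
print(sorted(filtered))  # ['@id', 'name']
```

Treating unannotated entries as always kept is a design choice of this sketch; a stricter policy could drop anything without an explicit score.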
See DOCS_PLAN.md for the comprehensive documentation roadmap.
This is a research implementation accompanying an academic publication. Contributions welcome via issues and PRs.
MIT
@inproceedings{jsonld-ex-flairs-2026,
title={Extending JSON-LD for Modern AI: Addressing Security, Data Modeling, and Implementation Gaps},
author={Syed, Muntaser and Silaghi, Marius and Abujar, Sheikh and Alssadi, Rwaida},
booktitle={Proceedings of the 39th International FLAIRS Conference},
year={2026}
}

A follow-up paper targeting NeurIPS 2026 Datasets & Benchmarks is in preparation, covering the formal confidence algebra, comprehensive benchmarks, and extended evaluation.