An agentic knowledge verification system that mathematically scores every claim using Jaseci OSP (nodes/edges/walkers), byLLM, jsonld-ex Subjective Logic, and Tavily web search. Built at Velric Miami Hackathon 2026.
Every major AI system today has the same fatal flaw: it states opinions as facts and guesses with the same confidence as knowledge.
When ChatGPT says "studies show remote work increases productivity by 13%," you have no way to know:
- Is that number from one study or twenty?
- Do other studies contradict it?
- Was the source a peer-reviewed journal or a blog post?
- How much of that answer is evidence vs. how much is the model filling in gaps?
This isn't a minor UX issue. Hallucinations in AI-generated research, due diligence, medical advice, legal analysis, and financial decisions cause real harm. Organizations are making million-dollar decisions based on AI outputs that look authoritative but have no mathematical grounding.
The root cause is simple: traditional AI agents treat confidence as a single number (or worse, don't track it at all). A scalar confidence = 0.5 is meaningless: it could mean "strong evidence that the probability is 50%" or "we have literally no evidence and are guessing." These are fundamentally different situations that require fundamentally different responses.
TrustGraph is an agentic AI system that doesn't just find information; it verifies it, using formal mathematics from Subjective Logic (Jøsang 2016).
Every fact in a TrustGraph report comes with:
- A mathematical opinion tuple (belief, disbelief, uncertainty, base_rate): not a vibe, not a guess, a formally computed score
- A provenance chain: which source said it, when, and how trustworthy that source is
- Conflict detection: where sources disagree, quantified to a precise degree
- Trust-weighted evidence fusion: .gov and .edu sources count more than Reddit posts
You ask: "Is remote work more productive than office work?"
```
            │
            ▼
┌─────────────────────────┐
│ [1] PLAN                │  Agent decomposes your question into
│     (byLLM + Gemini)    │  3-5 specific, verifiable claims
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ [2] SEARCH              │  Searches the web for each claim
│     (Tavily API)        │  using optimized queries
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ [3] EXTRACT             │  LLM reads each source and extracts
│     (byLLM + Gemini)    │  evidence for/against with relevance
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ [4] SCORE               │  Subjective Logic algebra:
│     (jsonld-ex)         │  • Scalar → opinion tuple (b,d,u,a)
│                         │  • Trust discount by source quality
│                         │  • Cumulative fusion across sources
│                         │  • Pairwise conflict detection
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ [5] REPORT              │  Synthesized brief with per-claim
│     (byLLM + JSON-LD)   │  confidence, conflicts, provenance
└─────────────────────────┘
```
Traditional AI confidence is a single number. Subjective Logic uses four numbers, and that makes all the difference.
| Component | Meaning | Why It Matters |
|---|---|---|
| Belief (b) | Evidence FOR the claim | How much evidence supports this |
| Disbelief (d) | Evidence AGAINST the claim | How much evidence contradicts this |
| Uncertainty (u) | ABSENCE of evidence | How much we simply don't know |
| Base Rate (a) | Prior probability | What we'd assume with zero evidence |
Constraint: b + d + u = 1, so your total epistemic state is always fully accounted for.
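Concretely, an opinion tuple and its projected probability P = b + a·u can be sketched in a few lines of Python. This is a hypothetical `Opinion` class for illustration, not the actual jsonld-ex API:

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    """Illustrative Subjective Logic opinion; jsonld-ex's own types may differ."""
    belief: float        # b: evidence FOR the claim
    disbelief: float     # d: evidence AGAINST the claim
    uncertainty: float   # u: ABSENCE of evidence
    base_rate: float = 0.5  # a: prior probability with zero evidence

    def __post_init__(self):
        # The additivity constraint: b + d + u = 1.
        assert abs(self.belief + self.disbelief + self.uncertainty - 1.0) < 1e-9

    def projected_probability(self) -> float:
        # P = b + a*u: uncertainty mass is distributed according to the prior.
        return self.belief + self.base_rate * self.uncertainty

# "We have no idea" and "strong evidence it's 50/50" both project to P=0.5,
# but the opinions are very different:
vacuous = Opinion(0.00, 0.00, 1.00)
dogmatic = Opinion(0.45, 0.45, 0.10)
```

Both opinions project to 0.5, yet `vacuous.uncertainty` is 1.0 while `dogmatic.uncertainty` is 0.10: exactly the distinction a scalar confidence throws away.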
| Scenario | Scalar Confidence | Subjective Logic Opinion |
|---|---|---|
| "Strong evidence it's 50/50" | 0.5 | b=0.45, d=0.45, u=0.10, a=0.5 |
| "We have no idea" | 0.5 | b=0.00, d=0.00, u=1.00, a=0.5 |
| "Sources violently disagree" | 0.5 | b=0.40, d=0.40, u=0.20, a=0.5 |
A traditional agent would treat all three as identical. TrustGraph distinguishes them, and that distinction drives completely different downstream decisions:
- Low uncertainty, balanced belief/disbelief → "The evidence genuinely shows this is a toss-up"
- High uncertainty → "We need more sources before making a call"
- High conflict → "Sources disagree: here's exactly where and by how much"
When multiple sources agree, cumulative fusion mathematically reduces uncertainty:
```
Source 1 alone:  b=0.567, d=0.100, u=0.333  →  P=0.733
Source 2 alone:  b=0.675, d=0.075, u=0.250  →  P=0.800
Fused (1 + 2):   b=0.733, d=0.100, u=0.167  →  P=0.817
                                     ↑ uncertainty dropped by 50%
```
This is exactly how human reasoning works: each independent source that agrees shrinks our uncertainty.
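The math behind those numbers is short. The sketch below implements cumulative fusion of two independent opinions (Jøsang 2016); it shows the underlying formula, not necessarily the signature of jsonld-ex's `fuse_evidence()`:

```python
def fuse(b1, d1, u1, b2, d2, u2):
    """Cumulative fusion of two independent Subjective Logic opinions.
    Assumes u1 and u2 are not both zero (the normalizer k would vanish)."""
    k = u1 + u2 - u1 * u2          # normalizer
    b = (b1 * u2 + b2 * u1) / k    # each belief weighted by the OTHER's uncertainty
    d = (d1 * u2 + d2 * u1) / k
    u = (u1 * u2) / k              # uncertainty can only shrink under agreement
    return b, d, u

# The two sources from the example above:
b, d, u = fuse(0.567, 0.100, 0.333, 0.675, 0.075, 0.250)
print(round(b, 3), round(d, 3), round(u, 3))  # → 0.733 0.1 0.167
```

Note that the fused uncertainty (0.167) is below either source's alone: agreement between independent sources mathematically buys certainty.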
A .gov study and a Reddit comment shouldn't carry equal weight. TrustGraph applies a trust discount: an opinion from an untrusted source gets its belief diluted and its uncertainty inflated:
```
High-trust source (0.9):  b=0.510, d=0.090, u=0.400  →  P=0.710  (verdict: SUPPORTED)
Low-trust source  (0.3):  b=0.045, d=0.105, u=0.850  →  P=0.470  (verdict: CONTESTED)
```
Same raw evidence, but the low-trust source produces a much more uncertain opinion. The system knows it shouldn't rely on that source alone.
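A simplified scalar-trust form of Jøsang's discount operator can be sketched as follows (the full operator takes a trust *opinion* rather than a scalar, and jsonld-ex's `apply_trust_discount()` may differ in detail):

```python
def trust_discount(b, d, u, trust):
    """Discount an opinion by scalar source trust in [0, 1].
    Simplified sketch: belief and disbelief shrink with trust,
    and the lost mass becomes uncertainty."""
    b2 = trust * b
    d2 = trust * d
    u2 = 1.0 - b2 - d2   # additivity b + d + u = 1 is preserved
    return b2, d2, u2

raw = (0.567, 0.100, 0.333)  # a raw extracted opinion (illustrative)

# High-trust source (e.g. .gov, trust=0.9) keeps most of its belief:
hb, hd, hu = trust_discount(*raw, 0.9)   # ≈ (0.510, 0.090, 0.400)

# Low-trust source (trust=0.3) becomes mostly uncertainty:
lb, ld, lu = trust_discount(*raw, 0.3)
assert lu > hu  # the system knows not to rely on the weak source alone
```

With trust 0.9 this reproduces the high-trust row above; lower trust pushes ever more mass into uncertainty, so a weak source can never dominate a verdict on its own.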
When two opinions point in opposite directions, TrustGraph detects and quantifies the conflict:
```
Source A says: "Remote work increases productivity"  (b=0.7, d=0.1)
Source B says: "Remote work decreases productivity"  (b=0.1, d=0.7)

Conflict degree: 0.84 (severe disagreement)
```
This surfaces in the report as a flagged conflict: the user sees exactly where the evidence is split and can investigate further.
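One standard way to quantify conflict (Jøsang 2016) multiplies the distance between projected probabilities by how certain both opinions are, so vacuous opinions can never "conflict." The sketch below uses that measure; jsonld-ex's `detect_conflicts()` may scale it differently, so its numbers need not match the report's conflict degrees:

```python
def conflict_degree(bA, dA, uA, bB, dB, uB, a=0.5):
    """Degree of conflict between two opinions: projected distance
    weighted by conjunctive certainty. Illustrative; the library's
    exact metric may differ."""
    pA = bA + a * uA                             # projected probability of A
    pB = bB + a * uB                             # projected probability of B
    projected_distance = abs(pA - pB)
    conjunctive_certainty = (1 - uA) * (1 - uB)  # vacuous opinions can't conflict
    return projected_distance * conjunctive_certainty

# Two confident, opposed sources conflict strongly...
strong = conflict_degree(0.7, 0.1, 0.2, 0.1, 0.7, 0.2)
# ...while a confident and a fully vacuous opinion don't conflict at all.
none = conflict_degree(0.7, 0.1, 0.2, 0.0, 0.0, 1.0)
assert strong > 0.3 and none == 0.0
```

The key property is monotonicity: conflict grows with both disagreement and mutual confidence, which is what lets the report rank conflicts by severity.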
| Use Case | Without TrustGraph | With TrustGraph |
|---|---|---|
| Research & Due Diligence | "Studies suggest X" (which studies? how many? do they agree?) | "3 sources support X (P=0.82), 1 contradicts (conflict=0.34), uncertainty=0.12" |
| Fact-Checking | "This claim is mostly true" | "Belief=0.73, Disbelief=0.10, Uncertainty=0.17; supported with high confidence from .gov and .edu sources" |
| Medical Research | "Treatment A may be effective" | "4 peer-reviewed sources fuse to P=0.89, but 1 contradicts (conflict=0.41); flag for human review" |
| Legal Analysis | "Precedent suggests..." | Per-claim provenance chain, source trust ratings, formal conflict quantification |
| Business Intelligence | "Market trends indicate..." | Mathematically weighted evidence from multiple sources with uncertainty quantified |
The key insight: TrustGraph doesn't eliminate uncertainty; it makes uncertainty visible and mathematically precise. This lets humans make better decisions because they know exactly what the AI knows, what it doesn't know, and where the evidence disagrees.
- Python 3.12+
- Gemini API Key (free) - Get one here
- Tavily API Key (free, 1000 searches/month) - Get one here
```bash
git clone https://github.com/jemsbhai/trustgraph.git
cd trustgraph
pip install jaseci jsonld-ex streamlit
```

PowerShell (Windows):

```powershell
$env:GEMINI_API_KEY = "your-gemini-api-key"
$env:TAVILY_API_KEY = "tvly-your-tavily-api-key"
```

Bash (Mac/Linux):

```bash
export GEMINI_API_KEY="your-gemini-api-key"
export TAVILY_API_KEY="tvly-your-tavily-api-key"
```

Web UI (recommended, best for demos):

```bash
streamlit run ui/app.py
```

Opens a browser at http://localhost:8501 with an interactive dashboard.

CLI (quick test):

```bash
jac run trustgraph.jac
```

Runs the default query and prints results to the terminal. Edit `_query.txt` to change the question.
```
┌────────────────────────────────────────────────────────────┐
│                      Streamlit Web UI                      │
│   Query Input · Live Agent Log · Confidence Dashboard      │
│   Opinion Bars · Conflict Detection · JSON-LD Export       │
├────────────────────────────────────────────────────────────┤
│              Jaseci / Jac Layer (OSP + byLLM)              │
│                                                            │
│   NODES        EDGES             WALKER                    │
│   Query        Spawns            TrustGraphAgent           │
│   Claim        SupportsEdge      • Plan (decompose)        │
│   Source       ContradictsEdge   • Search (Tavily)         │
│   Evidence     DerivedFrom       • Extract (byLLM)         │
│   ReportNode   HasEvidence       • Score (jsonld-ex)       │
│                HasClaim          • Report (byLLM)          │
├────────────────────────────────────────────────────────────┤
│            jsonld-ex Confidence Algebra Bridge             │
│                                                            │
│   scalar_to_opinion()        fuse_evidence()               │
│   apply_trust_discount()     detect_conflicts()            │
│   opinion_summary()          build_jsonld_claim()          │
├────────────────────────────────────────────────────────────┤
│                       External Tools                       │
│     Tavily Web Search   │   Gemini LLM (via byLLM)         │
└────────────────────────────────────────────────────────────┘
```
```
trustgraph/
├── jac.toml                  # Jaseci project config (Gemini model)
├── trustgraph.jac            # Core agent: OSP graph model + walker + 5 byLLM functions
├── bridge/
│   ├── __init__.py
│   └── confidence.py         # jsonld-ex Subjective Logic integration:
│                             #   scalar_to_opinion, fuse_evidence,
│                             #   apply_trust_discount, detect_conflicts,
│                             #   opinion_summary, build_jsonld_claim
├── tools/
│   ├── __init__.py
│   └── search.py             # Tavily web search tool
├── models/
│   └── graph.jac             # Standalone OSP node/edge test
├── ui/
│   └── app.py                # Streamlit dashboard
├── examples/
│   └── sample_output.jsonld  # Example JSON-LD verification report
├── README.md
└── PLAN.md                   # Original architecture plan
```
This project uses Jaseci extensively, not as a thin wrapper, but as the core runtime for the entire agent.
The knowledge graph is defined using Jac's native node/edge primitives:
Nodes (the objects in our verification graph):

- `Query`: the user's research question
- `Claim`: a specific verifiable statement decomposed from the query
- `Source`: a web source with URL, title, and trust score
- `Evidence`: extracted text from a source, with relevance and confidence
- `ReportNode`: the final synthesized report

Edges (typed relationships between nodes):

- `Spawns`: Query → Claim (decomposition)
- `SupportsEdge` / `ContradictsEdge`: Evidence → Claim (for/against)
- `DerivedFrom`: Evidence → Source (provenance)
- `HasEvidence`: Claim → Evidence (collection)
- `HasClaim`: Report → Claim (aggregation)
TrustGraphAgent is a Jac walker: an autonomous agent that traverses the graph executing the Plan → Search → Extract → Score → Report loop. The walker:
- Creates nodes and edges as it discovers information
- Carries state (`query_text`, `max_search_per_claim`, `report`)
- Orchestrates the full agentic pipeline in a single graph traversal
All LLM calls use Jac's by llm() declaration: no prompt engineering, no API boilerplate:
"""Given a research question, decompose it into 3-5 specific verifiable claims."""
def decompose_query(question: str) -> list[str]
by llm();The five byLLM functions:
decompose_query()β breaks a question into verifiable claimsextract_evidence()β analyzes source text for evidenceassess_claim()β synthesizes an assessment from collected evidencewrite_summary()β generates an executive summaryclaim_to_search_query()β optimizes a claim for web search
Jac natively imports our Python modules:
```jac
import from bridge.confidence { scalar_to_opinion, fuse_evidence, ... }
import from tools.search { web_search }
```

This lets us use the full jsonld-ex library (pure Python) directly from Jac code.
| Criteria | Implementation |
|---|---|
| Goal | Verify claims and produce a mathematically grounded research brief |
| Tools | Web search (Tavily), LLM reasoning (Gemini via byLLM), confidence algebra (jsonld-ex Subjective Logic) |
| Loop | Plan → Search → Extract → Score → Report, executed per claim, with cross-claim conflict detection |
| Guardrails | Source trust heuristics (.gov=0.9, Reddit=0.35), confidence thresholds, structured output parsing with fallbacks, search timeouts |
| Product Surface | Streamlit web UI with live progress streaming, confidence visualization, JSON-LD export |
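The source-trust guardrail can be sketched as a simple domain lookup. Only the .gov=0.9 and Reddit=0.35 values come from the table above; the .edu entry and the default are assumptions for this illustration:

```python
from urllib.parse import urlparse

# Illustrative source-trust heuristic. Only .gov=0.90 and reddit=0.35 are
# stated in the guardrails table; .edu and the default are assumed here.
TRUST_RULES = [
    (".gov", 0.90),
    (".edu", 0.85),        # assumed value
    ("reddit.com", 0.35),
]
DEFAULT_TRUST = 0.5        # assumed: unknown domains start at the base rate

def source_trust(url: str) -> float:
    """Map a source URL to a scalar trust score by domain suffix."""
    host = urlparse(url).netloc.lower()
    for suffix, trust in TRUST_RULES:
        if host.endswith(suffix):
            return trust
    return DEFAULT_TRUST

assert source_trust("https://www.cdc.gov/report") == 0.90
assert source_trust("https://old.reddit.com/r/work") == 0.35
```

The resulting score feeds directly into the trust-discount step, so the heuristic only needs to be roughly right: a mislabeled source is diluted toward uncertainty rather than silently trusted.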
Every verification produces a machine-readable JSON-LD document conforming to Schema.org, jsonld-ex, and PROV-O vocabularies:
```json
{
  "@context": {
    "@vocab": "https://schema.org/",
    "ex": "https://jsonld-ex.org/vocab#",
    "prov": "http://www.w3.org/ns/prov#"
  },
  "@type": "ex:TrustGraphReport",
  "ex:query": "Is remote work more productive?",
  "ex:claims": [
    {
      "@type": "ex:VerifiedClaim",
      "ex:claimText": "Remote workers report higher output...",
      "ex:confidence": {
        "@type": "ex:SubjectiveOpinion",
        "ex:belief": 0.733,
        "ex:disbelief": 0.100,
        "ex:uncertainty": 0.167,
        "ex:baseRate": 0.5,
        "ex:projectedProbability": 0.817
      },
      "prov:wasGeneratedBy": {
        "@type": "prov:Activity",
        "prov:wasAssociatedWith": "TrustGraph Agent"
      }
    }
  ],
  "ex:conflicts": [...],
  "ex:summary": "..."
}
```

This output is interoperable with the entire semantic web ecosystem: SPARQL queries, RDF stores, SHACL validation, OWL reasoning, PROV-O provenance graphs.
| Component | Technology | Role |
|---|---|---|
| Graph Runtime | Jaseci OSP (nodes, edges, walkers) | Knowledge graph modeling + agentic traversal |
| LLM Integration | byLLM (`by llm()`) + Gemini via LiteLLM | Claim decomposition, evidence extraction, synthesis |
| Confidence Scoring | jsonld-ex Subjective Logic (Jøsang 2016) | Opinion tuples, cumulative fusion, trust discount, conflict detection |
| Provenance | jsonld-ex + PROV-O vocabulary | Source tracking, attribution chains |
| Web Search | Tavily API | Real-time web evidence retrieval |
| Web UI | Streamlit | Interactive dashboard with live progress |
By default, TrustGraph lets the LLM decide how many claims to decompose the question into (typically 3-5). You can override this for faster demos or deeper research.
CLI:

```bash
# Quick fact-check (2 claims, ~10 API calls)
jac run trustgraph.jac --claims 2 "Is coffee good for your health?"

# Default (3-5 claims, ~20 API calls)
jac run trustgraph.jac "Is coffee good for your health?"

# Deep research (7 claims, ~35 API calls)
jac run trustgraph.jac --claims 7 "Is coffee good for your health?"
```

Web UI:

Use the Claims slider next to the query input. Set to 0 for auto, or 2-8 for explicit control.
| Claims | API Calls | Best For |
|---|---|---|
| 2-3 | ~15 | Quick fact-checks, live demos |
| 4-5 (default) | ~25 | Balanced research |
| 6-8 | ~35-50 | Deep due diligence, comprehensive reports |
- Jøsang, A. (2016). Subjective Logic: A Formalism for Reasoning Under Uncertainty. Springer.
- jsonld-ex: JSON-LD 1.2 Extensions for AI/ML (PyPI | GitHub)
- Jaseci & Jac: docs.jaseci.org | GitHub
- W3C PROV-O: Provenance Ontology
MIT
Built at the Velric Miami Hackathon 2026 by Fifi and Muntaser for the Agentic AI Track.