Open
Labels
- design-needed: Needs architectural discussion before implementation
- refactor: Internal restructuring, no behavior change
Description
Summary
With the unified hook model (#66), transformations that produce derived data now run as OCI hooks before record minting. Vector embedding for search is currently implemented as a post-publication index step (VectorIndexHandler consuming IndexRecord events), but it should be a hook so that:
- Embeddings are computed before the record is minted (same as all other derived data)
- Embedding vectors are stored in feature tables like other hook outputs
- The search index reads from feature tables rather than computing embeddings on the fly
- The entire IndexRecord → VectorIndexHandler/KeywordIndexHandler fan-out machinery can be removed (tracked in refactor: event system overhaul — consumer groups, decoupled domains, simplified pipeline #68)
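As a rough illustration of the third point, the sketch below ranks records by cosine similarity using only vectors that were stored ahead of time. It is a sketch under stated assumptions, not the project's implementation: the `embedding_features` table, its columns, the JSON-encoded vector, and the `search` helper are all hypothetical, and SQLite stands in for whatever store actually backs the feature tables.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(conn, query_vector, k=3):
    """Rank stored records against query_vector; no record embedding is computed here."""
    rows = conn.execute("SELECT record_id, vector FROM embedding_features").fetchall()
    scored = [(record_id, cosine(query_vector, json.loads(vec))) for record_id, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical feature table, populated as if the embedding hook had already run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embedding_features (record_id TEXT PRIMARY KEY, vector TEXT)")
conn.executemany(
    "INSERT INTO embedding_features VALUES (?, ?)",
    [
        ("rec-a", json.dumps([1.0, 0.0])),
        ("rec-b", json.dumps([0.0, 1.0])),
        ("rec-c", json.dumps([0.7, 0.7])),
    ],
)

results = search(conn, [1.0, 0.1], k=2)
```

A full table scan like this only makes sense for small tables; a materialized view or a dedicated search index built from the feature table would replace it at scale.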
Current state
- FanOutToIndexBackends handler consumes RecordPublished, emits IndexRecord per backend with routing keys
- VectorIndexHandler (routing_key="vector") calls backend.ingest_batch() with sentence-transformers
- KeywordIndexHandler (routing_key="keyword") indexes metadata text
- ChromaDB is the vector backend
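To make the pattern being removed concrete, here is a simplified sketch of the fan-out described above. The event and handler names come from this issue, but the field names, the dispatch function, and the stub handlers are assumptions for illustration only; the real handlers call into ChromaDB and a keyword index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordPublished:
    record_id: str  # assumed field; the real event likely carries more


@dataclass(frozen=True)
class IndexRecord:
    record_id: str
    routing_key: str


# Routing keys named in this issue.
BACKEND_ROUTING_KEYS = ("vector", "keyword")


def fan_out_to_index_backends(event):
    """Emit one IndexRecord per backend for a single RecordPublished event."""
    return [IndexRecord(event.record_id, key) for key in BACKEND_ROUTING_KEYS]


def dispatch(events, handlers):
    """Route each IndexRecord to the handler registered for its routing key."""
    for event in events:
        handlers[event.routing_key](event)


# Stub handlers standing in for VectorIndexHandler and KeywordIndexHandler.
calls = []
handlers = {
    "vector": lambda ev: calls.append(("vector", ev.record_id)),
    "keyword": lambda ev: calls.append(("keyword", ev.record_id)),
}

dispatch(fan_out_to_index_backends(RecordPublished("rec-1")), handlers)
```

Every published record produces two follow-up events here, and the vector path computes its embedding only after publication. That is the ordering the hook model is meant to fix.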
Target state
- A vector embedding hook (OCI container) runs during validation, producing features.json with embedding vectors
- Embeddings are stored in the hook's feature table via InsertRecordFeatures
- Search queries read from the feature table (or a materialized view / search index built from it)
- The index domain's fan-out pattern is removed entirely
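A minimal sketch of what the hook's entry point might look like, assuming the container receives a JSON record file and an output directory. The record fields (`id`, `text`), the features.json layout, and `placeholder_embed` are hypothetical. The hash-based embedding exists only to keep the example self-contained and carries no semantic meaning; a real hook would call an embedding model such as the sentence-transformers model used today.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def placeholder_embed(text, dim=8):
    """Deterministic stand-in for a real embedding model (NOT semantically meaningful)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def run_hook(record_path, output_dir):
    """Read one record, compute its embedding, and write features.json."""
    record = json.loads(Path(record_path).read_text(encoding="utf-8"))
    features = {
        "record_id": record["id"],                      # assumed field name
        "embedding": placeholder_embed(record["text"]),  # assumed field name
    }
    out_path = Path(output_dir) / "features.json"
    out_path.write_text(json.dumps(features), encoding="utf-8")
    return out_path


# Usage: simulate one hook invocation in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    record_file = Path(tmp) / "record.json"
    record_file.write_text(
        json.dumps({"id": "rec-1", "text": "hello world"}), encoding="utf-8"
    )
    written = json.loads(run_hook(record_file, tmp).read_text(encoding="utf-8"))
```

The hook only writes a file, so persisting the vectors (via InsertRecordFeatures) stays with the pipeline and the container needs no database access.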
Depends on
- refactor: event system overhaul — consumer groups, decoupled domains, simplified pipeline #68 (event system overhaul removes the index fan-out infrastructure)