Open
Labels
design-needed: Needs architectural discussion before implementation
refactor: Internal restructuring, no behavior change
Description
Context
ValidationService.run_hooks() currently executes all hooks sequentially within a single UOW scope. All run_repo.save() calls hit the same session and only commit once at scope exit. This means:
- No observable progress: Intermediate state transitions (PENDING → RUNNING, per-hook results) are invisible to external observers until the entire pipeline completes
- All-or-nothing failure: If the worker crashes mid-pipeline, all progress is lost — the run stays PENDING with no partial results
- No granular commits: The save() calls document intent but are no-ops within the session
This isn't a bug: the current approach is functionally correct. But because hooks run as expensive OCI containers (seconds to minutes each), observable progress matters for UX.
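To make the visibility problem concrete, here is a minimal sketch of the single-scope behaviour. It does not use the real UOW or repository classes: `unit_of_work`, `committed`, and `run_hooks_single_scope` are hypothetical stand-ins, and the "database" is just a dict that an observer polls.

```python
from contextlib import contextmanager

# Hypothetical stand-in for the database: what an external observer can read.
committed: dict[str, str] = {"run-1": "PENDING"}


@contextmanager
def unit_of_work():
    """Toy UOW: buffers writes and publishes them only on a clean exit."""
    pending: dict[str, str] = {}
    yield pending
    # Reached only if the body did not raise: one commit at scope exit.
    committed.update(pending)


def run_hooks_single_scope(run_id, hooks, observe):
    """Mimics running every hook inside one UOW scope."""
    with unit_of_work() as session:
        session[run_id] = "RUNNING"  # "saved", but not yet committed
        observe()
        for hook in hooks:
            hook()
            observe()
        session[run_id] = "COMPLETED"


seen: list[str] = []
run_hooks_single_scope(
    "run-1",
    [lambda: None, lambda: None],
    lambda: seen.append(committed["run-1"]),
)
```

Every poll taken while the pipeline is running reads PENDING, and the run flips straight to COMPLETED at scope exit. If a hook raised, the code after `yield` would never run, so nothing would be committed at all.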
Design
Refactor validation into a saga / outbox-driven state machine where each hook execution is its own UOW:
State machine
ValidationRun gains a current_hook_index: int field tracking pipeline position.
Each handler invocation:
- Runs one hook
- Saves result + advances current_hook_index
- Emits either:
  - RunNextHook (self-loop to continue pipeline)
  - ValidationCompleted / ValidationFailed (terminal)
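A rough sketch of what one handler invocation could look like. Only `current_hook_index` and the event names come from this proposal; the `status` and `results` fields, the `Hook` type, and the rule that a failing hook ends the pipeline immediately are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

Hook = Callable[[], bool]  # assumption: a hook returns True when it passes


@dataclass
class ValidationRun:
    current_hook_index: int = 0
    status: str = "PENDING"
    results: list[bool] = field(default_factory=list)


def handle_run_next_hook(run: ValidationRun, hooks: list[Hook]) -> str:
    """Run exactly one hook and return the name of the event to emit next."""
    if run.current_hook_index >= len(hooks):
        # Nothing left to run (also covers an empty hook list).
        run.status = "COMPLETED"
        return "ValidationCompleted"

    run.status = "RUNNING"
    passed = hooks[run.current_hook_index]()
    run.results.append(passed)
    run.current_hook_index += 1  # advance before the UOW commits

    if not passed:
        # Assumption: the first failing hook is terminal.
        run.status = "FAILED"
        return "ValidationFailed"
    if run.current_hook_index == len(hooks):
        run.status = "COMPLETED"
        return "ValidationCompleted"
    return "RunNextHook"
```

In the real handler, the mutation of `run` and the emitted event would be written in the same UOW, so the outbox entry and the advanced index commit together.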
Event flow
DepositionSubmitted
→ ValidateDeposition (creates run, emits RunNextHook for hook 0)
→ RunNextHook (runs hook 0, saves result, emits RunNextHook for hook 1)
→ RunNextHook (runs hook 1, saves result, emits ValidationCompleted)
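The flow above can be simulated with a toy in-memory queue standing in for the outbox. This is only a sketch: `run_pipeline` and the hook names are hypothetical, hooks are not actually executed, and it assumes at least one hook is configured.

```python
from collections import deque


def run_pipeline(hook_names: list[str]) -> list[str]:
    """Drive a toy outbox queue and return the events in processing order."""
    outbox = deque([("DepositionSubmitted", None)])
    trace: list[str] = []
    while outbox:
        event, index = outbox.popleft()
        trace.append(event if index is None else f"{event}({index})")
        if event == "DepositionSubmitted":
            outbox.append(("ValidateDeposition", None))
        elif event == "ValidateDeposition":
            # Creates the run, then schedules hook 0.
            outbox.append(("RunNextHook", 0))
        elif event == "RunNextHook":
            # hook_names[index] would run here, inside its own UOW.
            next_index = index + 1
            if next_index < len(hook_names):
                outbox.append(("RunNextHook", next_index))
            else:
                outbox.append(("ValidationCompleted", None))
        # ValidationCompleted is terminal: nothing further is emitted.
    return trace


trace = run_pipeline(["schema", "checksum"])
```

With two hooks, `trace` contains DepositionSubmitted, ValidateDeposition, RunNextHook(0), RunNextHook(1), ValidationCompleted, in that order.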
Benefits
- Each step commits independently → progress visible to UI polling
- Worker crash mid-hook → stale claim retry from last committed state
- Natural fit with existing outbox + worker infrastructure
- Overhead of extra DB round-trip per hook is negligible vs OCI container execution time
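As an illustration of the crash-recovery benefit, the sketch below models each handler invocation as working on a copy of the committed state and publishing it only on success. The `committed` dict, `WorkerCrash`, and `run_one_hook` are hypothetical; the real retry would be driven by the worker's stale claim mechanism rather than by calling the function again by hand.

```python
# Hypothetical durable state for a single run.
committed = {"current_hook_index": 0, "results": []}


class WorkerCrash(Exception):
    """Simulates the worker dying in the middle of a hook."""


def run_one_hook(hooks, crash_on=None):
    """One handler invocation: work on a copy, commit only on success."""
    state = {
        "current_hook_index": committed["current_hook_index"],
        "results": list(committed["results"]),
    }
    index = state["current_hook_index"]
    if index == crash_on:
        raise WorkerCrash(f"crashed while running hook {index}")
    state["results"].append(hooks[index]())
    state["current_hook_index"] = index + 1
    committed.update(state)  # per-hook commit


hooks = [lambda: "hook 0 ok", lambda: "hook 1 ok", lambda: "hook 2 ok"]

run_one_hook(hooks)  # hook 0 commits
try:
    run_one_hook(hooks, crash_on=1)  # worker dies during hook 1
except WorkerCrash:
    pass
index_after_crash = committed["current_hook_index"]

run_one_hook(hooks)  # retry resumes at hook 1
run_one_hook(hooks)  # hook 2
```

After the crash the committed index is still 1, so the retry re-runs hook 1 only, and hook 0's result is preserved. This implies hooks need to tolerate being re-executed after a partial run.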
Trade-offs
- More events and handler invocations per validation run (one per hook instead of one total)
- ValidationRun aggregate becomes slightly more complex (tracks position)
- Need to handle the zero-hooks case (instant pass, same as today)
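One possible way to handle the zero-hooks case is to decide it when the run is created, so that no RunNextHook event is ever emitted for an empty pipeline. The helper name `first_event` is hypothetical.

```python
def first_event(hook_count: int) -> str:
    """Event that ValidateDeposition would emit right after creating the run."""
    if hook_count < 0:
        raise ValueError("hook_count must be non-negative")
    # With no hooks there is nothing to run: pass immediately.
    return "ValidationCompleted" if hook_count == 0 else "RunNextHook"
```

The alternative is to always emit RunNextHook and let the handler detect that the index is already past the end, at the cost of one extra event per empty run.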
References
- osa/domain/validation/service/validation.py: current run_hooks() implementation
- osa/infrastructure/event/worker.py: WorkerPool / stale claim mechanism