Add workflow try helpers and visualizer support #4413

Open
NathanFlurry wants to merge 6 commits into main from workflow-try-step-api

Conversation

@NathanFlurry
Member

Description

Adds ctx.tryStep() and ctx.try() to the workflow engine and RivetKit wrapper so workflows can recover from terminal step, join, and race failures without swallowing scheduler control flow. It also updates the workflow visualizer to render named try scopes and handled failures, plus adds docs, stories, and integration coverage for the new behavior.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • pnpm exec vitest run tests/try.test.ts tests/join.test.ts tests/race.test.ts in rivetkit-typescript/packages/workflow-engine
  • pnpm exec vitest run tests/driver-memory.test.ts -t "tryStep and try recover terminal workflow failures" in rivetkit-typescript/packages/rivetkit
  • pnpm test workflow-to-xyflow in frontend
  • pnpm exec biome check src/components/actors/workflow/workflow-to-xyflow.ts src/components/actors/workflow/workflow-to-xyflow.test.ts src/components/actors/workflow/xyflow-nodes.tsx src/components/actors/workflow/workflow-visualizer.tsx src/components/actors/workflow/workflow-example-data.ts src/components/actors/workflow/xyflow-nodes.stories.tsx in frontend

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app

railway-app bot commented Mar 13, 2026

🚅 Deployed to the rivet-pr-4413 environment in rivet-frontend

| Service | Status | Web | Updated (UTC) |
| --- | --- | --- | --- |
| frontend-cloud | 😴 Sleeping (View Logs) | Web | Mar 13, 2026 at 5:03 am |
| frontend-inspector | 😴 Sleeping (View Logs) | Web | Mar 13, 2026 at 5:01 am |
| ladle | ❌ Build Failed (View Logs) | Web | Mar 13, 2026 at 4:50 am |
| website | 😴 Sleeping (View Logs) | Web | Mar 13, 2026 at 4:42 am |
| mcp-hub | ✅ Success (View Logs) | Web | Mar 13, 2026 at 4:31 am |

@NathanFlurry
Member Author

Screenshots relevant to the workflow try UI changes:

Workflow try control flow

Workflow try handled failures

@pkg-pr-new

pkg-pr-new bot commented Mar 13, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4413

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4413

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4413

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4413

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4413

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4413

@rivetkit/sqlite-vfs

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sqlite-vfs@4413

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4413

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4413

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4413

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4413

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4413

commit: 6283a2a

@claude

claude bot commented Mar 13, 2026

PR Review: Add workflow try helpers and visualizer support

This is a well-architected feature that correctly solves a real problem: recovering from terminal workflow failures without swallowing scheduler control-flow errors. The implementation is thorough and the test coverage is solid. A few issues worth addressing before merge.


Issues

1. TryBlockFailure.source vs TryBlockCatchKind naming mismatch

TryBlockFailure.source can be "block" (set when catching RollbackError in executeTry), but TryBlockCatchKind has no "block" variant—it uses "rollback" instead. Users reading the type definition would expect to write catch: ["block"] for rollback, but the actual config value is "rollback". This is a leaky abstraction: either rename source: "block" to source: "rollback", or add a "block" catch kind and keep both consistent.
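
One possible way to keep the two vocabularies from drifting is to derive `source` from the catch-kind union, so a failure can never report a value that cannot also be written in a `catch` list. This is only a sketch: the type names come from this PR, but the variant list (apart from `"rollback"`) and the `isCaught` helper are hypothetical, not the actual implementation.

```typescript
// Hypothetical variant list; only "rollback" is confirmed by the review text.
const TRY_BLOCK_CATCH_KINDS = ["step", "join", "race", "rollback"] as const;

type TryBlockCatchKind = (typeof TRY_BLOCK_CATCH_KINDS)[number];

// `source` reuses the catch-kind union, so the two can never disagree.
interface TryBlockFailure {
  source: TryBlockCatchKind;
  error: unknown;
}

// Hypothetical helper: true when the failure's source is in the configured kinds.
function isCaught(
  failure: TryBlockFailure,
  kinds: readonly TryBlockCatchKind[],
): boolean {
  return kinds.includes(failure.source);
}

const rollbackFailure: TryBlockFailure = {
  source: "rollback",
  error: new Error("rolled back"),
};
```

With this shape, `source: "block"` would be a compile-time error rather than a value users have to map to `"rollback"` by hand.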

2. Non-null assertion without a runtime guard in the engine

In WorkflowContextImpl.tryStep and .try (engine-level), the string overload uses run!. The ActorWorkflowContext wrapper correctly throws "Step run function missing", but the engine-level implementation does not. Worth adding the same guard for consistency, especially since WorkflowContextInterface is a public interface.
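
A minimal sketch of what such a guard could look like, assuming the method accepts either a name plus a run function or a single config object. `StepConfig` and `resolveStep` are hypothetical names used for illustration; only the error message is taken from the wrapper behavior described above.

```typescript
type StepRun<T> = () => Promise<T>;

// Hypothetical config shape for the object overload.
interface StepConfig<T> {
  name: string;
  run: StepRun<T>;
}

// Normalizes both call shapes and fails loudly instead of relying on `run!`.
function resolveStep<T>(
  nameOrConfig: string | StepConfig<T>,
  run?: StepRun<T>,
): StepConfig<T> {
  if (typeof nameOrConfig !== "string") {
    return nameOrConfig;
  }
  if (typeof run !== "function") {
    // Same message the ActorWorkflowContext wrapper is said to throw.
    throw new Error("Step run function missing");
  }
  return { name: nameOrConfig, run };
}
```

Because `run` is narrowed by the `typeof` check, the non-null assertion is no longer needed at the call site.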

3. shouldMarkHandledFailure heuristic needs a comment

context.workflow.state !== "failed" is a retroactive heuristic: if a step failed but the overall workflow did not end in the failed state, the step is assumed to have been recovered. This can produce false positives for steps that failed and were retried successfully (not inside a try scope). An inline comment explaining the intent and limitation would help future maintainers.

4. No tests for nested try blocks

The hasTryAncestor flag, collectTreeStats recursion, and buildRenderTree parent-linking all suggest nesting is intentionally supported, but there are no tests for ctx.try("outer", ctx => ctx.try("inner", ...)). At minimum a single happy-path and a failure-propagation test for nested scopes would catch regressions.

5. winnerValue as T cast after removing null guard

The old code checked winnerValue !== null. The new code uses hasWinner && winnerName !== null and then returns winnerValue as T. Since winnerValue is initialized to undefined, this cast is technically unsound if T is not nullable. The hasWinner flag guards it correctly at runtime, but a comment making the invariant explicit would be helpful.
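
An alternative to documenting the invariant is to encode it in the type, so the compiler enforces it. The sketch below is not the PR's code: `RaceOutcome` and `unwrapWinner` are hypothetical names, and it assumes the winner state can be carried as a single value.

```typescript
// Discriminated union: `value` only exists on the branch where a winner was recorded.
type RaceOutcome<T> =
  | { hasWinner: false }
  | { hasWinner: true; winnerName: string; value: T };

function unwrapWinner<T>(outcome: RaceOutcome<T>): { winner: string; value: T } {
  if (!outcome.hasWinner) {
    throw new Error("race finished without a winner");
  }
  // Narrowed by `hasWinner`, so no `as T` cast is needed, and a branch that
  // legitimately resolves to null or undefined is still a valid winner.
  return { winner: outcome.winnerName, value: outcome.value };
}

const nullWinner = unwrapWinner<null>({
  hasWinner: true,
  winnerName: "timeout",
  value: null,
});
```

This keeps the behavior the new code is after (a `null` result counts as a win) without relying on a cast.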


Minor notes

  • Hardcoded hex colors in TSX: xyflow-nodes.tsx has "#f59e0b", "#ef4444", "#18181b" scattered across multiple style props for the handled-failure UI. They are already defined as TYPE_COLORS constants elsewhere—wiring handled-failure styling through that map would centralize the palette.

  • Import path change: workflow-visualizer.tsx and xyflow-nodes.tsx switched from the @/components alias to relative paths. This is inconsistent with neighboring files and may want a follow-up pass.

  • try as a method name: Using a reserved keyword as a method name is legal in modern JS/TS but can confuse syntax highlighters and linters in some toolchain configurations. No action required, just worth being aware of.

  • parseStoredWorkflowError regex: The pattern is best-effort parsing and will return an object with name: 'Error' for any error whose toString() does not follow the Name: message convention. Acceptable for the use case.
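
For reference, the best-effort parsing described in the last note can be sketched as follows. The regex and the `parseErrorString` name are assumptions for illustration and are not copied from parseStoredWorkflowError.

```typescript
interface ParsedError {
  name: string;
  message: string;
}

// Splits "Name: message" strings; anything else falls back to name "Error".
function parseErrorString(raw: string): ParsedError {
  const match = /^([A-Za-z_$][\w$]*): ([\s\S]*)$/.exec(raw);
  if (match === null) {
    return { name: "Error", message: raw };
  }
  return { name: match[1] ?? "Error", message: match[2] ?? "" };
}
```

A string such as "TypeError: x is not a function" splits into a name and a message, while a bare "boom" is kept whole as the message.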


Strengths

  • Symbol-based metadata attachment is a clean approach to threading failure info through error chains without mutating the error message or adding public properties.
  • shouldRethrowTryError safelist is correct and minimal—sleep, message wait, eviction, history divergence, and rollback-checkpoint all correctly bypass the catch.
  • Scheduler-yield merging for join/race branches (merging SleepError, MessageWaitError, StepFailedError into a single yield) is a meaningful correctness fix beyond the try feature itself.
  • Visualizer synthetic tryGroup approach—building a render tree with synthetic try nodes rather than special-casing them in the layout algorithm—is elegant and composable.
  • Test coverage is comprehensive: exhausted retries, critical errors, rollback opt-in, join/race failures inside try blocks, and the driver integration test all cover realistic scenarios.
  • Documentation in QUICKSTART.md and control-flow.md clearly communicates the semantics and the "what is still rethrown" contract.
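
To illustrate the Symbol-based metadata pattern praised in the first strength, here is a minimal sketch. The symbol, the `FailureMeta` shape, and both helpers are hypothetical; the PR's actual names and fields are not shown in this conversation.

```typescript
// Hypothetical symbol and metadata shape, for illustration only.
const FAILURE_META = Symbol("workflowFailureMeta");

interface FailureMeta {
  source: string;
  attempts: number;
}

type ErrorWithMeta = Error & { [FAILURE_META]?: FailureMeta; cause?: unknown };

// Attaches metadata under a symbol key; the error message is left untouched
// and the key does not show up in Object.keys or JSON.stringify.
function attachFailureMeta(error: Error, meta: FailureMeta): Error {
  Object.defineProperty(error, FAILURE_META, { value: meta });
  return error;
}

// Walks the `cause` chain and returns the first metadata it finds.
function readFailureMeta(error: unknown): FailureMeta | undefined {
  let current: unknown = error;
  while (current instanceof Error) {
    const withMeta = current as ErrorWithMeta;
    const meta = withMeta[FAILURE_META];
    if (meta !== undefined) {
      return meta;
    }
    current = withMeta.cause;
  }
  return undefined;
}
```

A wrapper error whose `cause` carries the metadata still resolves to it, which is the "threading through error chains" property noted above.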

@NathanFlurry
Member Author

Updated workflow UI screenshots after the alignment polish:

Full canvas:
workflow try control flow latest

Try header close-up:
workflow try header closeup

Failed node close-up:
workflow failed node closeup
