Workers HMD

Workers Health Mediated Deployments (HMD) is a system that facilitates the staged, gradual rollout of new Workers versions to production. It integrates directly with Workers Observability for health metrics.

Without HMD

The manual way to do this is to…

- Create a new Worker version
- Create a new deployment that splits traffic, for example 90% vCurrent and 10% vNew
- Monitor health metrics for a set period, then either progress the deployment or cancel it and revert to 100% vCurrent
- Increase the vNew percentage by a small increment and repeat
- Keep incrementing, waiting and monitoring each time, until vNew reaches 100%
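
Each manual step above comes down to one traffic-split request. Here is a minimal sketch of that request, assuming the percentage-based deployments endpoint of the Cloudflare API (`POST /accounts/{account_id}/workers/scripts/{script_name}/deployments`). The helper names `buildSplitDeployment` and `deploymentsUrl` are hypothetical and not part of this project.

```typescript
// Hypothetical helpers for illustration; not taken from this repo.
interface VersionSplit {
  version_id: string;
  percentage: number;
}

interface DeploymentBody {
  strategy: "percentage";
  versions: VersionSplit[];
}

// Build the request body for a vCurrent/vNew traffic split.
function buildSplitDeployment(
  currentVersionId: string,
  newVersionId: string,
  newPercent: number
): DeploymentBody {
  if (!isFinite(newPercent) || newPercent < 0 || newPercent > 100) {
    throw new RangeError(`newPercent must be between 0 and 100, got ${newPercent}`);
  }
  const versions: VersionSplit[] = [];
  // Design choice: leave out a version that would receive 0% of traffic.
  if (newPercent < 100) {
    versions.push({ version_id: currentVersionId, percentage: 100 - newPercent });
  }
  if (newPercent > 0) {
    versions.push({ version_id: newVersionId, percentage: newPercent });
  }
  return { strategy: "percentage", versions };
}

// Assumed endpoint path for creating a deployment.
function deploymentsUrl(accountId: string, workerName: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/workers/scripts/${workerName}/deployments`;
}

// The 90% / 10% split from the steps above, with placeholder version IDs.
const splitBody = buildSplitDeployment("vCurrent-id", "vNew-id", 10);
console.log(JSON.stringify(splitBody));
```

Sending `splitBody` as JSON to `deploymentsUrl(...)` with a bearer token would apply the split. Repeating that by hand for every increment is the tedium HMD removes.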

So tedious… this should be automated!

HMD for Workers

Here's how the feature works:

Release Definition: Users define a release plan with…

- Service Level Objectives (SLOs) based on Service Level Indicators (SLIs) sourced from Workers Observability
- Customizable stages that define…
  - % rollout
  - Soak time
  - Whether the stage progresses manually or automatically

Version Creation: To initiate a production release, users create a new Worker version. By default, this has 0% traffic routed to it.
Release Start: Users then start a release, which is an instance of the release plan. Each stage in the plan progressively increases the percentage of traffic directed from the current Worker version to the new one. For example, a release might consist of stages at 0%, 25%, 50%, 75%, and 100% rollout.
Staged Rollout with Soak Periods: Within each stage, a soak period begins. During this time, the system continuously monitors SLOs.
Progression and Failing:

- If the soak period completes without any SLO violations, the release progresses to the next stage, either manually or automatically, increasing the traffic to the new Worker version.
- Crucially, if an SLO is violated at any point, the rollout automatically aborts. The deployment is immediately reverted to 100% of the old Worker version, and the new version receives 0% of the traffic.

Completion: If all stages pass without SLO violations, the new Worker version reaches 100% deployment, meaning all production traffic is now routed to it. At this point, the release is considered complete.
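
To make the flow above concrete, here is a minimal sketch of a release plan and its progression rules. All type and function names (`Stage`, `ReleasePlan`, `runRelease`) are hypothetical and are not this project's actual schema. Real soak waiting and SLO queries are replaced by a caller-supplied `slosPass` callback, and it is an assumption that SLOs are checked once per polling interval.

```typescript
// Hypothetical shapes for illustration; not this project's actual schema.
interface Stage {
  targetPercent: number; // % of traffic routed to the new version
  soakSeconds: number; // how long SLOs are monitored in this stage
  autoProgress: boolean; // false = wait for manual approval
}

interface ReleasePlan {
  pollingSeconds: number; // assumed: SLOs are checked once per interval
  stages: Stage[];
}

type Outcome =
  | { status: "completed"; newVersionPercent: number }
  | { status: "aborted"; failedStage: number; newVersionPercent: number }
  | { status: "awaiting_approval"; stage: number; newVersionPercent: number };

// Walk the stages in order. slosPass stands in for a real SLO query.
function runRelease(
  plan: ReleasePlan,
  slosPass: (stage: number, poll: number) => boolean
): Outcome {
  if (plan.stages.length === 0 || plan.pollingSeconds <= 0) {
    throw new RangeError("plan needs at least one stage and a positive polling rate");
  }
  let percent = 0;
  for (let i = 0; i < plan.stages.length; i++) {
    const stage = plan.stages[i];
    percent = stage.targetPercent;
    const polls = Math.max(1, Math.floor(stage.soakSeconds / plan.pollingSeconds));
    for (let poll = 1; poll <= polls; poll++) {
      if (!slosPass(i + 1, poll)) {
        // Any violation aborts and reverts to 100% of the old version.
        return { status: "aborted", failedStage: i + 1, newVersionPercent: 0 };
      }
    }
    const isLast = i === plan.stages.length - 1;
    if (!isLast && !stage.autoProgress) {
      // Soak passed, but a person has to approve the next stage.
      return { status: "awaiting_approval", stage: i + 1, newVersionPercent: percent };
    }
  }
  return { status: "completed", newVersionPercent: percent };
}

const plan: ReleasePlan = {
  pollingSeconds: 60,
  stages: [
    { targetPercent: 25, soakSeconds: 600, autoProgress: true },
    { targetPercent: 50, soakSeconds: 600, autoProgress: true },
    { targetPercent: 100, soakSeconds: 600, autoProgress: true },
  ],
};

console.log(runRelease(plan, () => true).status); // "completed"
console.log(runRelease(plan, (stage) => stage < 2).status); // "aborted"
```

The second call fails its first SLO check in stage 2, so the sketch reports the new version back at 0%, mirroring the automatic revert described above.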

Instructions

1. Connect your Worker

Then you should see...

2. Create a release plan

Set polling rate and SLOs:

Configure stages (and don't forget to hit save!):

3. Start a release

Select which Workers versions you want to roll out to.

Then click create release! You'll get live updates of your release progress:

You can approve or deny stages that aren't set to "Auto Progress":

In the Cloudflare dashboard you can see your deployments being actioned:

4. See history and why releases failed

Or a successful deployment!

Architecture

Cloudflare Workers and vite
Durable Objects for all app state
Workflows to power the release stages
Workers Observability to get health metrics
One project with no infrastructure dependencies and a single deploy command!

TODO

Once Workflows terminate works in Miniflare, move all cleanup logic to the routes level. This should greatly simplify the workflow, which currently has to check for a stop signal in multiple places
Figure out why headers aren't passed through vite. Then fix OpenAPI spec to use GET where it's using POST for the ConnectionId
Try an AI agent in the SLO evaluator
Build out the Workers Observability integration so queries can be created there and selected in this app

Dev

Command	Description
`npm run dev`	Start development server - Most commonly used for local development
`npm run deploy`	Deploy to Cloudflare - Push your changes to production
`npm run build`	Build for production - Compile TypeScript, bundle assets, and generate API docs
`npm run check`	Pre-deployment validation - Verify code compiles and deployment readiness
`npm run lint`	Format code - Auto-format source code with Prettier
`npm run preview`	Preview built app - Test the production build locally
`npm run generate-types`	Generate TypeScript types - Run when OpenAPI schema is modified
`npm run cf-typegen`	Generate Worker types - Run when new Cloudflare bindings are added
`npm run lint-openapi`	Validate OpenAPI schema - Check API documentation for errors
`npm run preview-openapi`	Preview API docs - View OpenAPI documentation in browser

Debugging

The logs show exactly what's happening during a release. Just run `npx wrangler tail`!

🚀 Starting release: dff05d56
----------
Worker Name: simulated-service
Account ID: c855e2dd8145bf97126b4cdd08464a5a
----------

🎬 Starting stage 1: 600s soak

=== CF DEPLOYMENT API REQUEST ===
Account: c855e2dd8145bf97126b4cdd08464a5a
Worker: simulated-service
Old Version: e591e866-cd70-4d35-b5b3-c652bb7259e3 (90%)
New Version: 14715573-1e39-4a3d-b7cf-f778a1a4fcca (10%)
================================

🛁 Stage 1 soak - Checking SLOs

=== Observability ===
P999 Wall: 893
P99 Wall: 251
P90 Wall: 65
P50 Wall: 43
=====================

📊 SLO Evaluation: All 2 SLO(s) passed

✅ Stage 1 soak passed - All SLOs satisfied

🛁 Stage 1 soak - Checking SLOs

=== Observability ===
P999 Wall: 899
P99 Wall: 231
P90 Wall: 68
P50 Wall: 42
=====================

📊 SLO Evaluation: All 2 SLO(s) passed

✅ Stage 1 soak passed - All SLOs satisfied

🛁 Stage 1 soak completed

✅ Stage 1 completed

🎬 Starting stage 2: 600s soak
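
Each soak check in the logs above comes down to comparing observed percentiles with thresholds. A minimal sketch follows, assuming wall times are in milliseconds and each SLO is an upper bound on one percentile. The `Slo` shape, the thresholds, and `evaluateSlos` are hypothetical; only the observed values come from the first log excerpt.

```typescript
type Percentile = "p50" | "p90" | "p99" | "p999";

// Hypothetical SLO shape: an upper bound on one wall-time percentile.
interface Slo {
  percentile: Percentile;
  maxWallMs: number;
}

// Return whether every SLO holds, plus the ones that were violated.
function evaluateSlos(
  observed: Record<Percentile, number>,
  slos: Slo[]
): { passed: boolean; violations: Slo[] } {
  const violations = slos.filter((slo) => observed[slo.percentile] > slo.maxWallMs);
  return { passed: violations.length === 0, violations };
}

// Observed values from the first soak check in the log (units assumed to be ms).
const observed: Record<Percentile, number> = { p999: 893, p99: 251, p90: 65, p50: 43 };

// Thresholds invented for the example.
const slos: Slo[] = [
  { percentile: "p99", maxWallMs: 300 },
  { percentile: "p999", maxWallMs: 1000 },
];

const result = evaluateSlos(observed, slos);
console.log(result.passed); // true
```

With these made-up thresholds both checks pass. Lowering the `p99` bound below 251 would produce one violation, which is the condition that would abort a release.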

Simulation

Here's a simple Worker and a script you can run to simulate unhealthy SLOs. The rate URL parameter controls how often a request is slow: rate=100 causes roughly 1 in 100 requests to have a large latency (the script below uses rate=10). Uncomment the 500 error line to simulate 500 errors. Set ITERATIONS to something like ten thousand or a million to keep it running for the duration of an HMD release.

export cf_account_id="replace-me"
export cf_api_token="replace-me"
export cf_subdomain="replace-me"

curl "https://api.cloudflare.com/client/v4/accounts/$cf_account_id/workers/scripts/simulated-service" \
  -X PUT \
  -H "Authorization: Bearer $cf_api_token" \
  -F "metadata={
        \"main_module\": \"simulated-service.mjs\",
        \"compatibility_date\": \"2025-07-24\",
        \"observability\": {
          \"enabled\": true
        }
      };type=application/json" \
  -F "simulated-service.mjs=@-;filename=simulated-service.mjs;type=application/javascript+module" <<'EOF'
export default {
  async fetch(request, env, ctx) {
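    const url = new URL(request.url);
    // Roughly 1 in `rate` requests is slow; fall back to 100 if the param is missing or not a number
    const rate = Number(url.searchParams.get('rate')) || 100;
    const randomNumber = Math.floor(Math.random() * rate);
    // Simulate latencies
    if (randomNumber === 0) {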
      await new Promise(resolve => setTimeout(resolve, 1000 + Math.floor(Math.random() * 200)));
      // return new Response("Simulated 500 error", { status: 500 });
    } else {
      await new Promise(resolve => setTimeout(resolve, 10 + Math.floor(Math.random() * 20)));
    }
    return new Response("Hello world!", { status: 200 });
  }
};
EOF

curl "https://api.cloudflare.com/client/v4/accounts/$cf_account_id/workers/scripts/simulated-service/subdomain" \
  -X POST \
  -H "Authorization: Bearer $cf_api_token" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'

#!/bin/bash

URL="https://simulated-service.$cf_subdomain.workers.dev?rate=10"
ITERATIONS=1000

# Create temporary file to store results
TEMP_FILE=$(mktemp)
trap 'rm -f "$TEMP_FILE"' EXIT

echo "CURLING $URL $ITERATIONS times..."

# Function to make a request and log the result
make_request() {
  local i=$1
  http_code=$(curl -s -o /dev/null -w "%{http_code}" "$URL")
  echo "$http_code" >> "$TEMP_FILE"
  
  if [ $((i % 100)) -eq 0 ]; then
      echo "  Processed $i of $ITERATIONS requests..."
  fi
}

# Disable job control to suppress background job messages
set +m

# Run requests with max 2 parallel processes
for i in $(seq 1 $ITERATIONS); do
  make_request $i &
  
  # Limit to 2 background processes
  if [ $((i % 2)) -eq 0 ]; then
    wait
  fi
done

# Wait for any remaining background processes
wait

echo ""
echo "--- Results ---"
# Count and display results
sort "$TEMP_FILE" | uniq -c | while read count code; do
  echo "HTTP Status $code: $count times"
done

markjmiller/workers-hmd
