
feat: state-aware idle detection with backoff and max idle duration #29

Open
malpou wants to merge 23 commits into computerlovetech:main from malpou:pm/idle-detection

Conversation

@malpou
Contributor

@malpou malpou commented Mar 22, 2026

Summary

Closes #28.

  • Adds idle detection: when an agent emits <!-- ralph:state idle -->, the engine applies configurable backoff delays between iterations
  • Exponential backoff: delay × backoff^(consecutive_idle - 1), capped at max_delay
  • Optional max cumulative idle time limit that stops the loop automatically
  • Non-idle iterations reset all idle tracking
  • New idle frontmatter block with delay, backoff, max_delay, and max fields
  • Duration strings supported: 30s, 5m, 6h, 1d
  • ITERATION_IDLE event type with distinct console rendering
  • Documented across CLI reference, quick reference, writing prompts guide, changelog, codebase map, and SKILL.md
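
To make the backoff formula above concrete, here is a minimal sketch of the delay calculation. The function name and signature are assumptions for illustration and may not match the code in this PR; all durations are in seconds.

```python
def compute_idle_delay(delay: float, backoff: float, max_delay: float,
                       consecutive_idle: int) -> float:
    """Wait before the next iteration after `consecutive_idle` idle iterations in a row.

    Hypothetical helper: implements delay * backoff^(consecutive_idle - 1),
    capped at max_delay, as described in the summary.
    """
    if consecutive_idle < 1:
        return 0.0  # no idle iterations yet, so no idle delay
    raw = delay * backoff ** (consecutive_idle - 1)
    return min(raw, max_delay)


# Example values: delay=30s, backoff=2, max_delay=5m (300s)
schedule = [compute_idle_delay(30, 2, 300, n) for n in range(1, 7)]
print(schedule)  # [30, 60, 120, 240, 300, 300]
```

The cap takes effect on the fifth consecutive idle iteration, where the uncapped value would be 480 seconds.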

Test plan

  • 500 tests pass (including 527 new assertions across idle detection, backoff math, max idle, reset on activity, frontmatter parsing, event rendering)
  • mkdocs build --strict passes with zero warnings
  • Manual test: run a ralph with idle config and verify backoff behavior

malpou and others added 23 commits March 22, 2026 15:38
…rser

Add IdleConfig dataclass, idle state tracking on RunState (consecutive_idle,
cumulative_idle_time, mark_idle, reset_idle), IDLE_STATE_MARKER constant,
parse_duration() for human-readable durations (30s, 5m, 6h), and frontmatter
field constants for the idle configuration block.

Co-authored-by: Ralphify <noreply@ralphify.co>
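
For readers who want a feel for the duration parsing this commit describes, the following is a rough sketch only. The accepted units (s, m, h, d) come from the PR summary; the regex, the error messages, and the choice to treat bare numbers as seconds are assumptions, not the actual `parse_duration()` in this branch.

```python
import re

# Seconds per unit suffix; the suffix set is taken from the PR summary.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([smhd])")


def parse_duration(value) -> float:
    """Convert 45, "30s", "5m", "6h", or "1d" into seconds (illustrative sketch)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly
        raise ValueError(f"invalid duration: {value!r}")
    if isinstance(value, (int, float)):
        seconds = float(value)  # assumption: bare numbers mean seconds
    else:
        match = _DURATION_RE.fullmatch(str(value).strip().lower())
        if match is None:
            raise ValueError(f"invalid duration: {value!r}")
        seconds = float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    if seconds < 0:
        raise ValueError(f"duration must not be negative: {value!r}")
    return seconds


print(parse_duration("5m"))  # 300.0
```
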
…XCEEDED status

Co-authored-by: Ralphify <noreply@ralphify.co>
Detect <!-- ralph:state idle --> marker in agent output, emit
ITERATION_IDLE events, apply exponential backoff delays between idle
iterations, reset idle tracking on non-idle iterations, and stop the
loop when cumulative idle time exceeds idle.max.

Co-authored-by: Ralphify <noreply@ralphify.co>
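
The loop behaviour described in this commit can be approximated with the simulation below. It is a sketch under stated assumptions: the `RunState` field and method names follow the earlier commit message, but the `mark_idle` argument, the `simulate` helper, and the choice to count only backoff waits toward cumulative idle time are hypothetical. It never sleeps; it only records the delays that would be applied.

```python
from dataclasses import dataclass

IDLE_STATE_MARKER = "<!-- ralph:state idle -->"


@dataclass
class RunState:
    consecutive_idle: int = 0
    cumulative_idle_time: float = 0.0

    def mark_idle(self, waited: float) -> None:
        self.consecutive_idle += 1
        self.cumulative_idle_time += waited

    def reset_idle(self) -> None:
        # A non-idle iteration resets all idle tracking.
        self.consecutive_idle = 0
        self.cumulative_idle_time = 0.0


def simulate(outputs, delay=30.0, backoff=2.0, max_delay=300.0, max_idle=1800.0):
    """Return (delays applied per iteration, stop reason) for a list of agent outputs."""
    state = RunState()
    delays = []
    for output in outputs:
        if IDLE_STATE_MARKER not in output:
            state.reset_idle()
            delays.append(0.0)
            continue
        # consecutive_idle is still the count *before* this iteration,
        # so the exponent equals (consecutive_idle - 1) after marking.
        wait = min(delay * backoff ** state.consecutive_idle, max_delay)
        state.mark_idle(wait)
        delays.append(wait)
        if state.cumulative_idle_time > max_idle:
            return delays, "max idle exceeded"
    return delays, "completed"


idle = IDLE_STATE_MARKER + "\nNothing to do."
print(simulate([idle, idle, "implemented step 3", idle]))
# ([30.0, 60.0, 0.0, 30.0], 'completed')
```

With these example values, an uninterrupted idle streak crosses the 1800-second limit on the ninth idle iteration (30 + 60 + 120 + 240 + 5 × 300 = 1950).
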
Add _validate_idle() to parse the idle frontmatter block with support
for duration strings (30s, 5m, 1h) and numeric values. Validates all
sub-fields (delay, backoff, max_delay, max) and rejects unknown fields.

Co-authored-by: Ralphify <noreply@ralphify.co>
Add dimmed idle indicator (◇) for idle iterations and show
"Stopped (idle):" summary when a run ends due to max idle time.

Co-authored-by: Ralphify <noreply@ralphify.co>
Document the new idle detection feature across CLI reference, quick
reference, writing prompts guide, changelog, codebase map, and the
new-ralph skill. Adds frontmatter field reference, usage examples,
and contributor guidance for the idle detection system.

Co-authored-by: Ralphify <noreply@ralphify.co>
TestValidateIdle unit tests already cover all validation error cases.
Remove 9 duplicate integration tests from TestIdleFrontmatter, keeping
only the 3 happy-path tests that verify CLI wiring.

Co-authored-by: Ralphify <noreply@ralphify.co>
Collapse TestComputeIdleDelay from 6 individual tests into 2 parametrized
tests. Remove test_idle_backoff_delay_applied which relied on wall-clock
timing assertions (covered by the unit-level backoff math tests).

Co-authored-by: Ralphify <noreply@ralphify.co>
Co-authored-by: Ralphify <noreply@ralphify.co>
…fy duration parser

Co-authored-by: Ralphify <noreply@ralphify.co>
…er duration fields

Co-authored-by: Ralphify <noreply@ralphify.co>
…ard-condition tests

Co-authored-by: Ralphify <noreply@ralphify.co>
Co-authored-by: Ralphify <noreply@ralphify.co>
…pts, and changelog

Co-authored-by: Ralphify <noreply@ralphify.co>
The IDLE_FIELD_DELAY/BACKOFF/MAX_DELAY/MAX constants were only referenced
in one function. Inline them as string literals to reduce the diff footprint.

Co-authored-by: Ralphify <noreply@ralphify.co>
Parametrize non-idle engine tests, merge mark/reset idle tests into one,
remove redundant no-idle CLI test and events test already covered by
integration tests.

Co-authored-by: Ralphify <noreply@ralphify.co>
Replace duplicate YAML example with cross-reference to cli.md, keeping
only the prompt-writing guidance relevant to this page.

Co-authored-by: Ralphify <noreply@ralphify.co>
Populate stdout_text from the full streamed output in streaming mode
and from captured stdout in blocking mode, so the engine can check
the complete agent output for markers not present in result_text.

Co-authored-by: Ralphify <noreply@ralphify.co>
The idle detection in _run_agent_phase now checks agent.stdout_text
when result_text doesn't contain the idle marker, ensuring idle state
is detected even when the marker appears in streamed output but not
in the final result field.

Co-authored-by: Ralphify <noreply@ralphify.co>
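
The fallback described in this commit amounts to checking two text fields in order. A minimal sketch, assuming both fields may be `None` or empty; the `detect_idle` name is hypothetical and the real check lives inside `_run_agent_phase`:

```python
from typing import Optional

IDLE_STATE_MARKER = "<!-- ralph:state idle -->"


def detect_idle(result_text: Optional[str], stdout_text: Optional[str]) -> bool:
    """Return True if the idle marker appears in the result or, failing that, in stdout."""
    for text in (result_text, stdout_text):
        if text and IDLE_STATE_MARKER in text:
            return True
    return False


# Marker only present in the streamed output, not in the final result field.
print(detect_idle("All done.", "<!-- ralph:state idle -->\nNothing to do."))  # True
```
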
Cover the new stdout_text fallback path in idle detection (engine) and
verify stdout_text is populated in both streaming and blocking modes.

Co-authored-by: Ralphify <noreply@ralphify.co>
Replace the static log_info("Waiting...") message with structured
DELAY_STARTED and DELAY_ENDED events so the console emitter can render
a live countdown timer.

Co-authored-by: Ralphify <noreply@ralphify.co>
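
As an illustration of the structured-event approach, the sketch below wraps a wait in a pair of start and end events. Only the event names come from this commit; the `emit` callback, the payload shape, and the `wait_with_events` helper are assumptions made for the example.

```python
import time
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, Dict[str, float]]


def wait_with_events(seconds: float,
                     emit: Callable[[str, Dict[str, float]], None],
                     sleep: Callable[[float], None] = time.sleep) -> None:
    """Emit DELAY_STARTED, wait, then emit DELAY_ENDED (hypothetical payloads)."""
    emit("DELAY_STARTED", {"seconds": seconds})
    try:
        sleep(seconds)
    finally:
        # Emitted even if the wait is interrupted, so a renderer can clean up.
        emit("DELAY_ENDED", {"seconds": seconds})


events: List[Event] = []
# Inject a no-op sleep so the example runs instantly.
wait_with_events(30.0, lambda kind, data: events.append((kind, data)),
                 sleep=lambda s: None)
print([kind for kind, _ in events])  # ['DELAY_STARTED', 'DELAY_ENDED']
```

Passing the sleep function as a parameter keeps the wait testable without wall-clock timing assertions.
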
…tter

Add _DelayCountdown renderable that shows a ticking countdown for
inter-iteration delays, and wire up DELAY_STARTED/DELAY_ENDED event
handlers in ConsoleEmitter.

Co-authored-by: Ralphify <noreply@ralphify.co>
Co-authored-by: Ralphify <noreply@ralphify.co>
@malpou
Contributor Author

malpou commented Mar 22, 2026


RALPH.md

---
agent: claude -p --dangerously-skip-permissions
idle:
  delay: 30s
  backoff: 2
  max_delay: 5m
  max: 30m
commands:
  - name: status
    run: ./show-status.sh
  - name: inbox
    run: ./show-inbox.sh
  - name: active-task
    run: ./show-active-task.sh
  - name: questions
    run: ./show-questions.sh
  - name: tests
    run: uv run pytest -x
    timeout: 120
  - name: git-log
    run: git log --oneline -15
  - name: git-status
    run: git status --short
---

# Role

You are an autonomous project manager and developer for **ralphify** — the open-source ralph loop harness framework. Each iteration is stateless: all persistent state lives in files on disk and in git. State files live in `pm/` (INBOX.md, TODO.md, QUESTIONS.md, tasks/).

Read the project's CLAUDE.md for coding conventions, test commands, and architecture before making changes.

# State

## Workflow status
{{ commands.status }}

## Inbox
{{ commands.inbox }}

## Active task
{{ commands.active-task }}

## Open questions
{{ commands.questions }}

## Test results
{{ commands.tests }}

## Recent commits
{{ commands.git-log }}

## Working tree
{{ commands.git-status }}

# Decision tree

Follow this priority order. Execute the **first** matching action, then stop.

1. **Tests failing?** Fix them immediately. Do not move on until all tests pass.

2. **Blocked on questions?** If `pm/QUESTIONS.md` has unanswered questions for the active task under `## Open`, output `<!-- ralph:state idle -->` and stop. Do not continue work on that task.

3. **Active task with incomplete plan steps?** Find the next unchecked step in `pm/tasks/<slug>/PLAN.md`. Implement it. Run tests. If tests pass, commit and check off the step in the plan. One step per iteration.

4. **Active task fully complete?** All plan steps are checked. Create a draft PR (`gh pr create --draft`), mark the task done in `pm/TODO.md` (move from `## Active` to `## Done`, change `[ ]` to `[x]`), switch back to the main branch.

5. **Inbox has items?** Pick the top unchecked item from `pm/INBOX.md`.
   - If it's a GitHub issue URL, run `gh issue view <number>` to get details.
   - Create `pm/tasks/<slug>/PLAN.md` with: summary, files to modify, 3–8 numbered steps with checkboxes, and acceptance criteria.
   - Add the task to `pm/TODO.md` under `## Active`: `- [ ] <slug> — <short description>`
   - Mark the inbox item done (change `[ ]` to `[x]`).
   - Create and switch to branch `pm/<slug>`.

6. **Nothing to do?** You MUST output the following marker as the very first line of your response, exactly as shown (this is how the engine detects idle state):

   <!-- ralph:state idle -->

   Then add a short status summary after it.

# Rules

- **One plan step per iteration.** Do not combine multiple steps.
- **Always run tests before committing.** Never commit with failing tests.
- **Commit conventions:** `feat:`, `fix:`, `refactor:`, `docs:` prefixes. Write clear commit messages.
- **Never hack tests to make them pass.** Fix the actual code, not the tests.
- **Work on `pm/<slug>` branches**, not main. Branch from main for each task.
- **No force pushes or rebases.**
- **State files always use `pm/` prefix paths.**
- **If blocked**, write to `pm/QUESTIONS.md` under `## Open`: `- **<task-slug>**: <question>`. Then stop.
- **Keep the codebase clean.** Follow existing conventions. Read CLAUDE.md.

@malpou malpou marked this pull request as ready for review March 22, 2026 16:25
Collaborator

@kasperjunge kasperjunge left a comment

Thanks for the thorough work here — the implementation is solid and well-tested, and the idle token waste problem from #28 is real.

I'm interested in the idea of structured state communication between agents and the engine, and I can see the opportunity for a richer state system down the road (idle, blocked, waiting_for_review, etc.). However, this is a significant addition to the framework's surface area, and I want to take more time to think through the design before committing to a pattern.

A few things I'm weighing:

  • Whether the engine should own backoff logic, or whether a simpler agent-driven approach (e.g. <!-- ralph:delay 60s -->) would cover the immediate need with less complexity
  • How a state system fits into the longer-term direction of the framework
  • Whether we want to introduce structured agent→engine communication as a first-class concept now or later
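
To give a sense of how small the agent-driven alternative in the first bullet could be, here is a hypothetical sketch of parsing a delay marker from agent output. Nothing like this exists in the PR; the marker format is taken from the example above, and the supported units beyond `s` are an assumption.

```python
import re
from typing import Optional

# Hypothetical marker format: <!-- ralph:delay 60s -->
_DELAY_MARKER_RE = re.compile(r"<!--\s*ralph:delay\s+(\d+)\s*([smhd])\s*-->")
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def requested_delay(output: str) -> Optional[float]:
    """Return the delay in seconds requested by the agent, or None if no marker is present."""
    match = _DELAY_MARKER_RE.search(output)
    if match is None:
        return None
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


print(requested_delay("Inbox empty.\n<!-- ralph:delay 60s -->"))  # 60.0
```

In this variant the agent chooses the wait itself, so the engine would need no backoff state, at the cost of trusting the agent to pick sensible delays.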

I'm going to leave this open for now while I think on it. Appreciate the PR and the detailed issue — both are really helpful for framing the problem space.

