UAT: selective_rollback doesn't transition plan state to Execute/QUEUED after rollback #6068

Open
opened 2026-04-09 14:21:52 +00:00 by HAL9000 · 1 comment
Owner

Summary

After a successful agents plan rollback, the spec requires the plan to transition to phase: execute, state: queued (awaiting input). The current CheckpointService.selective_rollback() implementation only performs the git reset — it does not update the plan's processing state in the database.

Expected Behavior (from spec §agents plan rollback)

The spec's rollback output shows:

╭─ Post-Rollback State ──────────╮
│ Phase: execute                 │
│ State: queued (awaiting input) │
│ Checkpoints Remaining: 2       │
╰────────────────────────────────╯

And in JSON:

"post_rollback_state": {
  "phase": "execute",
  "state": "queued (awaiting input)",
  "checkpoints_remaining": 2
}

The spec also states (§Checkpoint and Rollback): "All changes made after the target checkpoint are reverted: files are restored or removed, decisions are discarded, and tool calls are undone. Child plans spawned after the checkpoint are invalidated."

This implies the plan must be returned to a state from which it can be re-executed starting at the target checkpoint.

Actual Behavior

CheckpointService.selective_rollback() (src/cleveragents/application/services/checkpoint_service.py lines 468–539):

  1. Captures current HEAD for recovery
  2. Calls rollback_to_checkpoint() which does git reset --hard + git clean -fd
  3. Does NOT update plan processing state to Execute/QUEUED
  4. Does NOT discard decisions made after the checkpoint
  5. Does NOT invalidate child plans spawned after the checkpoint

The CLI command in src/cleveragents/cli/commands/plan.py (line 3763) tries to read result.phase and result.state from the RollbackResult, but these fields don't exist on the model, so it falls back to hardcoded defaults ("execute" and "queued"). The displayed state is therefore hardcoded and does not reflect the actual plan state.
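The silent fallback described above can be illustrated with a minimal sketch. It assumes the CLI uses something like getattr with a default; the actual code in plan.py is not reproduced here, and RollbackResultStub and displayed_state are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RollbackResultStub:
    """Hypothetical stand-in for RollbackResult; it has no phase/state fields."""

    success: bool


def displayed_state(result: object) -> tuple[str, str]:
    # Assumed fallback pattern: a missing attribute silently yields the default,
    # so the panel shows "execute"/"queued" regardless of the real plan state.
    phase = getattr(result, "phase", "execute")
    state = getattr(result, "state", "queued")
    return phase, state


shown = displayed_state(RollbackResultStub(success=True))
```

Because the default is returned without any error, a plan that is still execute/errored in the database would be reported as execute/queued.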

Impact

  • After rollback, the plan remains in whatever state it was in before rollback (e.g., execute/errored or execute/complete)
  • The plan cannot be re-executed from the checkpoint because its state hasn't been reset to QUEUED
  • Decisions made after the checkpoint are still in the database and will be visible in plan tree
  • Child plans spawned after the checkpoint are not invalidated
  • The displayed "Post-Rollback State" panel shows hardcoded values, not actual plan state

Code Locations

  • src/cleveragents/application/services/checkpoint_service.py lines 468–539 (selective_rollback — missing state transition)
  • src/cleveragents/cli/commands/plan.py lines 3763–3779 (reads result.phase/result.state which don't exist on RollbackResult)
  • src/cleveragents/domain/models/core/checkpoint.py lines 161–186 (RollbackResult — missing phase, state fields)

Fix Required

selective_rollback() should, after a successful git reset:

  1. Call PlanLifecycleService to transition the plan to Execute/QUEUED state
  2. Delete decisions created after the checkpoint timestamp (or mark them superseded)
  3. Cancel/invalidate child plans spawned after the checkpoint
  4. Return the actual post-rollback phase and state in RollbackResult

This requires CheckpointService to be wired with PlanLifecycleService and DecisionService (similar to how guard checks already use PlanLifecycleService).
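The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's API: PlanStore, PostRollbackState, finish_rollback, and InMemoryStore are hypothetical names, and the single PlanStore protocol stands in for whatever PlanLifecycleService and DecisionService actually expose.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Protocol


class PlanStore(Protocol):
    """Hypothetical interface; the real services may look different."""

    def discard_decisions_after(self, plan_id: str, cutoff: datetime) -> int: ...
    def invalidate_children_after(self, plan_id: str, cutoff: datetime) -> int: ...
    def transition(self, plan_id: str, phase: str, state: str) -> None: ...
    def current_state(self, plan_id: str) -> tuple[str, str]: ...


@dataclass(frozen=True)
class PostRollbackState:
    phase: str
    state: str
    decisions_discarded: int
    children_invalidated: int


def finish_rollback(store: PlanStore, plan_id: str, cutoff: datetime) -> PostRollbackState:
    """Meant to run only after the git reset has succeeded."""
    discarded = store.discard_decisions_after(plan_id, cutoff)
    invalidated = store.invalidate_children_after(plan_id, cutoff)
    store.transition(plan_id, "execute", "queued")
    # Read the state back instead of echoing the requested values,
    # so the caller reports what was actually persisted.
    phase, state = store.current_state(plan_id)
    return PostRollbackState(phase, state, discarded, invalidated)


@dataclass
class InMemoryStore:
    """Toy fake used only to exercise the sketch; timestamps stand in for records."""

    phase: str = "execute"
    state: str = "errored"
    decisions: list[datetime] = field(default_factory=list)
    children: list[datetime] = field(default_factory=list)

    def discard_decisions_after(self, plan_id: str, cutoff: datetime) -> int:
        before = len(self.decisions)
        self.decisions = [t for t in self.decisions if t <= cutoff]
        return before - len(self.decisions)

    def invalidate_children_after(self, plan_id: str, cutoff: datetime) -> int:
        # Invalidation is modelled as removal here purely for brevity.
        before = len(self.children)
        self.children = [t for t in self.children if t <= cutoff]
        return before - len(self.children)

    def transition(self, plan_id: str, phase: str, state: str) -> None:
        self.phase, self.state = phase, state

    def current_state(self, plan_id: str) -> tuple[str, str]:
        return self.phase, self.state


checkpoint = datetime(2026, 4, 9, 12, 0)
store = InMemoryStore(
    decisions=[datetime(2026, 4, 9, 11, 0), datetime(2026, 4, 9, 13, 0)],
    children=[datetime(2026, 4, 9, 14, 0)],
)
outcome = finish_rollback(store, "plan-1", checkpoint)
```

The sketch discards decisions and invalidates child plans before the state transition, so that a plan is never queued while stale records still exist. That ordering is a design choice of this example, not something the spec prescribes.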


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.3.0 milestone 2026-04-09 14:26:17 +00:00
Author
Owner

Verified — Critical: rollback is a core v3.3.0 feature (checkpoint creation and rollback acceptance criterion). Plan stays in wrong state after rollback. MoSCoW: Must Have — v3.3.0 acceptance criterion directly broken.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#6068