UAT: CheckpointService.rollback_to_checkpoint only rolls back the first sandbox (sandbox_refs[0]) — multi-resource plans with multiple sandboxes are only partially reverted #2990

Open
opened 2026-04-05 03:18:57 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/checkpoint-rollback-all-sandboxes
  • Commit Message: fix(checkpoint): rollback_to_checkpoint iterates all sandbox_refs instead of only the first
  • Milestone: v3.3.0
  • Parent Epic: #362

Description

The CheckpointService.rollback_to_checkpoint method resolves the sandbox path by taking only the first element of plan.sandbox_refs (line 746 in checkpoint_service.py). For plans that involve multiple resources with separate sandboxes (e.g., a git-checkout resource and a database resource), only the first sandbox is rolled back. The remaining sandboxes are left in their post-checkpoint state, resulting in an inconsistent partial rollback.

Expected Behavior (from spec)

A checkpoint rollback should revert ALL sandboxes associated with the plan to their state at the checkpoint. The spec requires that rollback is an atomic, all-or-nothing operation across all resources.

Actual Behavior

In src/cleveragents/application/services/checkpoint_service.py lines 734–755:

def _resolve_sandbox_path(self, plan_id: str) -> str:
    if self._plan_lifecycle_service is not None:
        plan = self._plan_lifecycle_service.get_plan(plan_id)
        ...
        if not plan.sandbox_refs:
            raise BusinessRuleViolation(
                "Cannot rollback: sandbox is missing for this plan"
            )
        return str(plan.sandbox_refs[0])  # ← Only uses the FIRST sandbox ref!

The method returns only plan.sandbox_refs[0], so rollback_to_checkpoint only operates on the first sandbox. If a plan has resources in both a git-checkout sandbox and a database transaction sandbox, only the git-checkout sandbox is rolled back.

Code Location

  • src/cleveragents/application/services/checkpoint_service.py line 746 (return str(plan.sandbox_refs[0]))
  • src/cleveragents/application/services/checkpoint_service.py lines 718–755 (_resolve_sandbox_path)

Steps to Reproduce

  1. Create a plan with two resources: a git-checkout (git_worktree strategy) and a database (transaction_rollback strategy)
  2. Execute the plan, creating changes in both sandboxes
  3. Create a checkpoint
  4. Make more changes in both sandboxes
  5. Call CheckpointService.rollback_to_checkpoint(plan_id, checkpoint_id)
  6. Observe that only the git-checkout sandbox is rolled back; the database sandbox retains post-checkpoint changes

Impact

  • Multi-resource plans cannot be fully rolled back
  • The atomicity guarantee of checkpoint rollback is broken for plans with multiple sandboxes
  • This is a silent failure — no error is raised, but the rollback is incomplete

Subtasks

  • Audit _resolve_sandbox_path and identify all callers that depend on a single return value
  • Refactor _resolve_sandbox_path to return list[str] (all sandbox paths) instead of a single str
  • Update rollback_to_checkpoint to iterate over all resolved sandbox paths and roll back each one atomically
  • Add transaction/rollback guard: if any individual sandbox rollback fails, attempt to revert already-rolled-back sandboxes and raise a BusinessRuleViolation to preserve atomicity
  • Update all callers of _resolve_sandbox_path to handle the new return type
  • Add/update Behave BDD unit test scenarios covering multi-sandbox rollback (happy path and partial-failure path)
  • Add/update Robot Framework integration test for multi-resource plan rollback
  • Verify nox -e typecheck passes (Pyright, no # type: ignore)
  • Verify nox -e lint passes
  • Verify nox -e coverage_report reports >= 97%

Definition of Done

  • _resolve_sandbox_path returns all sandbox refs for the plan, not just the first
  • rollback_to_checkpoint rolls back every sandbox associated with the plan atomically
  • A partial-failure scenario raises BusinessRuleViolation and does not leave sandboxes in an inconsistent state
  • Behave BDD scenarios cover: single-sandbox rollback (regression), multi-sandbox rollback (new), and partial-failure atomicity (new)
  • Robot Framework integration test validates end-to-end multi-resource rollback
  • All nox stages pass
  • Coverage >= 97%
  • PR merged and this issue closed

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/checkpoint-rollback-all-sandboxes` - **Commit Message**: `fix(checkpoint): rollback_to_checkpoint iterates all sandbox_refs instead of only the first` - **Milestone**: v3.3.0 - **Parent Epic**: #362 ## Description The `CheckpointService.rollback_to_checkpoint` method resolves the sandbox path by taking only the first element of `plan.sandbox_refs` (line 746 in `checkpoint_service.py`). For plans that involve multiple resources with separate sandboxes (e.g., a git-checkout resource and a database resource), only the first sandbox is rolled back. The remaining sandboxes are left in their post-checkpoint state, resulting in an inconsistent partial rollback. ### Expected Behavior (from spec) A checkpoint rollback should revert **ALL** sandboxes associated with the plan to their state at the checkpoint. The spec requires that rollback is an atomic, all-or-nothing operation across all resources. ### Actual Behavior In `src/cleveragents/application/services/checkpoint_service.py` lines 734–755: ```python def _resolve_sandbox_path(self, plan_id: str) -> str: if self._plan_lifecycle_service is not None: plan = self._plan_lifecycle_service.get_plan(plan_id) ... if not plan.sandbox_refs: raise BusinessRuleViolation( "Cannot rollback: sandbox is missing for this plan" ) return str(plan.sandbox_refs[0]) # ← Only uses the FIRST sandbox ref! ``` The method returns only `plan.sandbox_refs[0]`, so `rollback_to_checkpoint` only operates on the first sandbox. If a plan has resources in both a git-checkout sandbox and a database transaction sandbox, only the git-checkout sandbox is rolled back. ### Code Location - `src/cleveragents/application/services/checkpoint_service.py` line 746 (`return str(plan.sandbox_refs[0])`) - `src/cleveragents/application/services/checkpoint_service.py` lines 718–755 (`_resolve_sandbox_path`) ### Steps to Reproduce 1. Create a plan with two resources: a git-checkout (git_worktree strategy) and a database (transaction_rollback strategy) 2. Execute the plan, creating changes in both sandboxes 3. Create a checkpoint 4. Make more changes in both sandboxes 5. Call `CheckpointService.rollback_to_checkpoint(plan_id, checkpoint_id)` 6. Observe that only the git-checkout sandbox is rolled back; the database sandbox retains post-checkpoint changes ### Impact - Multi-resource plans cannot be fully rolled back - The atomicity guarantee of checkpoint rollback is broken for plans with multiple sandboxes - This is a **silent failure** — no error is raised, but the rollback is incomplete ## Subtasks - [ ] Audit `_resolve_sandbox_path` and identify all callers that depend on a single return value - [ ] Refactor `_resolve_sandbox_path` to return `list[str]` (all sandbox paths) instead of a single `str` - [ ] Update `rollback_to_checkpoint` to iterate over all resolved sandbox paths and roll back each one atomically - [ ] Add transaction/rollback guard: if any individual sandbox rollback fails, attempt to revert already-rolled-back sandboxes and raise a `BusinessRuleViolation` to preserve atomicity - [ ] Update all callers of `_resolve_sandbox_path` to handle the new return type - [ ] Add/update Behave BDD unit test scenarios covering multi-sandbox rollback (happy path and partial-failure path) - [ ] Add/update Robot Framework integration test for multi-resource plan rollback - [ ] Verify `nox -e typecheck` passes (Pyright, no `# type: ignore`) - [ ] Verify `nox -e lint` passes - [ ] Verify `nox -e coverage_report` reports >= 97% ## Definition of Done - [ ] `_resolve_sandbox_path` returns all sandbox refs for the plan, not just the first - [ ] `rollback_to_checkpoint` rolls back every sandbox associated with the plan atomically - [ ] A partial-failure scenario raises `BusinessRuleViolation` and does not leave sandboxes in an inconsistent state - [ ] Behave BDD scenarios cover: single-sandbox rollback (regression), multi-sandbox rollback (new), and partial-failure atomicity (new) - [ ] Robot Framework integration test validates end-to-end multi-resource rollback - [ ] All nox stages pass - [ ] Coverage >= 97% - [ ] PR merged and this issue closed --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-new-issue-creator
freemo added this to the v3.3.0 milestone 2026-04-05 03:19:36 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Confirmed
  • MoSCoW: Should Have

Valid finding verified during batch triage.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **Priority**: Confirmed - **MoSCoW**: Should Have Valid finding verified during batch triage. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#362 Epic: Security & Safety Hardening
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2990
No description provided.