UAT: CheckpointManager.rollback_to silently returns False when sandbox_path not in checkpoint metadata #2839

Closed
opened 2026-04-04 20:46:41 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/checkpoint-manager-sandbox-path-metadata
  • Commit Message: fix(sandbox): auto-populate sandbox_path in CheckpointManager.create_checkpoint metadata
  • Milestone: v3.3.0
  • Parent Epic: #362

Description

The CheckpointManager.rollback_to method in src/cleveragents/infrastructure/sandbox/checkpoint.py has a silent failure mode: if the checkpoint's metadata does not contain a "sandbox_path" key, the method logs a warning and returns False without restoring anything.

The root cause is that create_checkpoint does not automatically populate sandbox_path in the checkpoint metadata. The create_checkpoint signature is:

def create_checkpoint(
    self,
    sandbox: Checkpointable,
    plan_id: str,
    phase: str,
    metadata: dict[str, Any] | None = None,
) -> SandboxCheckpoint:

The sandbox_path is only included in the checkpoint if the caller explicitly passes it in the metadata parameter. However, rollback_to requires it:

def rollback_to(self, checkpoint: SandboxCheckpoint) -> bool:
    ...
    sandbox_path = checkpoint.metadata.get("sandbox_path")
    if not sandbox_path or not os.path.isdir(sandbox_path):
        self._logger.warning(
            "Rollback skipped: sandbox path unknown or missing",
            checkpoint_id=checkpoint.checkpoint_id,
        )
        return False

This creates a silent failure: callers who create checkpoints without explicitly passing sandbox_path in metadata will find that rollback_to always returns False and never restores the sandbox state. The spec requires that checkpoints support rollback to a previous safe state (agents plan rollback <PLAN_ID> <CHECKPOINT_ID>).

Code Evidence

# In create_checkpoint:
if sandbox.context is not None:
    sandbox_path = sandbox.context.sandbox_path  # Used for snapshot, but NOT added to metadata

# In rollback_to:
sandbox_path = checkpoint.metadata.get("sandbox_path")  # Always None unless caller set it!

Expected Behavior

create_checkpoint should automatically populate sandbox_path in the metadata from sandbox.context.sandbox_path when the sandbox has a context. This is already done for the snapshot directory, but not for the metadata dict that rollback_to reads.

Actual Behavior

rollback_to silently returns False with a warning log, leaving the sandbox in its modified state.

Steps to Reproduce

  1. Create a CopyOnWriteSandbox or GitWorktreeSandbox
  2. Call sandbox.create(plan_id)
  3. Call checkpoint_manager.create_checkpoint(sandbox, plan_id, "pre_execute") (without passing sandbox_path in metadata)
  4. Make changes in the sandbox
  5. Call checkpoint_manager.rollback_to(checkpoint)
  6. Observe that rollback_to returns False and no rollback occurs, even though the snapshot was created

Code Locations

  • src/cleveragents/infrastructure/sandbox/checkpoint.pycreate_checkpoint (line ~130)
  • src/cleveragents/infrastructure/sandbox/checkpoint.pyrollback_to (line ~175)

Note

: A related issue (#2461) was previously filed covering the same root cause from a slightly different framing. This issue supersedes or duplicates that one; please verify and close the duplicate as appropriate.

Subtasks

  • Write a failing TDD Behave scenario (features/) that demonstrates rollback_to returning False when sandbox_path is absent from checkpoint metadata (red test)
  • Fix create_checkpoint in src/cleveragents/infrastructure/sandbox/checkpoint.py to auto-populate sandbox_path in meta from sandbox.context.sandbox_path after computing it for the snapshot
  • Ensure the fix gracefully handles the case where sandbox.context is None (no sandbox_path key added, or empty string)
  • Verify the TDD scenario now passes (green test)
  • Add/update unit tests in features/ to cover the sandbox.context is None edge case
  • Run nox -e typecheck — confirm no type errors introduced
  • Run nox -e unit_tests — confirm all Behave scenarios pass
  • Run nox -e coverage_report — confirm coverage ≥ 97%

Definition of Done

  • create_checkpoint automatically stores sandbox_path in checkpoint metadata whenever sandbox.context is not None
  • rollback_to successfully restores the sandbox to the checkpointed state when called after create_checkpoint without explicit metadata
  • TDD issue-capture Behave scenario exists and passes (green)
  • No regressions in existing checkpoint-related Behave scenarios
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/checkpoint-manager-sandbox-path-metadata` - **Commit Message**: `fix(sandbox): auto-populate sandbox_path in CheckpointManager.create_checkpoint metadata` - **Milestone**: v3.3.0 - **Parent Epic**: #362 ## Description The `CheckpointManager.rollback_to` method in `src/cleveragents/infrastructure/sandbox/checkpoint.py` has a silent failure mode: if the checkpoint's metadata does not contain a `"sandbox_path"` key, the method logs a warning and returns `False` without restoring anything. The root cause is that `create_checkpoint` does **not** automatically populate `sandbox_path` in the checkpoint metadata. The `create_checkpoint` signature is: ```python def create_checkpoint( self, sandbox: Checkpointable, plan_id: str, phase: str, metadata: dict[str, Any] | None = None, ) -> SandboxCheckpoint: ``` The `sandbox_path` is only included in the checkpoint if the caller explicitly passes it in the `metadata` parameter. However, `rollback_to` requires it: ```python def rollback_to(self, checkpoint: SandboxCheckpoint) -> bool: ... sandbox_path = checkpoint.metadata.get("sandbox_path") if not sandbox_path or not os.path.isdir(sandbox_path): self._logger.warning( "Rollback skipped: sandbox path unknown or missing", checkpoint_id=checkpoint.checkpoint_id, ) return False ``` This creates a silent failure: callers who create checkpoints without explicitly passing `sandbox_path` in metadata will find that `rollback_to` always returns `False` and never restores the sandbox state. The spec requires that checkpoints support rollback to a previous safe state (`agents plan rollback <PLAN_ID> <CHECKPOINT_ID>`). ### Code Evidence ```python # In create_checkpoint: if sandbox.context is not None: sandbox_path = sandbox.context.sandbox_path # Used for snapshot, but NOT added to metadata # In rollback_to: sandbox_path = checkpoint.metadata.get("sandbox_path") # Always None unless caller set it! ``` ### Expected Behavior `create_checkpoint` should automatically populate `sandbox_path` in the metadata from `sandbox.context.sandbox_path` when the sandbox has a context. This is already done for the snapshot directory, but not for the metadata dict that `rollback_to` reads. ### Actual Behavior `rollback_to` silently returns `False` with a warning log, leaving the sandbox in its modified state. ### Steps to Reproduce 1. Create a `CopyOnWriteSandbox` or `GitWorktreeSandbox` 2. Call `sandbox.create(plan_id)` 3. Call `checkpoint_manager.create_checkpoint(sandbox, plan_id, "pre_execute")` (without passing `sandbox_path` in metadata) 4. Make changes in the sandbox 5. Call `checkpoint_manager.rollback_to(checkpoint)` 6. Observe that `rollback_to` returns `False` and no rollback occurs, even though the snapshot was created ### Code Locations - `src/cleveragents/infrastructure/sandbox/checkpoint.py` — `create_checkpoint` (line ~130) - `src/cleveragents/infrastructure/sandbox/checkpoint.py` — `rollback_to` (line ~175) > **Note**: A related issue (#2461) was previously filed covering the same root cause from a slightly different framing. This issue supersedes or duplicates that one; please verify and close the duplicate as appropriate. ## Subtasks - [ ] Write a failing TDD Behave scenario (`features/`) that demonstrates `rollback_to` returning `False` when `sandbox_path` is absent from checkpoint metadata (red test) - [ ] Fix `create_checkpoint` in `src/cleveragents/infrastructure/sandbox/checkpoint.py` to auto-populate `sandbox_path` in `meta` from `sandbox.context.sandbox_path` after computing it for the snapshot - [ ] Ensure the fix gracefully handles the case where `sandbox.context` is `None` (no `sandbox_path` key added, or empty string) - [ ] Verify the TDD scenario now passes (green test) - [ ] Add/update unit tests in `features/` to cover the `sandbox.context is None` edge case - [ ] Run `nox -e typecheck` — confirm no type errors introduced - [ ] Run `nox -e unit_tests` — confirm all Behave scenarios pass - [ ] Run `nox -e coverage_report` — confirm coverage ≥ 97% ## Definition of Done - [ ] `create_checkpoint` automatically stores `sandbox_path` in checkpoint metadata whenever `sandbox.context` is not `None` - [ ] `rollback_to` successfully restores the sandbox to the checkpointed state when called after `create_checkpoint` without explicit metadata - [ ] TDD issue-capture Behave scenario exists and passes (green) - [ ] No regressions in existing checkpoint-related Behave scenarios - [ ] All nox stages pass - [ ] Coverage >= 97% --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-new-issue-creator
freemo added this to the v3.3.0 milestone 2026-04-04 20:46:46 +00:00
Author
Owner

State label reconciliation:

  • Previous state: State/Verified
  • Corrected to: State/Completed
  • Reason: Issue is closed but had a non-terminal state label. Per CONTRIBUTING.md, closed issues must have State/Completed or State/Wont Do.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

State label reconciliation: - Previous state: `State/Verified` - Corrected to: `State/Completed` - Reason: Issue is closed but had a non-terminal state label. Per CONTRIBUTING.md, closed issues must have `State/Completed` or `State/Wont Do`. --- **Automated by CleverAgents Bot** Supervisor: Backlog Grooming | Agent: ca-backlog-groomer
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#362 Epic: Security & Safety Hardening
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2839
No description provided.