UAT: snapshot sandbox strategy raises NotImplementedError — tools with checkpoint: "snapshot" (e.g., shell_execute) cannot be properly checkpointed #2472

Open
opened 2026-04-03 18:36:22 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/snapshot-sandbox-not-implemented
  • Commit Message: fix(sandbox): implement snapshot sandbox strategy in SandboxFactory
  • Milestone: v3.3.0
  • Parent Epic: #362

Description

The snapshot sandbox strategy is listed in the spec as a valid strategy and is referenced by built-in tools (e.g., shell_execute has checkpoint: "snapshot"), but the SandboxFactory raises NotImplementedError when this strategy is requested.

Expected Behavior (from spec)

The spec (docs/specification.md lines 7120, 7351) shows shell_execute with checkpoint: "snapshot". The snapshot strategy should provide a filesystem snapshot mechanism for sandboxing shell execution. The SandboxStrategyStr type in factory.py includes "snapshot" as a valid literal value.

Actual Behavior

In src/cleveragents/infrastructure/sandbox/factory.py lines 159–160:

if sandbox_strategy == STRATEGY_SNAPSHOT:
    raise NotImplementedError("Snapshot sandbox not yet implemented")

The _IMPLEMENTED_STRATEGIES frozenset (lines 51–53) also excludes "snapshot":

_IMPLEMENTED_STRATEGIES: frozenset[str] = frozenset(
    {"none", "git_worktree", "copy_on_write", "transaction_rollback", "overlay"}
)

This means any tool or resource that uses the snapshot strategy will fail at runtime with NotImplementedError. The shell_execute built-in tool is documented with checkpoint: "snapshot", making it impossible to properly checkpoint shell execution operations.

Code Locations

  • src/cleveragents/infrastructure/sandbox/factory.py lines 37, 51–53, 159–160
  • docs/specification.md lines 7120, 7351 (shell_execute with checkpoint: "snapshot")

Impact

  • shell_execute tool cannot be checkpointed as specified
  • Any resource type configured with sandbox_strategy: "snapshot" will fail at runtime
  • The SandboxStrategyStr type annotation includes "snapshot" as valid, creating a false impression that it is supported

Steps to Reproduce

  1. Call SandboxFactory().create_sandbox(resource_id="r1", original_path="/tmp", sandbox_strategy="snapshot")
  2. Observe NotImplementedError: Snapshot sandbox not yet implemented

Subtasks

  • Write a failing Behave scenario in features/ that reproduces the NotImplementedError when snapshot strategy is requested (TDD — failing test merged first)
  • Implement SnapshotSandbox class in src/cleveragents/infrastructure/sandbox/ providing filesystem snapshot/restore semantics
  • Register SnapshotSandbox in SandboxFactory.create_sandbox() for STRATEGY_SNAPSHOT
  • Add "snapshot" to _IMPLEMENTED_STRATEGIES frozenset in factory.py
  • Verify shell_execute tool can be checkpointed end-to-end using the snapshot strategy
  • Ensure all nox stages pass (nox -e lint, nox -e typecheck, nox -e unit_tests, nox -e coverage_report)
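
One possible shape for the `SnapshotSandbox` subtask is sketched below. It is a minimal illustration of snapshot/restore semantics using only the standard library, not the project's design. The method names (`snapshot`, `restore`, `discard`) and the constructor arguments are assumptions, since the sandbox base interface is not shown in this issue. The sketch copies the whole directory aside with `shutil.copytree`, which is simple but neither atomic nor cheap for large trees.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


class SnapshotSandbox:
    """Hypothetical sketch: copy a directory aside, restore it on rollback."""

    def __init__(self, resource_id: str, original_path: str) -> None:
        self.resource_id = resource_id
        self.original_path = Path(original_path)
        self._snapshot_root: Optional[Path] = None

    def snapshot(self) -> None:
        """Capture the current contents of original_path."""
        self.discard()  # keep at most one snapshot at a time
        self._snapshot_root = Path(tempfile.mkdtemp(prefix="snapshot-"))
        shutil.copytree(
            self.original_path, self._snapshot_root / "data", symlinks=True
        )

    def restore(self) -> None:
        """Replace original_path with the captured contents."""
        if self._snapshot_root is None:
            raise RuntimeError("no snapshot to restore")
        shutil.rmtree(self.original_path)
        shutil.copytree(
            self._snapshot_root / "data", self.original_path, symlinks=True
        )

    def discard(self) -> None:
        """Delete the captured copy, if any."""
        if self._snapshot_root is not None:
            shutil.rmtree(self._snapshot_root, ignore_errors=True)
            self._snapshot_root = None
```

A caller would take a snapshot before running a shell command, call `restore()` to roll back or `discard()` to keep the changes. A production implementation may prefer a filesystem-level mechanism where one is available.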

Definition of Done

  • A Behave scenario exists that reproduces the original NotImplementedError (failing test committed first per TDD workflow)
  • SandboxFactory().create_sandbox(..., sandbox_strategy="snapshot") no longer raises NotImplementedError
  • "snapshot" is present in _IMPLEMENTED_STRATEGIES
  • shell_execute tool can be checkpointed via the snapshot strategy without error
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

freemo added this to the v3.3.0 milestone 2026-04-03 18:36:27 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • MoSCoW: Could Have — Desirable improvement but not necessary for the milestone.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Blocks
#362 Epic: Security & Safety Hardening
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2472