UAT: agents plan execute --format json hardcodes sandbox.strategy: "git_worktree" regardless of actual resource sandbox strategy #4794

Open
opened 2026-04-08 18:59:28 +00:00 by HAL9000 · 0 comments
Owner

Bug Report

Feature area: Sandbox and checkpoint safety model — Execute phase sandbox reporting
Severity: Medium
Discovered by: UAT Testing (uat-worker-sandbox-checkpoint)
Spec reference: docs/specification.md §agents plan execute JSON output


Expected Behavior (from spec)

The agents plan execute --format json output should report the actual sandbox strategy being used for the plan's resources. The spec's execute output envelope includes a sandbox field with the real strategy:

{
  "command": "plan execute",
  "data": {
    "sandbox": {
      "strategy": "<actual strategy from resource type>",
      "path": "...",
      "branch": "...",
      "status": "active"
    }
  }
}

For a plan operating on a postgres resource, sandbox.strategy should be "transaction_rollback". For a git-checkout resource, it should be "git_worktree".
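The expectation above can be sketched as a small check over the JSON envelope. This is only an illustration: the `EXPECTED_STRATEGY` table and the `check_reported_strategy` helper are hypothetical names, and the table covers just the two resource types named in this report.

```python
import json

# Hypothetical lookup table; only the two mappings stated in this report.
EXPECTED_STRATEGY = {
    "postgres": "transaction_rollback",
    "git-checkout": "git_worktree",
}


def check_reported_strategy(raw_output: str, resource_type: str) -> bool:
    """Return True if the envelope reports the strategy expected for resource_type."""
    envelope = json.loads(raw_output)
    reported = envelope["data"]["sandbox"]["strategy"]
    return reported == EXPECTED_STRATEGY[resource_type]


# Sample envelope shaped like the spec's example, carrying the hardcoded value.
hardcoded_output = json.dumps(
    {
        "command": "plan execute",
        "data": {
            "sandbox": {
                "strategy": "git_worktree",
                "path": "...",
                "branch": "...",
                "status": "active",
            }
        },
    }
)

print(check_reported_strategy(hardcoded_output, "git-checkout"))  # True
print(check_reported_strategy(hardcoded_output, "postgres"))      # False
```

With the hardcoded value, the check passes for a git-checkout resource and fails for a postgres resource, which is the mismatch this report describes.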

Actual Behavior

The _execute_output_dict() function in src/cleveragents/cli/commands/plan.py (lines 362–383) hardcodes "strategy": "git_worktree" regardless of the actual sandbox strategy:

sandbox = {
    # TODO: derive strategy from plan's actual sandbox configuration
    # once the plan model exposes it; "git_worktree" is the current
    # default strategy used in all local execution environments.
    "strategy": "git_worktree",  # ← HARDCODED, always "git_worktree"
    "path": primary_ref,
    "branch": f"cleveragents/plan-{plan_id[:8]}",
    "status": "active",
}

The TODO comment in the code already notes that the strategy should be derived from the plan's actual sandbox configuration.

Code Location

  • src/cleveragents/cli/commands/plan.py: _execute_output_dict() function (lines 362–383)

Impact

  • The JSON output for agents plan execute always reports "strategy": "git_worktree" even when the plan is using transaction_rollback (for database resources), copy_on_write (for filesystem resources), or overlay
  • Scripts and monitoring tools consuming the execute output cannot determine the actual sandbox strategy
  • The spec's structured output contract is violated for non-git-checkout resources
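To illustrate the effect on consumers, here is a rough sketch of a monitoring script that tallies sandbox strategies across several execute outputs. The `count_strategies` and `make_envelope` helpers are hypothetical and the envelopes are hand-built samples, not captured CLI output.

```python
import json
from collections import Counter


def make_envelope(strategy: str) -> str:
    """Build a sample execute envelope (hand-written, not real CLI output)."""
    return json.dumps(
        {
            "command": "plan execute",
            "data": {"sandbox": {"strategy": strategy, "status": "active"}},
        }
    )


def count_strategies(raw_outputs) -> Counter:
    """Tally sandbox.strategy across several execute envelopes."""
    return Counter(
        json.loads(raw)["data"]["sandbox"]["strategy"] for raw in raw_outputs
    )


# One git-checkout plan and one postgres plan, as reported today (hardcoded).
reported_today = [make_envelope("git_worktree"), make_envelope("git_worktree")]

# The same two plans, as the spec says they should be reported.
reported_per_spec = [
    make_envelope("git_worktree"),
    make_envelope("transaction_rollback"),
]

print(count_strategies(reported_today))
print(count_strategies(reported_per_spec))
```

In the first tally every plan lands under `git_worktree`, so a consumer cannot tell the database-backed plan apart from the git-backed one.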

Fix Required

The Plan domain model should expose the sandbox strategy. The _execute_output_dict() function should derive the strategy from the plan's actual sandbox configuration:

# Option 1: Expose sandbox_strategy on Plan model
sandbox_strategy = plan.sandbox_strategy or "git_worktree"

# Option 2: Derive from sandbox_refs and resource registry
# (requires access to resource registry in the CLI layer)

The TODO comment at lines 366 and 376 already acknowledges this gap. The fix requires:

  1. Exposing the effective sandbox strategy on the Plan domain model (or via PlanLifecycleService)
  2. Using the actual strategy in _execute_output_dict()
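The two steps above could look roughly like the following for Option 1. This is a sketch under stated assumptions, not the project's code: the `Plan` dataclass is a simplified stand-in for the real domain model, `sandbox_strategy` is the attribute proposed in this report (it does not exist yet), and `build_sandbox_section` is a hypothetical helper covering only the `sandbox` part of `_execute_output_dict()`.

```python
from dataclasses import dataclass
from typing import Optional

# Fallback kept from the current behaviour, used when the plan exposes nothing.
DEFAULT_STRATEGY = "git_worktree"


@dataclass
class Plan:
    """Simplified, hypothetical stand-in for the Plan domain model."""

    plan_id: str
    sandbox_strategy: Optional[str] = None  # assumed new attribute (Option 1)


def build_sandbox_section(plan: Plan, primary_ref: str) -> dict:
    """Build the sandbox section, taking the strategy from the plan itself."""
    return {
        "strategy": plan.sandbox_strategy or DEFAULT_STRATEGY,
        "path": primary_ref,
        "branch": f"cleveragents/plan-{plan.plan_id[:8]}",
        "status": "active",
    }


# Placeholder ids and refs, for illustration only.
db_plan = Plan(plan_id="0123456789abcdef", sandbox_strategy="transaction_rollback")
legacy_plan = Plan(plan_id="fedcba9876543210")

print(build_sandbox_section(db_plan, "example-ref")["strategy"])      # transaction_rollback
print(build_sandbox_section(legacy_plan, "example-ref")["strategy"])  # git_worktree
```

Keeping the `or DEFAULT_STRATEGY` fallback matches the Option 1 snippet, so plans that do not yet carry a strategy would continue to report `git_worktree`.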

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.2.0 milestone 2026-04-09 05:18:22 +00:00
Reference
cleveragents/cleveragents-core#4794