BUG-HUNT: [cross-module] `plan execute` creates sandbox before read-only guard — worktree created and leaked even for read-only plans that are immediately rejected #6644

Open

opened 2026-04-09 22:40:31 +00:00 by HAL9000 · 0 comments

Bug Report: [cross-module] — Sandbox created before read-only guard; always leaked for read-only plans

Severity Assessment

- Impact: Every `agents plan execute` call on a read-only plan creates a git worktree sandbox (branch + temp directory) before checking the read-only flag, then immediately aborts with `typer.Abort()`. The sandbox is never cleaned up. Each repeated attempt on the same read-only plan creates a new, differently named worktree (ULIDs are unique), so the leak is unbounded.
- Likelihood: High. Any automation that retries a read-only plan leaks a sandbox on every retry.
- Priority: High

Location

- File: `src/cleveragents/cli/commands/plan.py`
- Function: `execute_plan()` (the `@app.command("execute")` handler)
- Lines: 2370–2390

Description

The execute_plan command creates the sandbox at line 2372, before the read-only guard at lines 2384–2390:

# plan.py, lines 2370-2390
# Create sandbox for this plan (git worktree or flat fallback)
# and build the executor with the sandbox path.
sandbox_root, sandbox_obj = _create_sandbox_for_plan(plan_id, service)  # ← sandbox created
executor = _get_plan_executor(
    lifecycle_service=service,
    sandbox_root=sandbox_root,
)

# Determine current phase and run the appropriate processing
current_plan = service.get_plan(plan_id)
if current_plan is None:
    console.print(f"[red]Plan '{plan_id}' not found.[/red]")
    raise typer.Abort()   # ← sandbox NOT cleaned up

# Fail-fast: read-only plans must not enter Execute phase
if current_plan.read_only is True:
    console.print(
        f"[red]Cannot execute plan '{plan_id}': plan is read-only.[/red]"
    )
    raise typer.Abort()   # ← sandbox NOT cleaned up (worktree + branch leaked)

`_create_sandbox_for_plan` calls `GitWorktreeSandbox.create(plan_id)`, which:

1. Runs `git worktree add -b cleveragents/plan-<plan_id> /tmp/ca-sandbox-<id>/ HEAD`
2. Stores `ctx.sandbox_path` and `ctx.sandbox_id`

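Both the temp directory and the branch are now on disk. `typer.Abort()` is an exception that propagates out of `execute_plan()` and ends the command. Because no `try`/`finally` wraps the code that follows sandbox creation, `sandbox_obj.cleanup()` is never called on this path.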

This is closely related to #6632 (sandbox not cleaned on execute failure), but is a distinct ordering bug: the guard that would prevent any work from being done runs AFTER the sandbox is already created.
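
The leak mechanism can be reproduced without git or Typer. The sketch below is a minimal stand-in, not the project's code: `FakeSandbox`, `Abort`, `run`, and both `execute_*` functions are hypothetical names, with `Abort` playing the role of `typer.Abort` (which is an ordinary exception). It illustrates that an exception raised after creation skips cleanup unless a `try`/`finally` is in place.

```python
class Abort(Exception):
    """Stand-in for typer.Abort (hypothetical; keeps the sketch stdlib-only)."""


class FakeSandbox:
    """Hypothetical stand-in for the real sandbox object; tracks live instances."""

    live = set()

    def __init__(self, name):
        self.name = name
        FakeSandbox.live.add(name)  # models "worktree + branch now on disk"

    def cleanup(self):
        FakeSandbox.live.discard(self.name)


def execute_unguarded(plan_id, read_only):
    """Mirrors the reported ordering: create first, check afterwards."""
    sandbox = FakeSandbox(f"sandbox-{plan_id}")
    if read_only:
        raise Abort()  # nothing on this path calls sandbox.cleanup()
    sandbox.cleanup()


def execute_with_finally(plan_id, read_only):
    """Same ordering, but cleanup is guaranteed by try/finally."""
    sandbox = FakeSandbox(f"sandbox-{plan_id}")
    try:
        if read_only:
            raise Abort()
    finally:
        sandbox.cleanup()


def run(func, plan_id):
    """Call func on a read-only plan and return the sandboxes still live."""
    FakeSandbox.live.clear()
    try:
        func(plan_id, read_only=True)
    except Abort:
        pass
    return sorted(FakeSandbox.live)
```

Calling `run(execute_unguarded, "p1")` leaves one live sandbox behind, while `run(execute_with_finally, "p1")` leaves none. A `finally` block alone would stop the leak, but the sandbox would still be created and destroyed for nothing, which is why the ordering is reported as a separate bug.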

Files Involved

| File | Role |
|------|------|
| `src/cleveragents/cli/commands/plan.py` | Creates sandbox before read-only guard |
| `src/cleveragents/infrastructure/sandbox/git_worktree.py` | `GitWorktreeSandbox.create()` creates the worktree |

Data Flow Where It Breaks

execute_plan(plan_id)
  ↓
_create_sandbox_for_plan(plan_id, service)
  → GitWorktreeSandbox.create(plan_id)
  → git worktree add → creates /tmp/ca-sandbox-XYZ/ + branch cleveragents/plan-<plan_id>
  ↓
current_plan = service.get_plan(plan_id)
if current_plan.read_only is True:
    raise typer.Abort()   ← aborts the command; no cleanup runs on this path
                          ← /tmp/ca-sandbox-XYZ/ and branch left on disk

Expected Behavior

The read-only check (and the plan not found check) should happen BEFORE _create_sandbox_for_plan is called, so no git operations are performed for plans that will be immediately rejected.
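
One way to keep the guards ahead of sandbox creation is to collect them in a small helper that only inspects the plan. The sketch below is an assumption about shape, not the project's API: `Plan`, `PreflightError`, and `preflight_check` are hypothetical names, and only the `read_only` attribute and the two messages come from the report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    """Hypothetical minimal plan record; only `read_only` appears in the report."""

    plan_id: str
    read_only: bool = False


class PreflightError(Exception):
    """Raised when a plan must be rejected before any sandbox exists."""


def preflight_check(plan_id: str, plan: Optional[Plan]) -> Plan:
    """Return the plan if it may be executed, otherwise raise PreflightError.

    Performs no I/O, so rejecting a plan cannot leave anything on disk.
    """
    if plan is None:
        raise PreflightError(f"Plan '{plan_id}' not found.")
    if plan.read_only is True:
        raise PreflightError(
            f"Cannot execute plan '{plan_id}': plan is read-only."
        )
    return plan
```

A command handler could catch `PreflightError`, print its message, and raise `typer.Abort()` before `_create_sandbox_for_plan` is ever reached. Keeping the check free of side effects also makes it easy to unit test in isolation.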

Actual Behavior

A git worktree and branch are created for every read-only plan that a user (or automation) tries to execute, then immediately leaked when the guard triggers.
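
Worktrees that have already leaked can be located from git's own bookkeeping. The sketch below parses the text produced by `git worktree list --porcelain` and picks out entries whose branch starts with `cleveragents/plan-`, the prefix quoted in this report. The function names are hypothetical, and `SAMPLE` is illustrative text rather than output captured from a real repository.

```python
BRANCH_PREFIX = "refs/heads/cleveragents/plan-"  # prefix taken from the report


def parse_worktrees(porcelain: str):
    """Parse `git worktree list --porcelain` text into a list of dicts."""
    entries, current = [], {}
    for line in porcelain.splitlines():
        if not line.strip():  # a blank line ends one worktree record
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value
    if current:
        entries.append(current)
    return entries


def find_plan_worktrees(porcelain: str):
    """Return (path, branch) pairs for worktrees on a plan sandbox branch."""
    return [
        (e["worktree"], e["branch"][len("refs/heads/"):])
        for e in parse_worktrees(porcelain)
        if e.get("branch", "").startswith(BRANCH_PREFIX)
    ]


def cleanup_commands(path: str, branch: str):
    """Build (but do not run) the git commands that would remove one leak."""
    return [
        ["git", "worktree", "remove", "--force", path],
        ["git", "branch", "-D", branch],
    ]


SAMPLE = """\
worktree /home/dev/repo
HEAD 1111111111111111111111111111111111111111
branch refs/heads/main

worktree /tmp/ca-sandbox-XYZ
HEAD 1111111111111111111111111111111111111111
branch refs/heads/cleveragents/plan-01ABC
"""
```

The commands are returned as argument lists instead of being executed, so a maintainer can review them before passing each one to `subprocess.run`. Whether every matching worktree is truly orphaned is an assumption that should be checked against plans that are still running.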

Suggested Fix

Reorder: run all pre-flight guards before creating the sandbox.

# plan.py, execute_plan()
import contextlib  # module-level import required by the cleanup block below

# STEP 1: Load plan and run all guards BEFORE creating any sandbox
current_plan = service.get_plan(plan_id)
if current_plan is None:
    console.print(f"[red]Plan '{plan_id}' not found.[/red]")
    raise typer.Abort()
if current_plan.read_only is True:
    console.print(f"[red]Cannot execute plan '{plan_id}': plan is read-only.[/red]")
    raise typer.Abort()
# ... any other pre-flight checks ...

# STEP 2: Only now create the sandbox
sandbox_root, sandbox_obj = _create_sandbox_for_plan(plan_id, service)
try:
    executor = _get_plan_executor(lifecycle_service=service, sandbox_root=sandbox_root)
    ...
finally:
    # Always release the sandbox; cleanup errors must not mask the original failure.
    if sandbox_obj is not None:
        with contextlib.suppress(Exception):
            sandbox_obj.cleanup()
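
The `try`/`finally` pattern can also be packaged as a context manager so that every caller gets cleanup without repeating the block. This is a sketch under assumptions: `sandbox_scope`, `RecordingSandbox`, `demo`, and the `create` callable are hypothetical, and the only behavior borrowed from the report is that sandbox creation yields a `(sandbox_root, sandbox_obj)` pair whose object exposes `cleanup()` and may be `None`.

```python
import contextlib


@contextlib.contextmanager
def sandbox_scope(create, plan_id):
    """Create a sandbox via `create(plan_id)` and always clean it up on exit.

    `create` is assumed to return a (sandbox_root, sandbox_obj) pair.
    """
    sandbox_root, sandbox_obj = create(plan_id)
    try:
        yield sandbox_root
    finally:
        if sandbox_obj is not None:
            # Suppress cleanup errors so they never mask the original failure.
            with contextlib.suppress(Exception):
                sandbox_obj.cleanup()


class RecordingSandbox:
    """Hypothetical test double that records whether cleanup() ran."""

    def __init__(self):
        self.cleaned = False

    def cleanup(self):
        self.cleaned = True


def demo():
    """Run a failing body inside the scope and report whether cleanup ran."""
    box = RecordingSandbox()
    try:
        with sandbox_scope(lambda plan_id: ("/tmp/demo", box), "plan-1"):
            raise RuntimeError("execution failed")
    except RuntimeError:
        pass
    return box.cleaned
```

In the command handler, the pre-flight guards would still run first, and only then would the body execute inside `with sandbox_scope(...)`. Silently suppressing cleanup errors follows the suggested fix; logging them instead may be preferable if failed cleanups need to be noticed.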

Category

resource / cross-module / ordering

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use the tags @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before it is fixed.

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added labels 2026-04-09 22:45:17 +00:00

HAL9000 added this to the v3.2.0 milestone 2026-04-09 22:47:13 +00:00