BUG-HUNT: [cross-module] GitWorktreeSandbox not cleaned up when plan execute fails — orphaned worktrees accumulate on disk #6632

Open
opened 2026-04-09 22:36:25 +00:00 by HAL9000 · 0 comments

Bug Report: [cross-module] — GitWorktreeSandbox not cleaned up when plan execute fails

Severity Assessment

  • Impact: Orphaned git worktrees and branches accumulate on disk whenever plan execute fails during the strategize or execute phase. Each orphaned worktree holds a full checkout of the working tree, so the wasted disk space grows with repo size. If a later run collides with a leftover branch name, that plan execute will fail.
  • Likelihood: High. Any LLM failure, timeout, or PlanError during execution leaves the worktree in place.
  • Priority: High
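
To gauge how many worktrees have already leaked on a given machine, the output of `git worktree list --porcelain` can be filtered for plan branches. The following is a minimal sketch, not part of the codebase: `find_plan_worktrees` is a hypothetical helper, and the sample paths and hashes are made up for illustration. Only the `cleveragents/plan-` branch prefix and the `/tmp/ca-sandbox-<id>/` location come from this report.

```python
def find_plan_worktrees(porcelain, prefix="refs/heads/cleveragents/plan-"):
    """Return (path, branch) pairs for worktrees checked out on plan branches.

    `porcelain` is the text printed by `git worktree list --porcelain`.
    """
    found = []
    path = None
    for line in porcelain.splitlines():
        if line.startswith("worktree "):
            # Each record starts with the worktree's filesystem path.
            path = line[len("worktree "):]
        elif line.startswith("branch ") and path is not None:
            branch = line[len("branch "):]
            if branch.startswith(prefix):
                found.append((path, branch))
    return found


# Illustrative sample only; real output comes from running git in the repo.
SAMPLE = """\
worktree /home/dev/project
HEAD 1111111111111111111111111111111111111111
branch refs/heads/main

worktree /tmp/ca-sandbox-abc
HEAD 2222222222222222222222222222222222222222
branch refs/heads/cleveragents/plan-abc
"""

for wt_path, wt_branch in find_plan_worktrees(SAMPLE):
    print(wt_path, wt_branch)
```

In practice the input would be captured with `subprocess.run(["git", "worktree", "list", "--porcelain"], capture_output=True, text=True)` in the project repository. A listed worktree is only an orphan if no plan execution is currently using it.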

Location

  • File: src/cleveragents/cli/commands/plan.py
  • Function: execute_plan() (the @app.command("execute") handler)
  • Lines: 2370–2501

Description

The execute_plan CLI command creates a GitWorktreeSandbox at line 2372:

sandbox_root, sandbox_obj = _create_sandbox_for_plan(plan_id, service)

_create_sandbox_for_plan calls GitWorktreeSandbox.create(plan_id), which creates a real git worktree directory and a cleveragents/plan-<plan_id> branch in the project repository.

The happy path correctly calls _commit_worktree_changes(sandbox_obj.context.sandbox_path, plan_id) at line 2448. However, none of the exception handlers (PreflightRejection, InvalidPhaseTransitionError, PlanNotReadyError, ValueError, CleverAgentsError, generic Exception) calls sandbox_obj.cleanup(), so on any failure the worktree directory and branch stay on disk until someone removes them manually.

The exception handlers are:

    except PreflightRejection as e:
        console.print(...)
        raise typer.Abort() from e   # ← sandbox_obj.cleanup() NEVER called
    except InvalidPhaseTransitionError as e:
        ...
    except PlanNotReadyError as e:
        ...
    except ValueError as e:
        ...
    except CleverAgentsError as e:
        ...
    except Exception as e:
        ...

Additionally, typer.Abort() raised on the fail-fast paths (lines 2388–2390) also bypasses cleanup because the sandbox_obj was already created before the guard checks.
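
The mechanism can be reproduced in isolation. The sketch below uses stand-ins only: `FakeSandbox` and `Abort` are hypothetical substitutes for `GitWorktreeSandbox` and `typer.Abort`, so it runs without the project or typer installed. It shows that a handler which re-raises leaves the sandbox untouched, while the same handler wrapped in `try/finally` does not.

```python
class FakeSandbox:
    """Hypothetical stand-in for GitWorktreeSandbox; records cleanup calls."""

    def __init__(self):
        self.cleaned = False

    def cleanup(self):
        self.cleaned = True


class Abort(Exception):
    """Stand-in for typer.Abort so the sketch needs no third-party import."""


def run_without_finally(sandbox):
    # Mirrors the handlers described above: translate the error and re-raise.
    try:
        raise ValueError("simulated execute-phase failure")
    except ValueError as e:
        raise Abort() from e  # control leaves here; cleanup is never reached


def run_with_finally(sandbox):
    try:
        try:
            raise ValueError("simulated execute-phase failure")
        except ValueError as e:
            raise Abort() from e
    finally:
        # Runs even though the handler above re-raised.
        sandbox.cleanup()


def cleaned_after(fn):
    """Run fn with a fresh sandbox and report whether cleanup happened."""
    sandbox = FakeSandbox()
    try:
        fn(sandbox)
    except Abort:
        pass
    return sandbox.cleaned


print("without finally:", cleaned_after(run_without_finally))
print("with finally:", cleaned_after(run_with_finally))
```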

Files Involved

| File | Role |
|------|------|
| src/cleveragents/cli/commands/plan.py | Creates sandbox_obj, forgets to clean it up on failure |
| src/cleveragents/infrastructure/sandbox/git_worktree.py | GitWorktreeSandbox.cleanup() removes worktree + branch |

Data Flow Where It Breaks

execute_plan()
  ↓
sandbox_root, sandbox_obj = _create_sandbox_for_plan(...)   ← worktree created
  ↓
executor.run_execute(plan_id)   ← raises PlanError / ValueError / etc.
  ↓
except CleverAgentsError:
    raise typer.Abort()   ← NO sandbox_obj.cleanup() called
  ↓
Git worktree at /tmp/ca-sandbox-<id>/ + branch cleveragents/plan-<plan_id>
left on disk forever

Expected Behavior

sandbox_obj.cleanup() must be called in all failure paths (and in a finally block to guarantee it). The GitWorktreeSandbox.cleanup() method is idempotent, so calling it multiple times is safe.
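
For reference, this is what "idempotent" means for a cleanup method. The sketch is an assumption-laden illustration, not the actual `GitWorktreeSandbox` implementation: `DirSandbox` is a hypothetical class that manages a plain temporary directory instead of a git worktree.

```python
import os
import shutil
import tempfile


class DirSandbox:
    """Hypothetical sandbox over a temp directory (not a real git worktree)."""

    def __init__(self):
        self.path = tempfile.mkdtemp(prefix="ca-sandbox-")
        self._cleaned = False

    def cleanup(self):
        # A second call is a no-op, so callers may invoke cleanup() from
        # both an error handler and a finally block without special-casing.
        if self._cleaned:
            return
        shutil.rmtree(self.path, ignore_errors=True)
        self._cleaned = True


sandbox = DirSandbox()
existed_before = os.path.isdir(sandbox.path)
sandbox.cleanup()
sandbox.cleanup()  # safe: already cleaned
exists_after = os.path.exists(sandbox.path)
```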

Actual Behavior

Any exception during the strategize or execute phase, or any early typer.Abort() raised after line 2372, leaves the git worktree directory and branch present on disk. On repeated failed executions, these accumulate without bound.

Suggested Fix

Wrap the entire execute_plan body in a try/finally that calls sandbox_obj.cleanup():

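import contextlib  # required for contextlib.suppress below

sandbox_root, sandbox_obj = _create_sandbox_for_plan(plan_id, service)
try:
    executor = _get_plan_executor(...)
    ...  # all existing logic
finally:
    # Runs on every exit path, including typer.Abort() from the guard checks.
    if sandbox_obj is not None:
        # Best effort: a cleanup error must not mask the original failure.
        with contextlib.suppress(Exception):
            sandbox_obj.cleanup()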
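
One design point in the fix is the `contextlib.suppress(Exception)` around the cleanup call. A rough sketch of why it matters, using a hypothetical `FlakySandbox` whose cleanup always fails: without the suppression, the `OSError` from cleanup would replace the original error seen by the caller; with it, the original error still propagates.

```python
import contextlib


class FlakySandbox:
    """Hypothetical sandbox whose cleanup always raises, for illustration."""

    def __init__(self):
        self.cleanup_calls = 0

    def cleanup(self):
        self.cleanup_calls += 1
        raise OSError("simulated cleanup failure")


def execute(sandbox):
    try:
        raise RuntimeError("simulated plan failure")
    finally:
        # Swallow cleanup errors so the RuntimeError above is what surfaces.
        with contextlib.suppress(Exception):
            sandbox.cleanup()


sandbox = FlakySandbox()
caught = None
try:
    execute(sandbox)
except Exception as exc:
    caught = exc

print(type(caught).__name__)
```

The trade-off is that a failed cleanup becomes silent. Logging inside the `finally` block before suppressing would be one way to keep it visible, if that fits the project's conventions.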

Category

resource / cross-module

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.
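
The expected-fail idea can be sketched with the standard library alone. This is not the project's test setup: the tags above suggest a different test framework, and `TrackingSandbox` and `buggy_execute` are hypothetical stand-ins that model the current behavior. The test asserts the desired behavior and is marked as an expected failure, so it documents the bug today and will flag an unexpected success once the fix lands.

```python
import io
import unittest


class TrackingSandbox:
    """Hypothetical stand-in that records whether cleanup() was called."""

    def __init__(self):
        self.cleaned = False

    def cleanup(self):
        self.cleaned = True


def buggy_execute(sandbox):
    """Models the reported behavior: the failure path skips cleanup."""
    try:
        raise ValueError("simulated execute-phase failure")
    except ValueError:
        return 1  # exits without calling sandbox.cleanup()


class TestWorktreeCleanup(unittest.TestCase):
    @unittest.expectedFailure
    def test_cleanup_runs_on_failure(self):
        sandbox = TrackingSandbox()
        buggy_execute(sandbox)
        # Desired behavior; fails until the bug is fixed.
        self.assertTrue(sandbox.cleaned)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestWorktreeCleanup)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("expected failures:", len(result.expectedFailures))
```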


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-09 22:47:15 +00:00
Reference
cleveragents/cleveragents-core#6632