UAT: Apply phase never calls SandboxManager.commit_all() — sandbox changes are never merged into real resources #4879

Open
opened 2026-04-08 20:12:36 +00:00 by HAL9000 · 0 comments
Owner

Bug Report

Feature Area: Sandbox and Checkpoint — Apply phase sandbox merge
Severity: Critical — the core spec contract of the Apply phase is broken; no sandbox changes ever reach real resources


What Was Tested

Code-level analysis of the full Apply phase execution path:

  • src/cleveragents/cli/commands/plan.py: plan apply CLI command
  • src/cleveragents/application/services/plan_apply_service.py: PlanApplyService.apply_with_validation_gate()
  • src/cleveragents/application/services/plan_lifecycle_service.py: complete_apply(), _complete_apply_if_queued()
  • src/cleveragents/infrastructure/sandbox/manager.py: SandboxManager.commit_all()

Expected Behavior (from spec)

The Apply phase is defined as merging the sandbox changeset into real resources. The spec states:

Apply phase: Merges the sandbox changeset into the real resource targets. SandboxManager.commit_all(plan_id) must be called to atomically commit all sandboxed changes back to the original resource paths.

The SandboxManager.commit_all() method exists and implements exactly this: it iterates all active sandboxes for a plan and calls sandbox.commit() on each, with atomic rollback if any commit fails.
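The commit-then-roll-back behavior described above can be sketched as follows. This is a minimal model, not the code in manager.py: FakeSandbox, SketchSandboxManager, the rollback() method, and the integer return value are all assumptions made for illustration. Only commit_all(plan_id), sandbox.commit(), and AtomicCommitError are named in this report.

```python
from dataclasses import dataclass, field


class AtomicCommitError(Exception):
    """Raised when any sandbox commit fails and the batch has been rolled back."""


@dataclass
class FakeSandbox:
    # Hypothetical stand-in for a real sandbox; it only models commit and rollback.
    name: str
    fail_on_commit: bool = False
    committed: bool = False

    def commit(self) -> None:
        if self.fail_on_commit:
            raise RuntimeError(f"commit failed for {self.name}")
        self.committed = True

    def rollback(self) -> None:
        self.committed = False


@dataclass
class SketchSandboxManager:
    # Maps plan_id to its active sandboxes; the real storage layout is assumed.
    sandboxes: dict[str, list[FakeSandbox]] = field(default_factory=dict)

    def commit_all(self, plan_id: str) -> int:
        done: list[FakeSandbox] = []
        for sandbox in self.sandboxes.get(plan_id, []):
            try:
                sandbox.commit()
            except Exception as exc:
                # Undo commits already made, newest first, so nothing stays half-applied.
                for earlier in reversed(done):
                    earlier.rollback()
                raise AtomicCommitError(str(exc)) from exc
            done.append(sandbox)
        return len(done)
```

In this model, a failure in the second sandbox leaves the first one uncommitted again, which is the all-or-nothing guarantee the spec text relies on.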

Actual Behavior (from code)

The Apply phase never calls SandboxManager.commit_all(). The entire Apply phase is a pure metadata transition:

  1. plan apply CLI → calls service.apply_plan(plan_id) then service._complete_apply_if_queued(plan_id)
  2. _complete_apply_if_queued() → calls start_apply() then complete_apply()
  3. complete_apply() → sets processing_state = ProcessingState.APPLIED, emits PLAN_APPLIED event, done

No sandbox commit occurs anywhere in this path. The SandboxManager.commit_all() method is only referenced in:

  • manager.py itself (the implementation)
  • protocol.py (the docstring example)
  • _fs_utils.py (a comment)

It is never called from any application service, CLI command, or workflow.
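The call chain in this section can be modeled in a few lines to show why nothing reaches disk. The sketch below is a simplified model, not the real PlanLifecycleService: LifecycleModel and RecordingSandboxManager are hypothetical, and the QUEUED and APPLYING states are assumed (only ProcessingState.APPLIED and the PLAN_APPLIED event appear in this report).

```python
from enum import Enum


class ProcessingState(Enum):
    # Only APPLIED is named in the report; the other members are assumed.
    QUEUED = "queued"
    APPLYING = "applying"
    APPLIED = "applied"


class RecordingSandboxManager:
    """Hypothetical test double that counts commit_all() calls."""

    def __init__(self) -> None:
        self.commit_calls = 0

    def commit_all(self, plan_id: str) -> None:
        self.commit_calls += 1


class LifecycleModel:
    """Simplified model of the reported path, not the real service."""

    def __init__(self, sandbox_manager: RecordingSandboxManager) -> None:
        self.sandbox_manager = sandbox_manager  # available, but never used below
        self.state = ProcessingState.QUEUED
        self.events: list[str] = []

    def start_apply(self, plan_id: str) -> None:
        self.state = ProcessingState.APPLYING

    def complete_apply(self, plan_id: str) -> None:
        # Metadata only: a state flip plus an event, with no sandbox merge.
        self.state = ProcessingState.APPLIED
        self.events.append("PLAN_APPLIED")

    def _complete_apply_if_queued(self, plan_id: str) -> None:
        self.start_apply(plan_id)
        self.complete_apply(plan_id)


manager = RecordingSandboxManager()
model = LifecycleModel(manager)
model._complete_apply_if_queued("plan-123")
```

After the call, the model reports APPLIED and has emitted the event, yet commit_calls is still zero. A regression test against the real services could assert the opposite using a similar recording double.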

Impact

Any plan that uses a sandbox strategy (git_worktree, copy_on_write, overlay, etc.) will:

  1. Execute successfully in the sandbox (changes written to isolated copy)
  2. Show a correct diff via plan diff
  3. "Apply" successfully (state transitions to APPLIED)
  4. But the actual files on disk are never changed — the sandbox is silently abandoned

This means the entire Execute → Apply flow produces no real-world effect for sandboxed resources.

Code Location

  • SandboxManager.commit_all(): src/cleveragents/infrastructure/sandbox/manager.py:208
  • PlanApplyService.apply_with_validation_gate(): src/cleveragents/application/services/plan_apply_service.py:482
  • PlanLifecycleService.complete_apply(): src/cleveragents/application/services/plan_lifecycle_service.py:1752
  • _complete_apply_if_queued(): src/cleveragents/application/services/plan_lifecycle_service.py:2179

Steps to Reproduce

  1. Register a git-checkout resource with git_worktree sandbox strategy
  2. Create and execute a plan that writes files via tools
  3. Run agents plan apply <PLAN_ID>
  4. Observe: the plan state transitions to APPLIED, but no files have changed in the original repository
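One way to make the observation in step 4 objective is to hash the original repository before and after the apply and compare the results. The helper below is a sketch using only the Python standard library. The function names snapshot and changed_paths are hypothetical, and skipping the .git directory is an assumption made so that worktree bookkeeping does not count as a change.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its SHA-256 digest."""
    digests: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Assumption: .git internals are ignored so only working-tree files count.
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_paths(before: dict[str, str], after: dict[str, str]) -> set[str]:
    """Return paths that were added, removed, or modified between two snapshots."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}


# Small demonstration against a temporary directory rather than a real repository.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "a.txt").write_text("old")
    (repo / "b.txt").write_text("same")
    before = snapshot(repo)
    (repo / "a.txt").write_text("new")  # stands in for a working apply
    demo_changed = changed_paths(before, snapshot(repo))
```

With the bug described here, taking a snapshot before step 3 and another after it would be expected to yield an empty set from changed_paths, even though the plan reports APPLIED.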

Fix Direction

PlanApplyService.apply_with_validation_gate() (or PlanLifecycleService.complete_apply()) must:

  1. Resolve the SandboxManager for the plan
  2. Call sandbox_manager.commit_all(plan_id) before transitioning to APPLIED
  3. On AtomicCommitError, call handle_merge_failure() to set the plan to ERRORED

The PlanApplyService already has a checkpoint_manager parameter — a sandbox_manager parameter should be added similarly.
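The three steps above might look roughly like the sketch below. It is not a patch against the real services: apply_with_commit, StubSandboxManager, and StubLifecycle are hypothetical, the handle_merge_failure(plan_id, error) signature is assumed, and the boolean return value is chosen only for illustration.

```python
class AtomicCommitError(Exception):
    """Stand-in for the error type named in the report."""


class StubSandboxManager:
    """Hypothetical double: commit_all() succeeds or raises on demand."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.committed: list[str] = []

    def commit_all(self, plan_id: str) -> None:
        if self.fail:
            raise AtomicCommitError(f"commit failed for {plan_id}")
        self.committed.append(plan_id)


class StubLifecycle:
    """Hypothetical double for the lifecycle service; tracks state per plan."""

    def __init__(self) -> None:
        self.states: dict[str, str] = {}

    def complete_apply(self, plan_id: str) -> None:
        self.states[plan_id] = "APPLIED"

    def handle_merge_failure(self, plan_id: str, error: Exception) -> None:
        # The real signature is not shown in the report; this one is assumed.
        self.states[plan_id] = "ERRORED"


def apply_with_commit(plan_id: str, sandbox_manager, lifecycle) -> bool:
    """Commit sandboxes first; only mark the plan APPLIED if the commit succeeded."""
    try:
        sandbox_manager.commit_all(plan_id)
    except AtomicCommitError as exc:
        lifecycle.handle_merge_failure(plan_id, exc)
        return False
    lifecycle.complete_apply(plan_id)
    return True
```

The ordering is the point of the sketch: the state transition happens strictly after a successful commit, so a plan cannot be reported as APPLIED while its sandbox changes are still unmerged.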


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.4.0 milestone 2026-04-08 20:19:00 +00:00
Reference
cleveragents/cleveragents-core#4879