fix(plan): wire sandbox_root into plan execute pipeline #1313

Open
opened 2026-04-02 11:45:03 +00:00 by brent.edwards · 2 comments
Member

Metadata

  • Commit Message: fix(plan): wire sandbox_root into plan execute pipeline
  • Branch: bugfix/m1-plan-execute-sandbox-root

Background

During development of #1249 (updating m1_acceptance.robot with output validation), it was discovered that plan execute does not write LLM-generated files to the sandbox directory. The LLM is invoked and successfully generates file content (e.g., HELLO.md), but the files are never written to disk.

Root Cause

_get_plan_executor() in src/cleveragents/cli/commands/plan.py (approx. line 1267) does not pass sandbox_root to PlanExecutor. This causes the following chain of failure:

  1. PlanExecutor.__init__() receives sandbox_root=None (default)
  2. PlanExecutor._run_execute_with_stub() passes sandbox_root=None to LLMExecuteActor.execute()
  3. LLMExecuteActor.execute() checks `if sandbox_root is not None and not read_only:` before calling _write_to_sandbox(); because sandbox_root is None, this guard evaluates to False
  4. _write_to_sandbox() is never called, so files generated by the LLM are silently discarded
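
The failure chain above can be reproduced in isolation. The following is a minimal sketch, not the actual cleveragents code: `execute` and `write_to_sandbox` are hypothetical stand-ins that model only the guard described in step 3, assuming the LLM output is a mapping of relative paths to file content.

```python
import tempfile
from pathlib import Path
from typing import Optional


def write_to_sandbox(sandbox_root: Path, files: dict[str, str]) -> list[Path]:
    """Hypothetical stand-in for _write_to_sandbox(): write each file under sandbox_root."""
    written = []
    for rel_path, content in files.items():
        target = sandbox_root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written


def execute(
    files: dict[str, str],
    sandbox_root: Optional[Path] = None,
    read_only: bool = False,
) -> list[Path]:
    """Hypothetical stand-in for LLMExecuteActor.execute(), reduced to the guard."""
    # Same condition as quoted in the issue: with sandbox_root=None the
    # writer is skipped and the generated content is dropped without error.
    if sandbox_root is not None and not read_only:
        return write_to_sandbox(sandbox_root, files)
    return []


if __name__ == "__main__":
    generated = {"HELLO.md": "# Hello\n"}
    print(execute(generated))  # no sandbox_root: nothing is written
    with tempfile.TemporaryDirectory() as tmp:
        print(execute(generated, sandbox_root=Path(tmp)))  # file is written
```

Because the default path returns normally rather than raising, the caller has no signal that output was lost, which matches the "silently discarded" behavior reported here.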

Key Code Locations

| File | Location | Role |
|------|----------|------|
| `src/cleveragents/cli/commands/plan.py` | `_get_plan_executor()` (~line 1267) | **Missing `sandbox_root` parameter** |
| `src/cleveragents/application/services/plan_executor.py` | `__init__()` (~line 293) | Receives `None` for `sandbox_root` |
| `src/cleveragents/application/services/plan_executor.py` | `_run_execute_with_stub()` (~line 726) | Passes `None` to actor |
| `src/cleveragents/application/services/llm_actors.py` | `execute()` (~line 354) | Guard skips `_write_to_sandbox()` |
| `src/cleveragents/application/services/llm_actors.py` | `_write_to_sandbox()` (~line 409) | The file writer that is never called |

Secondary Issues

  1. plan apply does not merge sandbox: _lifecycle_apply_with_id() only transitions plan state metadata. It never calls GitWorktreeSandbox.commit() to merge changes from a sandbox into the target repo. Even if files were written to a sandbox, they would not make it into the target repository.

  2. No GitWorktreeSandbox created in the execute pipeline: The sandbox infrastructure (infrastructure/sandbox/git_worktree.py) is fully implemented and tested in isolation, but is never instantiated or wired into the CLI plan execute flow.

Impact

This bug means the core M1 milestone acceptance criterion — "Execute/apply runs via actor-based LLM path producing tool invocations and a ChangeSet" — is not fully operational in the CLI. The LLM runs and generates output, but the generated files are silently lost.

Acceptance Criteria

  • _get_plan_executor() resolves the project's git-checkout resource path and creates or obtains a sandbox directory
  • sandbox_root is passed to PlanExecutor and propagated to LLMExecuteActor
  • LLMExecuteActor._write_to_sandbox() is called when the LLM generates file content
  • After plan execute, generated files exist in the sandbox directory
  • plan diff shows the generated file changes
  • plan apply merges sandbox changes into the target repository with a git commit
  • Post-apply, the generated files exist in the target repository
  • The tdd_expected_fail tag is removed from the M1 acceptance E2E test and the test passes
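
The first four criteria could be exercised with a sketch like the one below. It rests on several assumptions: `StubPlanExecutor` and `get_plan_executor` are hypothetical names that model only the `sandbox_root` hand-off, and a temporary directory stands in for the real sandbox (the issue indicates the project uses `GitWorktreeSandbox`, which is not modeled here).

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class StubPlanExecutor:
    """Hypothetical stand-in for PlanExecutor; models only sandbox_root."""

    sandbox_root: Optional[Path] = None

    def run_execute(self, files: dict[str, str]) -> list[Path]:
        # Without a sandbox_root nothing is written, as in the reported bug.
        if self.sandbox_root is None:
            return []
        written = []
        for rel_path, content in files.items():
            target = self.sandbox_root / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content, encoding="utf-8")
            written.append(target)
        return written


def get_plan_executor(project_path: Path) -> StubPlanExecutor:
    """Sketch of the fix: obtain a sandbox directory and pass it to the executor."""
    # Assumption: a temp directory named after the project stands in for
    # the sandbox that the real pipeline would create or look up.
    sandbox_root = Path(tempfile.mkdtemp(prefix=f"{project_path.name}-sandbox-"))
    return StubPlanExecutor(sandbox_root=sandbox_root)


if __name__ == "__main__":
    executor = get_plan_executor(Path("demo-project"))
    for path in executor.run_execute({"HELLO.md": "# Hello\n"}):
        print(path)
```

The `plan diff` and `plan apply` criteria depend on the worktree merge behavior and are outside what this sketch covers.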

Definition of Done

This issue is complete when:

  • All acceptance criteria above are satisfied.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
Owner

Triage: Verified

This is a well-documented, critical bug filed by @brent.edwards during development of #1249 (M1 acceptance E2E test hardening). The root cause analysis is thorough and the code path is clearly traced.

Assessment:

  • Priority: Critical — confirmed. This bug means the core plan execute pipeline silently discards LLM-generated files, which breaks the M1 acceptance criterion ("Execute/apply runs via actor-based LLM path producing tool invocations and a ChangeSet").
  • Milestone: Assigning to v3.7.0. While this is an M1-era bug, it was discovered during v3.7.0 development and blocks the full pipeline from being operational.
  • Parent Epic: This relates to the plan lifecycle pipeline. Linking to the execution pipeline work.
  • Completeness: The issue has all required sections per CONTRIBUTING.md — Metadata, Background, Root Cause, Acceptance Criteria, and Definition of Done. The acceptance criteria are detailed and serve as implicit subtasks.

Next step: This issue is now verified and ready for implementation. The branch name bugfix/m1-plan-execute-sandbox-root and commit message are pre-defined in the Metadata section.

freemo added this to the v3.7.0 milestone 2026-04-02 16:49:03 +00:00
freemo self-assigned this 2026-04-02 18:45:24 +00:00
Owner

Label compliance fix applied:

  • Added missing label: MoSCoW/Must have
  • Reason: This is a Priority/Critical bug in State/Verified for the v3.7.0 milestone. Wiring sandbox_root into the plan execute pipeline is a core M1 acceptance criterion — it qualifies as Must Have.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

freemo modified the milestone from v3.7.0 to v3.2.0 2026-04-02 21:58:02 +00:00
freemo removed this from the v3.2.0 milestone 2026-04-07 02:32:43 +00:00
brent.edwards added this to the v3.2.0 milestone 2026-04-08 18:13:36 +00:00
freemo removed their assignment 2026-04-09 14:52:50 +00:00
brent.edwards added the due date 2026-04-15 2026-04-13 19:21:15 +00:00
Reference
cleveragents/cleveragents-core#1313