feat(plan): implement git worktree sandbox for execute and merge-based apply #5998

2026-04-09T13:27:01Z

hamza.khyari commented

2026-04-09 13:27:01 +00:00

Summary

Implement spec-aligned git worktree sandbox for the plan execute/apply lifecycle (specification.md §13225-13276).

Execute phase: creates an isolated git worktree via GitWorktreeSandbox for the plan's linked git-checkout resource. LLM file output is written to the worktree and committed on branch cleveragents/plan-<plan_id> — no merge yet.
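The execute phase described above can be sketched as follows. This is a minimal illustration, not the PR's `GitWorktreeSandbox`: the helper names (`run_git`, `plan_branch`, `execute_in_worktree`), the injectable runner, and the commit message are assumptions made for the example. Only the branch naming scheme and the "commit, but do not merge" behaviour come from the description.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes an argv list and a working directory.
# Injecting it keeps the sketch testable without a real repository.
Runner = Callable[[Sequence[str], Path], None]


def run_git(argv: Sequence[str], cwd: Path) -> None:
    """Run a command in cwd and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(argv), cwd=cwd, check=True)


def plan_branch(plan_id: str) -> str:
    """Branch name for a plan's sandbox, as given in the PR description."""
    return f"cleveragents/plan-{plan_id}"


def execute_in_worktree(
    repo: Path,
    worktree: Path,
    plan_id: str,
    files: dict[str, str],
    run: Runner = run_git,
) -> str:
    """Create a worktree on a new plan branch, write files, and commit them.

    Hypothetical helper: nothing is merged into the project's branch here.
    """
    branch = plan_branch(plan_id)
    # `git worktree add -b <branch> <path>` creates the branch and checks it
    # out in a separate directory, leaving the main checkout untouched.
    run(["git", "worktree", "add", "-b", branch, str(worktree)], repo)
    for rel_path, content in files.items():
        target = worktree / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
    # Stage and commit inside the worktree only (commit message is assumed).
    run(["git", "add", "--all"], worktree)
    run(["git", "commit", "-m", f"Plan {plan_id}: execute output"], worktree)
    return branch
```

Because the commit lands on the plan branch inside the worktree, the project's own checkout stays unchanged until apply runs.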

Apply phase: merges the worktree branch into the project's current branch via git merge. Prints spec-aligned panels:

Apply Summary: Plan ID, artifacts count, insertions/deletions, project name, applied-at timestamp
Sandbox Cleanup: worktree removed, branch merged to main
Next Steps: review git diff, commit changes
Footer: ✓ OK Changes applied
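As a rough sketch of where the Apply Summary numbers could come from, the snippet below lists one plausible sequence of git commands for apply and parses the `git diff --shortstat` summary line into insertion and deletion counts. The function names, the `DiffStat` type, and the exact command sequence are assumptions for illustration; the PR only states that the branch is merged with `git merge` and the worktree is removed.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffStat:
    files_changed: int
    insertions: int
    deletions: int


def parse_shortstat(text: str) -> DiffStat:
    """Parse a `git diff --shortstat` summary line.

    Git omits the insertions or deletions clause when its count is zero,
    so every field defaults to 0.
    """
    def count(pattern: str) -> int:
        match = re.search(pattern, text)
        return int(match.group(1)) if match else 0

    return DiffStat(
        files_changed=count(r"(\d+) files? changed"),
        insertions=count(r"(\d+) insertions?\(\+\)"),
        deletions=count(r"(\d+) deletions?\(-\)"),
    )


def apply_commands(plan_id: str, worktree_path: str) -> list[list[str]]:
    """Return an assumed command sequence for the apply phase (hypothetical)."""
    branch = f"cleveragents/plan-{plan_id}"
    return [
        # Changes on the plan branch since it diverged from the current branch.
        ["git", "diff", "--shortstat", f"HEAD...{branch}"],
        # Merge the plan branch into whatever branch is checked out.
        ["git", "merge", "--no-edit", branch],
        # Sandbox cleanup: drop the worktree directory.
        ["git", "worktree", "remove", worktree_path],
    ]
```

Collecting the statistics before the merge keeps the numbers tied to the plan branch rather than to anything else that changes afterwards.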

Non-git projects fall back to the original flat directory sandbox with shutil.copy2.
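The fallback for non-git projects might look roughly like this. It is a sketch under assumptions: `is_git_checkout` and `apply_flat_sandbox` are hypothetical names, and the `.git` existence check is only one simple way to detect a git project; the PR itself states only that `shutil.copy2` is used.

```python
import shutil
from pathlib import Path


def is_git_checkout(project: Path) -> bool:
    """Cheap check (assumed): a `.git` entry exists in the project root.

    `.git` is a directory in a normal clone and a file in a linked worktree.
    """
    return (project / ".git").exists()


def apply_flat_sandbox(sandbox: Path, project: Path) -> int:
    """Copy every file under sandbox into project, preserving relative paths.

    Returns the number of files copied. shutil.copy2 also copies file
    metadata such as modification times where the platform allows it.
    """
    copied = 0
    for src in sorted(sandbox.rglob("*")):
        if not src.is_file():
            continue
        dest = project / src.relative_to(sandbox)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied += 1
    return copied
```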

Also fixes

Context assembly failure (#4454): ContextFragment metadata values (detail_depth, relevance_score) must be strings, but were being passed as int/float. The resulting Pydantic validation errors crashed the context assembler, leaving the LLM with zero file context.
Duplicate execute dispatch (#2265): A2A facade _handle_plan_execute now checks if the plan has already reached execute/apply phase before attempting a transition, eliminating the noisy "Invalid phase transition from execute to execute" error.
Assembly error logging: execute_context_assembly_failed warning now includes the actual error string.
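The first two fixes above can be illustrated with a small sketch. Both functions are hypothetical stand-ins, not the project's code: `stringify_metadata` shows the kind of coercion that keeps metadata values as strings, and `handle_plan_execute` shows an idempotent guard that skips the transition when the plan is already in the execute or apply phase. The `transition` callback is an assumed parameter.

```python
from typing import Callable, Mapping

# Phases in which a repeated execute request should be a no-op
# (phase names taken from the PR description).
ALREADY_EXECUTING = frozenset({"execute", "apply"})


def stringify_metadata(metadata: Mapping[str, object]) -> dict[str, str]:
    """Coerce every metadata value to str so a string-only schema accepts it."""
    return {key: str(value) for key, value in metadata.items()}


def handle_plan_execute(
    current_phase: str,
    transition: Callable[[str], None],
) -> bool:
    """Request the execute transition only if it has not happened yet.

    Returns True when a transition was requested, False when it was skipped.
    """
    if current_phase in ALREADY_EXECUTING:
        # A duplicate dispatch would otherwise attempt execute -> execute.
        return False
    transition("execute")
    return True
```

For example, `stringify_metadata({"detail_depth": 2, "relevance_score": 0.75})` yields `{"detail_depth": "2", "relevance_score": "0.75"}`.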

Changed files

src/cleveragents/cli/commands/plan.py — sandbox creation, worktree apply, spec panels, facade notify reorder
src/cleveragents/a2a/facade.py — idempotent _handle_plan_execute
src/cleveragents/application/services/context_tier_hydrator.py — metadata type fix
src/cleveragents/application/services/llm_actors.py — error string in assembly warning
features/git_worktree_apply.feature + step file — 6 Behave scenarios
CHANGELOG.md — updated
4 existing step files — added mocks for new sandbox functions

Testing

M1 E2E: m1-plan-lifecycle-ok
Scenario-1: full end-to-end with real LLM — calculator fixed, spec panels displayed, zero warnings
6 new Behave scenarios for git worktree apply lifecycle

Closes #4454
Closes #2265

HAL9000 commented

2026-04-09 13:53:35 +00:00

Thank you for this contribution, @hamza.khyari. I have reviewed PR #5998 and the linked issue #4454.

Summary of Changes

This PR wires sandbox_root into PlanExecutor via _get_plan_executor() so that the execute phase writes LLM-generated file output to .cleveragents/sandbox/. Without this wiring, PlanExecutor received sandbox_root=None, causing LLM code changes to be discarded after execute completed.
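The wiring described in this comment can be pictured with a minimal sketch. `SketchExecutor` and `get_executor` are hypothetical stand-ins for `PlanExecutor` and `_get_plan_executor()`, whose real signatures are not shown in this thread; only the `sandbox_root=None` behaviour and the `.cleveragents/sandbox/` location come from the comment.

```python
from pathlib import Path
from typing import Optional


class SketchExecutor:
    """Hypothetical stand-in for the executor, not the project's class."""

    def __init__(self, sandbox_root: Optional[Path] = None) -> None:
        self.sandbox_root = sandbox_root

    def persist(self, rel_path: str, content: str) -> Optional[Path]:
        """Write generated output under sandbox_root, or drop it if unset."""
        if self.sandbox_root is None:
            # Without a sandbox root there is nowhere to write, so the
            # generated change is lost once execute finishes.
            return None
        target = self.sandbox_root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        return target


def get_executor(project_root: Path) -> SketchExecutor:
    """Factory that always supplies a sandbox root (assumed layout)."""
    return SketchExecutor(sandbox_root=project_root / ".cleveragents" / "sandbox")
```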

Assessment

The change is targeted and addresses a clear gap: the sandbox wiring was missing, preventing the execute phase from persisting LLM output. The M1 E2E test confirmation (Applied 2 file(s) from sandbox to project.) demonstrates the fix works end-to-end.

Observations

The PR currently has no labels or milestone assigned. I will flag this for the project owner to assign appropriate labels.
The linked issue #4454 has a Needs Feedback label. Per project policy, issues with Needs Feedback are proposals awaiting human review. The PR closes this issue, which suggests the feedback has been addressed. Please confirm whether the Needs Feedback label on #4454 can be removed.
The PR description does not include quality gate results (lint, typecheck, unit_tests, integration_tests, coverage). Please confirm these pass before requesting review.

Next Steps

This PR will be reviewed by the implementation team once CI checks complete. I will monitor the CI status and provide updates.

Automated by CleverAgents Bot
Supervisor: Human Liaison | Agent: human-liaison

hamza.khyari force-pushed feature/git-worktree-apply from 9a20c91928 to 185bebe323

2026-04-09 14:04:22 +00:00

| Gate | Status |
| --- | --- |
| `lint` | Successful |
| `typecheck` | Successful |
| `quality` | Successful |
| `security` | Successful |
| `integration_tests` | Successful |
| `e2e_tests` | Successful |
| `build` | Successful |
| `helm` | Successful |
| `unit_tests` | Failing |
| `coverage` | Running |