feat(plan): create per-project sandboxes for multi-project plans #10828
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!10828
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feature/multi-project-sandbox"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Per spec §19310-19312, each resource gets its own sandbox and Apply commits each sandbox separately. Previously, multi-project plans only created a worktree for the first project's resource — changes for other projects were lost.
How it works
_create_sandbox_for_plancreates a worktree for each git-checkout resource (not just the first)_route_sandbox_files_to_worktreesmoves files to the correct worktree by matching against each resource'sgit ls-filesoutput_apply_sandbox_changesmerges each worktree separately with per-resource Apply Summary panelsSingle-resource plans are fully backward compatible.
Testing
m1-plan-lifecycle-okCloses #7270
6ffb251aa92e25cb985e2e25cb985ef2916267eef2916267ee09788e189d09788e189d9389287eca@HAL9000 rebase this PR
Hi @hamza.khyari, thanks for this PR. Overall the implementation follows the spec, but I found a critical issue with the partial apply logic in _apply_sandbox_changes:
I’ve also noticed the spec reference in the docstring (§13241-13276) doesn’t match the new multi-project sandbox sections (§19310-19313) – please update to the correct spec sections.
Suggestions:
lrtolinked_resourcefor readability.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
PR Review: !10828 (Ticket #7270)
Verdict: ❌ Request Changes
Two critical issues and four major issues must be resolved before this PR can merge. The core sandbox architecture is sound, but there is a data-loss bug in the file routing algorithm, a serious commit hygiene violation that hides unrelated deletions, and several resource-leak paths.
Critical Issues
C1 — File routing silently loses primary-project changes when files share the same relative path
File:
src/cleveragents/cli/commands/plan.py, lines 1836–1867Problem:
_route_sandbox_files_to_worktreeschecks whether each file in the primary sandbox exists in any non-primary resource'sgit ls-filesoutput. If it does, the file is moved (deleted from primary) to the secondary worktree — without first checking whether the file also belongs to the primary project. In practice, many files share the same relative path across projects (README.md,setup.py,pyproject.toml,src/__init__.py,.gitignore, etc.). When the LLM modifies such a file for the primary project, the routing algorithm will silently move it to the secondary project's worktree, losing the primary project's changes entirely.Example: Project Alpha (primary) and Project Beta both have
src/utils.py. LLM modifiessrc/utils.pyfor Alpha. Routing sees it in Beta's file list → moves it to Beta → Alpha's changes are gone.Recommendation: Build the primary resource's file list as well. Only move a file to a secondary worktree if it matches that secondary resource's file list and does not exist in the primary resource's file list:
C2 — Atomic commit violation: unrelated changes bundled into the feature commit
File: Entire commit on
feature/multi-project-sandboxProblem: The single commit bundles at least 6 unrelated change sets alongside the multi-project sandbox feature, violating CONTRIBUTING.md §85–98 ("One logical change per commit"). The unrelated changes include:
autonomy_guardrail_atomic_load.feature(108 lines) + step file (383 lines)context_analysis_engine.py(328 lines),context analyzeCLI command, feature + step files (~787 lines)lsp/runtime.py+ feature/step files (~393 lines)tui/widgets/prompt.py,tui/app.py, deletes feature/step files (~239 lines)ca-uat-tester.mdNet effect: ~2,500 lines of unrelated code deleted (including a security feature) hidden inside a feature commit. The commit is not cleanly revertible and breaks
git bisect.Recommendation: Remove all unrelated changes from this branch. The commit should contain only: changes to
plan.py, the newfeatures/multi_project_sandbox.feature,features/steps/multi_project_sandbox_steps.py, and any directly related test/config updates. Each unrelated change set must go through its own issue, commit, and PR.Major Issues
M1 —
_create_sandbox_for_plancallscleanup_stalein a loop, destroying previously-created sandboxes for shared reposFile:
src/cleveragents/cli/commands/plan.py, lines 1491–1499Problem: For each resource,
cleanup_stale(resource.location, plan_id)is called beforesandbox.create(plan_id). All sandboxes use the same branch namecleveragents/plan-{plan_id}. If two projects link to the same git repository, the second iteration'scleanup_stalewill delete the branch and worktree just created by the first iteration. The first_SandboxInfoentry then points to a destroyed worktree.Recommendation: Track which
(resource.location, plan_id)pairs have already been processed and skipcleanup_stalefor duplicates, or use resource-specific branch names (e.g.,cleveragents/plan-{plan_id}-{resource_id}).M2 —
_cleanup_sandbox_for_planearly-returns after the first resource, leaking all subsequent sandboxesFile:
src/cleveragents/cli/commands/plan.py, lines 1423–1424Problem: The function has
returnimmediately after the first successfulcleanup_stalecall. With multi-project sandboxes, only the first resource's stale sandbox is ever cleaned up; all others are leaked.Recommendation: Replace
returnwithcontinueso the loop cleans up all resources:M3 — No cleanup of already-created sandboxes when creation fails partway through
File:
src/cleveragents/cli/commands/plan.py, lines 1495–1507Problem:
sandbox.create(plan_id)is not wrapped in a try/except. If creation succeeds for the first N−1 resources but raises for the Nth, the exception propagates out of_create_sandbox_for_plan. The caller has no reference to the partially-created sandboxes, so their git worktrees and branches are never cleaned up.Recommendation: Wrap the creation call in a try/except that cleans up all previously-created sandboxes on failure before re-raising:
M4 — No cleanup of sandboxes when
execute_planfails after creationFile:
src/cleveragents/cli/commands/plan.py, lines 2637–2770Problem: After
_create_sandbox_for_plansucceeds, operations likeexecutor.run_execute(),_route_sandbox_files_to_worktrees(), and_commit_worktree_changes()can fail. The broad exception handlers at lines 2750–2770 print an error and raisetyper.Abort()but never callsandbox_obj.cleanup()on the created sandboxes, leaving git worktrees and branches behind on every failed execution.Recommendation: Add a
finallyblock to clean up sandboxes. SinceGitWorktreeSandbox.cleanup()is already idempotent (returns immediately if alreadyCLEANED_UP), it is safe to call unconditionally:Summary
The multi-project sandbox architecture is well-conceived and the spec compliance is solid (§19310–§19313 all satisfied). The single-project path is fully backward compatible. However, the PR cannot merge as-is for two reasons:
The four major issues (M1–M4) are resource-leak problems on failure paths that should be fixed but are less urgent than C1 and C2.
9389287eca0653048648All review findings addressed:
git ls-files, skip files that exist in primary before moving to secondarycleanup_stalein loop destroys previously-created sandboxesprocessed_reposset, skip duplicate repo paths. Only cleanup stale for first resource._cleanup_sandbox_for_planearly-returns after first resourcereturntocontinue— cleans up all resourcessandbox.create()in try/except, cleans up all previously-created sandboxes before re-raisingfinallyblock with idempotentsandbox_obj.cleanup()for all sandbox_infos.sandbox_infosinitialized beforetryto avoid unbound variable.lrtolinked_resourcecontinuenotreturn— merge failures are per-resource, loop continuesLint passes, typecheck 0 errors, M1 E2E passes, 6 Behave scenarios pass. Ready for re-review.
PR Review: !10828 (Ticket #7270)
Verdict: ❌ Request Changes
Three blocking issues remain after the author's latest push. All previously-reported critical and major issues (C1, C2, M1–M4) were addressed, but the fixes introduced two new major bugs and the test suite has critical coverage gaps that leave the most important fix unprotected against regression.
Critical Issues
None — no spec violations or data-loss bugs in the happy path.
Major Issues
M-NEW-1 —
cleanup_staleskipped for 2nd+ distinct repos, breaking plan re-executionFile:
src/cleveragents/cli/commands/plan.py(in_create_sandbox_for_plan)Problem: The M1 fix used an
if not sandboxes:guard to callcleanup_staleonly for the very first resource. This prevents the original M1 bug (destroying a sandbox just created for the same repo), but it also preventscleanup_stalefrom running on the 2nd, 3rd, etc. resources that are in different repos. If a previous execution of the same plan left stale branches in those repos,sandbox.create()will fail because the branch already exists — and the entire multi-project plan becomes un-re-runnable.Recommendation: Call
cleanup_staleunconditionally for every distinct repo. Theprocessed_reposdedup already handles the same-repo case:M-NEW-2 — Silent data loss when primary
git ls-filesfails (C1 fix bypass)File:
src/cleveragents/cli/commands/plan.py, lines 1867–1868 (in_route_sandbox_files_to_worktrees)Problem: The C1 fix works by building a
primary_filesset and skipping any file that belongs to the primary project. However, ifgit ls-filesfails for the primary resource (git lock, permission error, timeout, disk I/O), theexcept Exceptionhandler setsprimary_files = set(). An empty set means the guardif rel_path in primary_files: continuenever triggers — every file matching a secondary resource's list gets moved away from the primary sandbox, silently destroying all primary-project changes. This is the exact data-loss scenario C1 was designed to prevent, now reachable via any transient git error.Recommendation: Fail safe — if the primary file list cannot be built, abort routing entirely rather than proceed with an empty set:
M-TEST-1 — C1 fix has zero test coverage for its target scenario (shared relative paths)
File:
features/multi_project_sandbox.feature, Scenario 3Problem: The routing scenario uses
src/app.py(alpha) andsrc/api.py(beta) — files unique to each project. The C1 fix was specifically designed to protect files that share the same relative path across projects (e.g.,README.md,setup.py). The existing test would pass even if theif rel_path in primary_files: continueguard were deleted entirely. Combined with M-NEW-2 (the bypass path), the C1 fix has no regression protection.Recommendation: Add a scenario where both projects contain a file at the same relative path (e.g.,
README.md), verify that after routing the file remains in the primary sandbox and is not moved to the secondary.M-TEST-2 — Per-project Apply Summary panels not asserted (ticket DoD item)
File:
features/multi_project_sandbox.feature, Scenario 5Problem: The ticket's Definition of Done explicitly requires "Apply merges all sandboxes, showing per-project summaries." The test captures console output but never asserts on it. The Apply Summary panels could be completely absent and the test would still pass.
Recommendation: Add
Thensteps asserting the console output contains an Apply Summary panel for each project name.M-TEST-3 — Partial apply scenario cannot distinguish "continued" from "skipped"
File:
features/multi_project_sandbox.feature, Scenario 6Problem: The scenario asserts alpha was merged and the return value is
False, but does not verify beta's state. If the implementation silently skipped beta entirely, the test would still pass. The test cannot prove the function actually attempted beta's merge.Recommendation: Add an assertion that beta's content is unchanged (merge was attempted and failed, not skipped).
Summary
The author resolved all 6 issues from the previous review round (C1, C2, M1–M4) and the commit is now clean and properly scoped. The core multi-project sandbox architecture is sound and spec-compliant (§19310–§19313 all satisfied). Single-project backward compatibility is maintained.
However, five blocking issues remain:
cleanup_staleis now only called for the first resource, leaving stale branches in all other repos and breaking plan re-execution for multi-project plans.git ls-filesfailure on the primary resource degrades back to the original data-loss behavior.Automated by CleverAgents Bot
Reviewer: Rui Hu | Agent: rui-review-pr
06530486487657574f56All 5 findings from the second review addressed:
cleanup_staleskipped for 2nd+ distinct reposif not sandboxesguard —cleanup_stalenow runs for every distinct repo.processed_reposdedup handles the same-repo case.primary_fileson git failure → data loss bypassprimary_files = set()toreturn— routing aborts entirely if primary file list can't be built.README.md, routing preserves primary's modified content, beta'sREADME.mdis unchanged.Then the console output should contain "Apply Summary"assertion.Then beta should have the original content— verifies merge was attempted and failed, not silently skipped.7 Behave scenarios all pass. Lint + typecheck clean. M1 E2E passes. Ready for re-review.
7657574f563becf1690bThe changes implement per-project sandboxes for multi-project plans in accordance with spec §19310–19313. All acceptance criteria are satisfied, including creation of separate worktrees, file routing, independent commits, and partial apply behavior. Tests (M1 E2E and Behave scenarios) cover single- and multi-project cases. Lint and typecheck are clean, and CI passes. Documentation and docstrings updated appropriately.
No blocking issues found. Approved.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
3becf1690b286fae8e6f286fae8e6f572fb3a392572fb3a392bb051db1e6bb051db1e6e48204c74ae48204c74af0923e08baImplementation Attempt — Tier 3: sonnet — Success
Root Cause of Retry Loop
The previous session was stuck in a retry loop because the branch
feature/multi-project-sandboxhad been deleted from the remote after the last push. Every attempt to clone the branch failed with "branch does not exist", causing the session to retry indefinitely.What Was Done
feature/multi-project-sandboxno longer existed on the remote, while the commitf0923e08ba61b5f352bd3cc87f9d12e1881e95fewas still present in the repository.POST /api/v1/repos/.../brancheswithold_ref) to recreatefeature/multi-project-sandboxpointing at the existing commit SHA.No Code Changes Required
The code on the branch already addresses all reviewer findings from both review rounds (C1, C2, M1–M4, M-NEW-1, M-NEW-2, M-TEST-1 through M-TEST-3). The branch simply needed to be recreated so the PR is no longer in a broken state.
Automated by CleverAgents Bot
Supervisor: Implementation | Agent: implementation-worker