fix(cleanup): invalidate sandbox_dirs_cache after purge (#7527) #11091
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
overdue
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!11091
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "fix-invalidate-sandbox-dirs-cache-after-purge-7527"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Adds
SandboxDirsCacheto track filesystem paths of sandbox-created directories (e.g., worktree, overlay) indexed by plan_id.The cache is purged in
SandboxManager.cleanup_all(),cleanup_abandoned()(when a plan becomes fully cleaned),clear_sandbox_dirs_cache(), and the at-exit handler_cleanup_on_exit_handler()-- matching the invalidation already performed byclear_boundary_cache().A new
record_sandbox_dir()method lets callers register a path at creation time, andpurge_sandbox_dirs()removes all paths for a plan in one atomic step. This prevents stale path references from returning after a sandbox's filesystem artifacts have been removed.Changes:
src/cleveragents/infrastructure/sandbox/dirs_cache.py--SandboxDirsCacheclasssrc/cleveragents/infrastructure/sandbox/__init__.py-- exports SandboxDirsCachesrc/cleveragents/infrastructure/sandbox/manager.py-- integrates dirs cache into sandbox lifecyclefeatures/sandbox_dirs_cache.feature-- 10 scenariosfeatures/steps/sandbox_dirs_cache_steps.pyCHANGELOG.mdandCONTRIBUTORS.mdCloses #7527
Code Review — PR #11091: fix(cleanup): invalidate sandbox_dirs_cache after purge (#7527)
Review Summary
This PR adds a
SandboxDirsCacheclass and wires it intoSandboxManager. The code quality for the new class is generally good — thread-safe, well-documented, and readable. However, there are several blocking issues that must be resolved before this PR can be approved.🔴 BLOCKING Issues
1. Fix does not address the reported bug (Wrong class/location)
Issue #7527 describes a bug in
CleanupService._get_sandbox_dirs()and_purge_sandboxes()atsrc/cleveragents/application/services/cleanup_service.py. The_purge_sandboxes()method deletes sandbox directories but does NOT invalidateself._sandbox_dirs_cacheafterward — meaning subsequent calls toscan()return stale paths that no longer exist.This PR fixes
SandboxManagerinsrc/cleveragents/infrastructure/sandbox/manager.py, which is a completely different class. The originalCleanupService._purge_sandboxes()is unchanged and the original bug still persists. The fix is in the wrong location.Required: Either (a) add the cache invalidation to
CleanupService._purge_sandboxes()as described in the issue, or (b) clarify whether the PR intentionally solves a different problem and update the issue accordingly.2. Commit footer missing
ISSUES CLOSED: #7527The commit
f9c5625abody is:No
ISSUES CLOSED:footer is present. Per CONTRIBUTING.md, every commit footer must includeISSUES CLOSED: #Nfor the issues it closes.Required: Add
ISSUES CLOSED: #7527to the commit footer.3. PR has no milestone and no
Type/labelThe PR has zero labels and no milestone assigned. Per CONTRIBUTING.md, every PR must have exactly one
Type/label (Type/Bug,Type/Feature, orType/Task) and must be assigned to the same milestone as the linked issue (v3.5.0 in this case).Required: Add
Type/Buglabel and assign milestonev3.5.0.4. CI failures:
lint,unit_tests, andbenchmark-regressionare failingThe CI status for commit
8dc37d66shows three required checks failing:CI / lint— Failing after 1m12sCI / unit_tests— Failing after 3m18sCI / benchmark-regression— Failing after 1m1sCI / status-check— Failing (aggregate gate)All required CI gates must pass before a PR can be merged. Per company policy, PRs with failing CI will not be approved.
Required: Fix the lint violations, unit test failures, and benchmark regression before requesting re-review.
5. Branch name does not follow naming convention
The PR branch is
fix-invalidate-sandbox-dirs-cache-after-purge-7527. Per CONTRIBUTING.md, bug fix branches must follow the formatbugfix/mN-<descriptive-name>whereNis the milestone number (milestone v3.5.0 →m6). The correct format would bebugfix/m6-sandbox-dirs-cache-invalidationor similar.Note: While this cannot be changed after PR submission without disruption, it should be corrected for future PRs.
🟡 Non-Blocking Issues (Suggestions)
6. BDD scenarios do not cover
ValueErrorvalidation pathsSandboxDirsCache.record()and.get()both raiseValueErrorfor emptyplan_idordir_path— but the feature file has no scenario testing these error paths. Good test coverage should include the error/failure cases.Suggestion: Add scenarios such as "Recording with empty plan_id raises ValueError" and "Getting with empty dir_path raises ValueError".
7. Step definitions access private
_dirsattribute directlySeveral
thensteps insandbox_dirs_cache_steps.pyaccesscontext._sdc._dirsdirectly (the private internal dict). This breaks encapsulation and ties test assertions to implementation details rather than the public API.Suggestion: Use the public
plan_countanddir_countproperties, or add aget_all_dirs(plan_id)accessor, rather than reaching into_dirsdirectly.8.
SandboxManageris 705 lines — exceeds the 500-line limitPer CONTRIBUTING.md, files should be under 500 lines and should be broken into focused modules when approaching the limit.
manager.pywas 655 lines on master and is now 705 lines after this PR adds 50 more lines.Suggestion: The sandbox dirs cache–related methods (
record_sandbox_dir,purge_sandbox_dirs,clear_sandbox_dirs_cache,sandbox_dirs_cache_size) could be extracted to a mixin or moved to a separateSandboxDirsManagerhelper class.9. No TDD companion test exists for this bug fix
Issue #7527 is
Type/Bug. Per CONTRIBUTING.md, every bug fix must have a companionType/TestingTDD issue with@tdd_expected_failtags that proves the bug exists. Issue #7527 notes "a corresponding Type/Testing issue will be created" but none was found. If the TDD workflow was not followed, a regression test with@tdd_issue_7527tag should be added.Suggestion: Verify whether a TDD companion issue was created and whether the BDD regression test uses the appropriate
@tdd_issue_7527tag.10.
CONTRIBUTORS.mdhas unintended whitespace formatting changeThe diff for
CONTRIBUTORS.mdshows that two existing lines lost their flush-left formatting and gained a leading space. This appears to be an unintentional reformatting side effect. The new entry is correct, but the previous two entries should not be indented.✅ What Works Well
SandboxDirsCacheclass design is clean, well-documented, and thread-safe (usesRLock).__contains__dunder support for tuple membership checks is a nice touch.SandboxManager(cleanup_all, cleanup_abandoned, _cleanup_on_exit_handler) are wired correctly.Required Before Re-Review
CleanupService._purge_sandboxes()cache invalidation bug (the actual reported issue)ISSUES CLOSED: #7527to the commit footerType/Buglabel and assign milestonev3.5.0to the PRAutomated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
@ -29,2 +32,4 @@* HAL 9000 has contributed the git_tools TOCTOU race condition fix (PR #8255 / issue #7619): eliminated the Time-Of-Check-To-Time-Of-Use race in `_get_base_env()` by adding double-checked locking with a module-level `threading.Lock`, preventing concurrent threads from writing conflicting environment snapshots.* HAL 9000 has contributed the mandatory PR compliance checklist to `implementation-supervisor.md` (#9824): added an 8-item checklist to the worker prompt body with concrete items covering CHANGELOG.md, CONTRIBUTORS.md, commit footer, CI verification, BDD tests, Epic reference, labels, and milestone assignment to eliminate systemic PR merge blockers.* HAL 9000 has contributed the PlanResult.success derivation fix (PR #8214 / issue #7501): replaced the incorrect `error_message is None` heuristic with a dedicated `result_success` column in the plans table, ensuring plans with historical build errors are not incorrectly marked as failed after a successful apply.* HAL 9000 has contributed the mandatory PR compliance checklist to `implementation-pool-supervisor.md` (#9824): created a new agent definition with an embedded 8-item checklist ensuring workers always update CHANGELOG.md, CONTRIBUTORS.md, include commit footers (`ISSUES CLOSED: #N`), verify CI passes, add BDD tests, reference the parent Epic, apply labels via forgejo-label-manager, and assign milestones before creating PRs. Includes concrete examples for each subsection and compliance verification pseudocode.Suggestion — Unintentional whitespace formatting change
The diff shows that the two existing entries before the new one had leading spaces added (changed from flush-left to indented). This appears to be an unintentional reformatting side effect — the new entry is correct but the two preceding lines should not be indented.
Please revert the whitespace change on those two existing lines so only the new entry is added.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Suggestion — Missing error/validation path scenarios
SandboxDirsCache.record()and.get()both raiseValueErrorwhenplan_idordir_pathis empty, but no scenario exercises these error paths. Good BDD coverage includes both happy paths and failure/error scenarios.Consider adding:
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Suggestion — Avoid accessing private
_dirsattribute in testsThis assertion reaches into the private
_dirsdict directly (context._sdc._dirs.get(plan_id, set())). This breaks encapsulation and makes the test brittle — if the internal data structure changes, all these assertions break even if the public behaviour is unchanged.Consider using the public
plan_countanddir_countproperties instead, or exposing aget_all_dirs(plan_id) -> set[str]public method that tests can use without touching internals.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING — Original Bug Not Fixed
The issue #7527 describes the bug here in
_purge_sandboxes(): after removing stale sandbox directories,self._sandbox_dirs_cacheis NOT invalidated. A subsequent call toscan()returns the old (now-deleted) paths.This PR adds
SandboxDirsCachetoSandboxManagerbut does not touchCleanupService. The cache invalidation fix belongs here:Without this line, the reported bug still exists —
scan()called afterpurge()on the sameCleanupServiceinstance will continue to return stale paths.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Review Summary
This PR introduces
SandboxDirsCacheand integrates it intoSandboxManager— the implementation is clean, thread-safe, and well-documented. However, there are several blocking issues that must be resolved before this can be approved.Critical: Fix Does Not Address the Issue
Issue #7527 describes a bug in
CleanupService._get_sandbox_dirs()insrc/cleveragents/application/services/cleanup_service.py— specifically that_purge_sandboxes()deletes directories but never resetsself._sandbox_dirs_cache = None, causing stale entries to be returned on subsequentscan()calls.This PR instead introduces
SandboxDirsCachewired intoSandboxManager(infrastructure layer), which is a different cache in a different class in a different layer. Checking master confirms the original bug is still present:cleanup_service.py._purge_sandboxes()still does not invalidateself._sandbox_dirs_cacheafter deletion.Additionally, PRs #8257 and #10989 (same title, same author) are also open for this same issue. All three PRs are unmerged and targeting the same issue — this duplication needs to be resolved.
CI Failures (Blocking)
The following CI gates are failing:
CI / lint— failing after 1m27sCI / unit_tests— failing after 4m22sCI / status-check— failing (aggregate gate)CI / benchmark-regression— failingAdditionally,
CI / coveragehas status"success"with description"Has been skipped", meaning coverage was not measured for this commit. Per company policy, all CI gates must pass before a PR can be approved.Commit Footer Missing (Blocking)
The commit message does not contain the required
ISSUES CLOSED: #7527footer. Per CONTRIBUTING.md, every commit closing an issue must include this footer.PR Metadata Incomplete (Blocking)
Type/label applied. Exactly oneType/label is required (e.g.,Type/Bug).Non-Blocking Observations
CONTRIBUTORS.mdreferencesPR #10989in the new entry, but this is PR #11091. The wrong PR number will confuse history readers.SandboxDirsCacheclass itself is well-implemented: thread-safe via RLock, clean public API, good docstrings, properValueErrorguards.SandboxManagerintegration is logically sound —purge_sandbox_dirs()is correctly called incleanup_all(),cleanup_abandoned()(when a plan is fully cleaned),clear_sandbox_dirs_cache(), and_cleanup_on_exit_handler().What Must Be Fixed
self._sandbox_dirs_cache = Noneat the end of_purge_sandboxes()insrc/cleveragents/application/services/cleanup_service.py, as described in the issue and as already attempted by PRs #8257 and #10989.lintandunit_testsfailures before requesting re-review.ISSUES CLOSED: #7527in the commit message.v3.5.0milestone andType/Buglabel to this PR.PR #10989toPR #11091in the new entry.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Suggestion — Wrong PR number referenced.
The new entry reads:
But this is PR #11091, not #10989. Please update the PR number to
#11091.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
@ -0,0 +65,4 @@@then('the cache should contain one path for plan "{plan_id}"')def step_then_one_path_for_plan(context, plan_id):BLOCKER — Step definitions access private internals directly.
_dirsis a private implementation detail ofSandboxDirsCache. BDD steps should test behaviour through the public API only. Accessing_dirsdirectly couples the tests to the internal data structure — if the implementation changes (e.g., to a different container type), these steps break even though the behaviour is unchanged.Use the public
plan_count,dir_count, andget()methods to make assertions:This applies to all
@thensteps that read_dirsdirectly (lines ~68, ~74, ~88, ~94, ~140, ~148).Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKER — Root cause not fixed here.
Issue #7527 identifies the bug as being in this file — specifically that
_purge_sandboxes()deletes directories but never resetsself._sandbox_dirs_cache = None, so subsequent calls to_get_sandbox_dirs()return stale deleted paths.This PR does not touch this file at all. The fix described in the issue (and attempted in PRs #8257 and #10989) is a one-line change:
The
SandboxDirsCacheclass added ininfrastructure/sandbox/dirs_cache.pyis a well-designed addition, but it tracks a separate concern (infrastructure-layer sandbox directory paths by plan_id) and does not address the application-layer cache invalidation bug described in #7527.Please add the cache invalidation in
_purge_sandboxes()(or wherever the issue describes) AND, ifSandboxDirsCacheis intended as a supplementary improvement, keep it — but both must be present for the issue to be closed.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Suggestion —
_cleanup_on_exit_handlercallspurge_sandbox_dirsthencleanup_all, butcleanup_allalso callspurge_sandbox_dirsinternally.This results in
purge_sandbox_dirs(plan_id)being called twice for each plan — once explicitly here, and once insidecleanup_all(). The second call is a no-op (the entries are already purged), but it is redundant and slightly misleading. Consider removing the explicitself.purge_sandbox_dirs(plan_id)call from_cleanup_on_exit_handlersincecleanup_allalready handles it.This is non-blocking — the double-call is harmless — but worth cleaning up.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
🌱 Grooming: proceed — PR cleared for processing.
(check
no_duplicates, categoryno_duplicates)PR #11091 has identical title and closes the same issue (#7527) as PRs #8257, #11040, #10989, and #11010. All share the title "fix(cleanup): invalidate sandbox_dirs_cache after purge (#7527)". This constitutes clear duplication. PR #8257 (383 add/1 del/5 files) appears canonical with focused, reasonable scope. PR #11091 (4520 add/1081 del/52 files) is significantly larger, suggesting bundled or accumulated changes beyond the core fix.
📋 Estimate: tier 1.
Multi-file change: new SandboxDirsCache class, updated sandbox manager lifecycle integration, new BDD feature + step definitions. Two CI gates fail: lint (single unused import, trivially fixable) and unit_tests (31 errored scenarios — likely import/setup errors in new test files requiring investigation). Thread-safety considerations (threading import) and atomic purge semantics add modest reasoning complexity. Standard tier-1 cross-file engineering work.
- Remove unused typing.Any import from dirs_cache.py (ruff lint) - Collapse multiline expressions without trailing commas (ruff format) - Rename ambiguous behave step "the result should be {True,False}" to "the dir membership result should be {True,False}" — resolves conflict with existing step @then('the result should be {expected}') in cli_steps.py that caused all 31 feature workers to crash on load - Add self._sandbox_dirs_cache = None invalidation to CleanupService._purge_sandboxes() — the actual fix for issue #7527; without this, subsequent scan() calls return stale deleted paths - Fix CONTRIBUTORS.md PR reference from #10989 to #11091 ISSUES CLOSED: #7527(attempt #6, tier 2)
🔧 Implementer attempt —
resolved.Pushed 1 commit:
1ef9780.Files touched:
CONTRIBUTORS.md,features/sandbox_dirs_cache.feature,features/steps/sandbox_dirs_cache_steps.py,src/cleveragents/application/services/cleanup_service.py,src/cleveragents/infrastructure/sandbox/dirs_cache.py.(attempt #7, tier 1)
🔧 Implementer attempt —
blocked.Blockers:
4b6892cc11but dispatch base was3fce8431e9. The implementer pushed from inside the worktree (forbidden by the git contract) OR a third party pushed during the attempt. Re-dispatch will re-prefetch and pick up the new head.(attempt #8, tier 2)
🔧 Implementer attempt —
rebase-failed.Blockers:
4b6892cc11e6c20cad2f(attempt #10, tier 2)
🔧 Implementer attempt —
verified-clean.✅ Approved
Reviewed at commit
3f29361.Confidence: high.
Claimed by
merge_drive.py(pid 685761) until2026-06-12T17:38:11.145247+00:00.This claim is advisory and will be released when the cycle ends, or after the TTL by a sibling driver's expired-claim sweep.
3f29361ad43e34c5a5fcApproved by the controller reviewer stage (workflow 466).