fix(cleanup): invalidate sandbox_dirs_cache after purge (#7527) #11040
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
overdue
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!11040
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "fix/7527-sandbox-cache-invalidation"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes: #7527
Fixed a bug in CleanupService._purge_sandboxes() where _sandbox_dirs_cache was not cleared after stale sandbox directories were deleted via shutil.rmtree(). Subsequent calls to _get_sandbox_dirs() returned the cached list containing paths to already-removed filesystem entries.
The cache is now reset to None at the end of _purge_sandboxes() so that subsequent reads always re-scan the filesystem for an accurate directory list.
Closes #7527
faa3fe024855905bb22955905bb2298d6bb95a9aReview — PR #11040: fix(cleanup): invalidate sandbox_dirs_cache after purge (#7527)
Thank you for this fix — the core production change in
_purge_sandboxes()is correct and well-motivated. The one-line cache invalidation (self._sandbox_dirs_cache = None) precisely addresses the stale-cache bug described in issue #7527, and the updated docstring clearly explains the rationale.However, there are several blocking issues that must be resolved before this PR can be approved.
🔴 BLOCKING — CI is failing
Two CI jobs are currently failing:
CI / lint— Failing after 1m50sCI / benchmark-regression— Failing after 2m15sPer company policy, all CI gates must pass before a PR can be approved and merged. The lint failure is directly caused by a syntax error in the new test step file (see inline comment). The benchmark-regression failure may be pre-existing but must be verified.
🔴 BLOCKING — f-string syntax error causing lint failure
In
features/steps/cleanup_service_uncovered_lines_steps.pyline 441:The brace
{ i}contains a leading space beforei, which is a ruff-flagged syntax issue. It should be{i}without the space. This is almost certainly the cause of theCI / lintfailure.Fix: Change
{ i}to{i}→f"ca-sandbox-plan{i}-stub{i}".🔴 BLOCKING — Missing
@tdd_issue_7527regression test tagPer CONTRIBUTING.md, bug fixes must have a
@tdd_issue_Ntag on the regression test scenario to permanently mark it as a regression guard. Thesandbox_cache_invalidation.featurefile has no such tag on either scenario.The issue body also notes: "After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD with @tdd_expected_fail tags." — This implies a companion TDD issue was expected. No TDD companion issue for #7527 appears to exist. The TDD workflow requires:
Type/Testingissue with@tdd_expected_failscenarios (created BEFORE the fix).@tdd_expected_failis replaced with@tdd_issue_7527to mark the scenario as a regression guard.If the TDD companion issue was skipped, the
@tdd_issue_7527tag must at minimum be present on the regression scenario.Fix: Add
@tdd_issue_7527tag to at least the primary regression scenario insandbox_cache_invalidation.feature.🔴 BLOCKING — Missing trailing newline in feature file
The file
features/sandbox_cache_invalidation.featureis missing a trailing newline (the diff shows\ No newline at end of file). Ruff/editors require files to end with a newline. This will also contribute to lint failures.Fix: Add a trailing newline at the end of the file.
🔴 BLOCKING — Branch name does not follow convention
Per CONTRIBUTING.md, bug fix branches must follow
bugfix/mN-<name>format whereNis the milestone number. Issue #7527 is in milestone v3.5.0 (m5), so the branch should bebugfix/m5-sandbox-cache-invalidation(or similar). The current branchfix/7527-sandbox-cache-invalidationuses the wrong prefix (fix/instead ofbugfix/) and is missing the milestone number (m5-).Note: This cannot be changed retroactively without force-pushing. The project maintainer may waive this requirement, but it should be acknowledged.
🔴 BLOCKING — PR milestone does not match issue milestone
The PR is assigned to milestone v3.2.0, but the linked issue #7527 is in milestone v3.5.0. Per CONTRIBUTING.md, the PR must be assigned to the same milestone as the linked issue.
Fix: Update the PR milestone to v3.5.0.
🔴 BLOCKING — Missing Forgejo dependency link (PR must block issue)
Per CONTRIBUTING.md, the PR must be linked to block the issue it closes (
PR → blocks → issue). Currently the PR has no dependency links at all. Without this, the automatic closure of issue #7527 on merge may not work correctly in the Forgejo workflow.Fix: On this PR, add issue #7527 under "blocks" in the Forgejo sidebar.
⚠️ NON-BLOCKING — Second
@whenstep has unusedmock_tmpvariable and confusing commentsIn
step_cleanup_cov_purge_empty_cache(line 486-492), thewith patch(...) as mock_tmp:block patches thetempfilemodule butmock_tmpis never used, and a comment says "but we patched gettempdir directly above". Thepatchcontext manager does nothing useful here and adds confusion. Consider removing the deadwith patch(...)block — the test works fine with just thetempfile.gettempdir = fake_gettempdirmonkey-patch already set above.✅ What is correct
cleanup_service.pyis correct:self._sandbox_dirs_cache = Noneat the end of_purge_sandboxes()is exactly the right one-liner fix.ISSUES CLOSED: #7527is correct.# type: ignoreusage.Summary of required changes before approval:
{ i}→{i}in step file line 441sandbox_cache_invalidation.feature@tdd_issue_7527tag to the primary regression scenariowith patch(...)block in second@whenstepAutomated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
@ -0,0 +6,4 @@the filesystem. Without this fix, a stale cached list of directoriespersisted even after those directories were deleted by shutil.rmtree().Scenario: Cache is invalidated after purge completes removalsBLOCKING — Missing
@tdd_issue_7527regression tagPer CONTRIBUTING.md, bug fix regression tests must carry a
@tdd_issue_Ntag. This marks the scenario as a permanent regression guard and ensures the CI test suite permanently tracks the fix.The issue body itself states: "After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD with @tdd_expected_fail tags."
Fix: Add
@tdd_issue_7527tag to the primary regression scenario ("Cache is invalidated after purge completes removals").Example:
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING — Missing trailing newline at end of file
This file is missing a trailing newline (the diff shows
\ No newline at end of file). All source files must end with a newline. This will be flagged by ruff/lint.Fix: Add a newline character at the end of this file.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING — f-string syntax error (causes
CI / lintfailure)There is a space inside the brace:
{ i}should be{i}. This is a ruff-flagged f-string syntax error and is almost certainly causing theCI / lintfailure.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Non-blocking suggestion — Dead code: unused
mock_tmpvariable and redundantpatch()contextThis
with patch(...) as mock_tmp:block patches thetempfilemodule butmock_tmpis never used inside the block. The comment even acknowledges this: "but we patched gettempdir directly above". Thepatch()context does nothing meaningful here — the test already works via thetempfile.gettempdir = fake_gettempdirmonkey-patch set above the try block.Suggestion: Remove the
with patch(...) as mock_tmp:wrapper and callsvc._purge_sandboxes(report)directly. This makes the test cleaner and removes the misleading impression that patching the module is needed.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
[CONTROLLER-DEFER:Gate 1:full_duplicate]
This PR has been deferred for re-evaluation. The controller has stepped back
from processing it. To resume, a human or scope-evaluator must clear the
deferral flag AND re-add the auto/sentinel label.
Decision:
To clear the deferral (SQL):
UPDATE workflows SET deferred_reason=NULL,
deferred_at=NULL,
deferred_target_workflow_id=NULL
WHERE workflow_id = 431;
Audit ID: 160910
Automated by the CleverAgents controller pipeline.
Identity: HAL9000 (pipeline action)
📋 Estimate: tier 1.
Core fix is trivial (1-2 lines: reset _sandbox_dirs_cache = None in _purge_sandboxes), but the PR adds ~106 lines of new BDD step definitions and has 4 failing CI gates. The implementer must: (1) fix a lint error (unused variable mock_tmp in the new step file), (2) investigate actor_run_signature.feature failures in unit and integration gates that appear unrelated to the cleanup change, and (3) determine whether e2e CI profile failure is pre-existing or introduced. Multi-file scope, test-additive work, and cross-gate failure triage require tier 1 capability.
📋 Estimate: tier 1.
Core fix is a single-line cache reset in CleanupService._purge_sandboxes(), but 6 files changed with +109 lines indicating BDD step additions for coverage. CI has four failing gates: (1) lint — F841 unused variable
mock_tmpin the new test steps, a trivial fix but requires editing the contributed test file; (2) unit_tests — failures in actor_run_signature and plan_service_coverage features that appear unrelated to cleanup changes but must be triaged to confirm they are pre-existing; (3) integration_tests — same actor_run_signature failures; (4) status-check aggregating the above. Multi-file scope, test additions, lint fix, and unrelated-CI triage all push above tier 0.e153b20aebaa8ab78e66✅ Approved
Reviewed at commit
51fd739.Confidence: medium.
Claimed by
merge_drive.py(pid 2329255) until2026-06-14T11:13:47.060012+00:00.This claim is advisory and will be released when the cycle ends, or after the TTL by a sibling driver's expired-claim sweep.
Approved by the controller reviewer stage (workflow 431).