BUG-HUNT: [data-integrity] copy_on_write.py rollback uses rmtree(ignore_errors=True) + dirs_exist_ok=True — stale files survive rollback #7491

Open
opened 2026-04-10 20:48:13 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: Data Integrity — CopyOnWriteSandbox.rollback Leaves Stale Files After Rollback

Severity Assessment

  • Impact: Pre-rollback modifications survive in the sandbox after rollback() — the sandbox appears "rolled back" but contains residual attacker-controlled content
  • Likelihood: Medium — triggered by any locked/permission-denied file during rollback
  • Priority: High

Location

  • File: src/cleveragents/infrastructure/sandbox/copy_on_write.py
  • Function: CopyOnWriteSandbox.rollback
  • Lines: ~270–285
  • Category: data-integrity

Description

When rolling back from ACTIVE, the sandbox directory is cleared with rmtree(ignore_errors=True) and then repopulated with copytree(dirs_exist_ok=True). The combination is dangerous. ignore_errors=True suppresses every deletion failure, so any file that is locked or permission-denied stays on disk. dirs_exist_ok=True then merges the copy into the existing directory rather than replacing it. Files that were added to the sandbox, do not exist in the original, and could not be deleted are never removed or overwritten; they persist after the purported rollback without any error being raised.

Evidence

shutil.rmtree(self._sandbox_path, ignore_errors=True)  # silently ignores errors — residue possible

shutil.copytree(
    self._original_path,
    self._sandbox_path,
    symlinks=True,
    dirs_exist_ok=True,   # ← overlays on possibly-partial tree — stale files survive
)

Scenario:

  1. Actor adds file malicious_script.py to sandbox
  2. rollback() is called
  3. rmtree silently fails to remove malicious_script.py (e.g., locked by another process)
  4. copytree(dirs_exist_ok=True) overlays original content but KEEPS malicious_script.py
  5. Sandbox appears "clean" but still contains the actor's additions
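The overlay step in the scenario can be reproduced without a real file lock. The sketch below is illustrative only: it assumes the failed rmtree left exactly one file behind and creates that state by hand, because lock and permission failures are platform-dependent. The helper name overlay_copy_keeps_residue and the file app.py are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def overlay_copy_keeps_residue() -> set[str]:
    """Return the sandbox's file names after an overlay copy onto leftover content."""
    with tempfile.TemporaryDirectory() as root:
        original = Path(root) / "original"
        sandbox = Path(root) / "sandbox"
        original.mkdir()
        (original / "app.py").write_text("print('original')\n")

        # Simulated state after a partially failed rmtree: the sandbox
        # directory still exists and holds the one file that was not deleted.
        sandbox.mkdir()
        (sandbox / "malicious_script.py").write_text("print('stale')\n")

        # Same arguments as the call in rollback(): merges into the existing tree.
        shutil.copytree(original, sandbox, symlinks=True, dirs_exist_ok=True)
        return {p.name for p in sandbox.iterdir()}


print(sorted(overlay_copy_keeps_residue()))
# → ['app.py', 'malicious_script.py']
```

The copy succeeds without raising, which is why the leftover file goes unnoticed.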

Expected Behavior

After rollback(), the sandbox should contain exactly the same files as the original — no additions, no modifications.
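This postcondition can be checked mechanically. The following is a minimal sketch, not code from the repository: list_tree and rollback_violations are hypothetical helpers that compare two directory trees by relative path and by file content, without descending into symlinked directories.

```python
import filecmp
import os
from pathlib import Path


def list_tree(root: Path) -> set[str]:
    """Relative POSIX paths of every entry under root (symlinked dirs not descended)."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entries.add((Path(dirpath) / name).relative_to(root).as_posix())
    return entries


def rollback_violations(original: Path, sandbox: Path) -> dict[str, set[str]]:
    """Report entries added to, missing from, or modified in the sandbox."""
    before, after = list_tree(original), list_tree(sandbox)
    common_files = {
        rel for rel in before & after
        if (original / rel).is_file() and (sandbox / rel).is_file()
    }
    # shallow=False forces a byte-by-byte comparison instead of trusting stat data.
    modified = {
        rel for rel in common_files
        if not filecmp.cmp(original / rel, sandbox / rel, shallow=False)
    }
    return {"added": after - before, "missing": before - after, "modified": modified}
```

A correct rollback would leave all three sets empty; with the current behavior, a surviving file would show up under "added".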

Actual Behavior

Files added to the sandbox that couldn't be removed by rmtree(ignore_errors=True) persist after rollback.

Suggested Fix

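Replace the sandbox wholesale instead of overlaying it: copy the original into a fresh temporary directory beside the sandbox, then swap that copy into the sandbox path so that any failure raises instead of leaving a mixed tree: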

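import shutil
import tempfile
from pathlib import Path

def rollback(self) -> None:
    # Any state handling in the existing method is omitted from this sketch.
    # Stage in a sibling temp dir so the renames stay on one filesystem.
    tmp = Path(tempfile.mkdtemp(dir=self._sandbox_path.parent))
    try:
        # Build the clean copy first; the live sandbox is untouched if this fails.
        shutil.copytree(self._original_path, tmp / "content", symlinks=True)
        # Move the old tree aside rather than deleting it in place; a failure
        # raises here instead of leaving a half-deleted sandbox behind.
        self._sandbox_path.rename(tmp / "stale")
        (tmp / "content").rename(self._sandbox_path)
    finally:
        # Best-effort cleanup; anything undeletable stays outside the sandbox.
        shutil.rmtree(tmp, ignore_errors=True)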
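A smaller alternative, offered here only as a sketch and not as the issue author's proposal, is to keep the delete-then-copy order but make it fail closed: drop ignore_errors=True so deletion failures propagate, and drop dirs_exist_ok=True so copytree raises FileExistsError if the sandbox directory still exists. The function name strict_restore is hypothetical.

```python
import shutil
from pathlib import Path


def strict_restore(original: Path, sandbox: Path) -> None:
    """Restore sandbox from original, raising instead of overlaying residue."""
    if sandbox.exists():
        # No ignore_errors: a locked or permission-denied file raises here.
        shutil.rmtree(sandbox)
    # No dirs_exist_ok: copytree refuses to merge into an existing directory.
    shutil.copytree(original, sandbox, symlinks=True)
```

This variant does not preserve the old sandbox if the copy fails partway, so the caller would have to treat any exception as "sandbox is not usable" rather than retrying blindly.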

Category

data-integrity

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use the tags @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor

HAL9000 added this to the v3.3.0 milestone 2026-04-10 21:38:50 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: High — Correctness bug in subplan/correction/merge logic that directly impacts M4 milestone functionality
  • Milestone: v3.3.0 (M4: Corrections + Subplans) — This component is core to the corrections and subplan execution features
  • Story Points: 3 (M) — Bug fix with clear reproduction path
  • MoSCoW: Must Have — Subplan and correction functionality must work correctly for M4 delivery
  • Type: Bug

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7491