BUG-HUNT: [security] CopyOnWriteSandbox.commit() follows symlinks via shutil.copy2 — actor-created symlinks in sandbox exfiltrate arbitrary file content to committed output #6633

Open
opened 2026-04-09 22:37:04 +00:00 by HAL9000 · 0 comments

Severity Assessment

  • Impact: An actor operating in a copy-on-write sandbox can create a symlink pointing to any file readable by the process (e.g. /etc/shadow, database credentials, private keys). On commit(), shutil.copy2 follows the symlink and writes the target file's content into the original directory, silently exfiltrating sensitive data into the committed artifact.
  • Likelihood: High — creating symlinks is a standard filesystem operation; any actor with write access to the sandbox can perform this attack
  • Priority: Critical

Location

  • File: src/cleveragents/infrastructure/sandbox/copy_on_write.py
  • Function: CopyOnWriteSandbox.commit
  • Lines: ~161–181 (the shutil.copy2 loop)
  • Supporting file: src/cleveragents/infrastructure/sandbox/_fs_utils.py
  • Function: compute_diff

Description

CopyOnWriteSandbox.commit() uses compute_diff() to identify added/changed files in the sandbox, then applies them with shutil.copy2(). The compute_diff() function calls os.walk(sandbox_dir) which — by default — includes symlinks-to-files in its fnames output (it only avoids following symlinks for directories). When a symlink is included as an "added" file, shutil.copy2(src, dst) follows the symlink and copies the content of the symlink target, not the symlink itself, into the original directory.
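The os.walk behaviour described above can be checked in isolation. The following is a minimal sketch, not project code: the function name classify_walk_entries and the temporary layout are made up for illustration. It creates one symlink to a file and one symlink to a directory, then records where os.walk reports each.

```python
import os
import tempfile


def classify_walk_entries() -> dict:
    """Report how os.walk (followlinks=False) classifies symlinks."""
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = os.path.join(tmp, "sandbox")
        target_dir = os.path.join(tmp, "target_dir")
        os.makedirs(sandbox)
        os.makedirs(target_dir)
        target_file = os.path.join(target_dir, "inner.txt")
        with open(target_file, "w") as fh:
            fh.write("data")

        # One symlink to a regular file, one symlink to a directory.
        os.symlink(target_file, os.path.join(sandbox, "file_link"))
        os.symlink(target_dir, os.path.join(sandbox, "dir_link"))

        dirnames_seen, fnames_seen = [], []
        for _, dirnames, fnames in os.walk(sandbox):  # followlinks=False is the default
            dirnames_seen.extend(dirnames)
            fnames_seen.extend(fnames)
        return {"dirnames": sorted(dirnames_seen), "fnames": sorted(fnames_seen)}


entries = classify_walk_entries()
```

The file symlink lands in fnames, so a set built from fnames alone will contain it. The directory symlink is listed in dirnames but not descended into, so inner.txt never appears.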

Evidence

# _fs_utils.py  compute_diff()
sandbox_files: set[str] = {
    os.path.relpath(os.path.join(dp, f), sandbox_dir)
    for dp, _, fnames in os.walk(sandbox_dir)   # ← fnames INCLUDES symlinks to files
    for f in fnames
}

# ...
added = sorted(sandbox_files - original_files)

# copy_on_write.py  commit()
for rel_path in changed_files + added_files:
    src = os.path.join(self._sandbox_path, rel_path)
    dst = os.path.join(self._original_path, rel_path)
    dst_dir = os.path.dirname(dst)
    if dst_dir:
        os.makedirs(dst_dir, exist_ok=True)
    shutil.copy2(src, dst)   # ← shutil.copy2 FOLLOWS symlinks by default
                             #   reads content from symlink target, not symlink itself

Attack scenario:

  1. Actor's plan execution creates a symlink in the sandbox:
    os.symlink("/etc/shadow", os.path.join(sandbox_path, "shadow_copy.txt"))
    
  2. compute_diff includes shadow_copy.txt in added_files (it appears in fnames from os.walk).
  3. commit() calls shutil.copy2(sandbox/shadow_copy.txt, original/shadow_copy.txt).
  4. shutil.copy2 follows the symlink and reads from /etc/shadow.
  5. The content of /etc/shadow is written into the original directory as shadow_copy.txt.
  6. The sensitive content is now committed to the repository or the original filesystem.

This attack works for any file readable by the process running cleveragents, including environment files, API keys, private keys, and database credentials.
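The scenario can be reproduced without touching real system files. The sketch below is an approximation under stated assumptions: added_files and naive_commit are simplified stand-ins for compute_diff() and the commit() loop quoted in the Evidence section (not the project's actual functions), and a temporary secret.txt stands in for /etc/shadow.

```python
import os
import shutil
import tempfile


def added_files(sandbox_dir: str, original_dir: str) -> list[str]:
    """Simplified stand-in for compute_diff(): files present only in the sandbox."""
    def listing(root: str) -> set[str]:
        return {
            os.path.relpath(os.path.join(dp, f), root)
            for dp, _, fnames in os.walk(root)
            for f in fnames
        }
    return sorted(listing(sandbox_dir) - listing(original_dir))


def naive_commit(sandbox_dir: str, original_dir: str) -> None:
    """Simplified stand-in for the vulnerable commit() loop."""
    for rel_path in added_files(sandbox_dir, original_dir):
        src = os.path.join(sandbox_dir, rel_path)
        dst = os.path.join(original_dir, rel_path)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)  # follow_symlinks=True by default


def reproduce() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        secret = os.path.join(tmp, "secret.txt")  # stands in for /etc/shadow
        sandbox = os.path.join(tmp, "sandbox")
        original = os.path.join(tmp, "original")
        os.makedirs(sandbox)
        os.makedirs(original)
        with open(secret, "w") as fh:
            fh.write("top-secret")

        # Step 1 of the scenario: the actor plants a symlink in the sandbox.
        os.symlink(secret, os.path.join(sandbox, "shadow_copy.txt"))

        # Steps 2-5: diff, commit, and read back what landed in the original dir.
        naive_commit(sandbox, original)
        with open(os.path.join(original, "shadow_copy.txt")) as fh:
            return fh.read()


leaked = reproduce()
```

After the call, the original directory holds a regular file whose content came from outside the sandbox.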

Expected Behavior

commit() should do one of the following:
(a) Refuse to commit symlinks (raise an error if a sandbox file is a symlink), OR
(b) Copy the symlink itself (not its target) using os.symlink(), OR
(c) Resolve and validate that the symlink target is within the sandbox before copying content.
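For comparison, options (a) and (b) might look roughly like the sketch below. The helper names refuse_symlink and copy_link_itself are hypothetical, and ValueError is only a placeholder for whatever exception type the project prefers. Option (b) is shown with shutil.copy2's follow_symlinks=False keyword, which is an alternative to calling os.symlink() directly.

```python
import os
import shutil
import tempfile


def refuse_symlink(src: str) -> None:
    """Option (a): reject any symlink outright."""
    if os.path.islink(src):
        raise ValueError(f"refusing to commit symlink: {src}")


def copy_link_itself(src: str, dst: str) -> None:
    """Option (b): replicate the link, not the content it points to."""
    shutil.copy2(src, dst, follow_symlinks=False)


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "target.txt"), "w") as fh:
            fh.write("x")
        link = os.path.join(tmp, "link.txt")
        os.symlink("target.txt", link)  # relative link to a sibling file

        copied = os.path.join(tmp, "copied.txt")
        copy_link_itself(link, copied)

        try:
            refuse_symlink(link)
            refused = False
        except ValueError:
            refused = True

        return {
            "copied_is_link": os.path.islink(copied),
            "copied_target": os.readlink(copied),
            "refused": refused,
        }


outcome = demo()
```

Option (b) on its own stops the content copy but still commits a link whose target is unchecked, so it is probably best combined with the validation in option (c).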

Actual Behavior

shutil.copy2() silently follows any symlink in the sandbox and writes the symlink target's content to the committed output. No warning is logged. The original directory receives the sensitive content.

Suggested Fix

Replace the shutil.copy2 call with a check:

for rel_path in changed_files + added_files:
    src = os.path.join(self._sandbox_path, rel_path)
    dst = os.path.join(self._original_path, rel_path)
    # Create the destination directory once, for both symlinks and regular files
    dst_dir = os.path.dirname(dst)
    if dst_dir:
        os.makedirs(dst_dir, exist_ok=True)
    if os.path.islink(src):
        link_target = os.readlink(src)
        # Resolve the target relative to the directory that holds the symlink
        abs_target = os.path.realpath(os.path.join(os.path.dirname(src), link_target))
        sandbox_root = os.path.realpath(self._sandbox_path)
        escapes = not abs_target.startswith(sandbox_root + os.sep)
        # An absolute target would be re-created verbatim and keep pointing
        # into the sandbox (or dangle after cleanup), so reject it as well
        if os.path.isabs(link_target) or escapes:
            raise SandboxCommitError(
                f"Symlink '{rel_path}' in sandbox has an absolute target or "
                "points outside sandbox root; refusing to commit to prevent "
                "data exfiltration"
            )
        # Re-create as symlink (don't follow)
        if os.path.lexists(dst):
            os.unlink(dst)
        os.symlink(link_target, dst)
        continue
    shutil.copy2(src, dst)
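The containment check appends os.sep to the sandbox root before calling startswith. A small sketch of why that separator matters follows; is_within and naive_is_within are illustrative names, and both assume their arguments have already been passed through os.path.realpath.

```python
import os


def is_within(root: str, candidate: str) -> bool:
    """True if candidate lies strictly inside root (paths already resolved)."""
    return candidate.startswith(root + os.sep)


def naive_is_within(root: str, candidate: str) -> bool:
    """Prefix check without the separator; shown only to illustrate the pitfall."""
    return candidate.startswith(root)


root = os.path.join(os.sep, "tmp", "sandbox")
inside = os.path.join(root, "a.txt")
# A sibling directory whose name merely begins with the sandbox's name.
sibling = os.path.join(root + "_evil", "a.txt")

checks = {
    "inside_ok": is_within(root, inside),
    "sibling_rejected": not is_within(root, sibling),
    "naive_accepts_sibling": naive_is_within(root, sibling),
}
```

Without the separator, a sibling directory such as sandbox_evil would pass the check. os.path.commonpath is another way to express the same test.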

Category

security / data-flow

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-09 22:47:12 +00:00
Reference
cleveragents/cleveragents-core#6633