BUG-HUNT: [security] PlanGenerationGraph._generate_plan reads arbitrary files via user-controlled @path hint — path traversal vulnerability allows reading files outside project directory #6558

Open
opened 2026-04-09 21:19:37 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: [security] — Path traversal via user-controlled @path in _generate_plan

Severity Assessment

  • Impact: An attacker (or any user) can embed @/etc/passwd, @../../.env, or @../secrets/key.pem in their plan prompt. The _extract_path_from_prompt regex extracts this as the target path, and _generate_plan reads the file from the server filesystem using existing_path.read_text(). This leaks arbitrary file contents into original_content of the generated Change object, which is returned to the caller and potentially stored or logged.
  • Likelihood: Medium — requires a malicious or curious user to know the @path syntax
  • Priority: Critical

Location

  • File: src/cleveragents/agents/graphs/plan_generation.py
  • Class: PlanGenerationGraph
  • Methods: _generate_plan, _extract_path_from_prompt
  • Lines: 315–340 (_generate_plan), 344–348 (_extract_path_from_prompt)

Description

_extract_path_from_prompt uses a regex that accepts / and . in the path:

# plan_generation.py lines 344-348
PATH_HINT_RE = re.compile(r"@(?P<path>[\w./-]+)")

def _extract_path_from_prompt(self, prompt: str) -> str | None:
    match = PATH_HINT_RE.search(prompt or "")
    return match.group("path") if match else None

The regex [\w./-]+ matches paths like ../../etc/passwd or /etc/shadow or ../../.env.
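To make the matching behaviour concrete, here is a minimal standalone sketch. It reuses the pattern quoted above; `extract_path` is a hypothetical free-function stand-in for `_extract_path_from_prompt`, and the sample prompts are invented for illustration.

```python
import re

# Same pattern as quoted above from plan_generation.py.
PATH_HINT_RE = re.compile(r"@(?P<path>[\w./-]+)")


def extract_path(prompt):
    """Hypothetical stand-in for _extract_path_from_prompt (no class needed)."""
    match = PATH_HINT_RE.search(prompt or "")
    return match.group("path") if match else None


# Illustrative prompts: two traversal/absolute hints, one benign hint, no hint.
samples = [
    "Fix the auth module @../../.env",
    "Summarise @/etc/shadow please",
    "Refactor @src/app/main.py",
    "No hint here",
]

extracted = [extract_path(s) for s in samples]
for prompt, path in zip(samples, extracted):
    print(f"{prompt!r} -> {path!r}")
```

The pattern itself makes no distinction between a benign project-relative hint and a traversal or absolute path, so any filtering has to happen after extraction.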

In _generate_plan, this extracted path is used directly to read file contents from the filesystem:

# plan_generation.py lines 326-337
elif explicit_path:
    file_path = explicit_path
    ...
    existing_path = Path(file_path)
    if existing_path.exists():
        if existing_path.is_dir():
            ...
        else:
            operation_type = OperationType.MODIFY
            try:
                original_content = existing_path.read_text()  # READS ARBITRARY PATH!
            except Exception:
                original_content = None

Example attack: a user submits the plan prompt "Fix the auth module @../../.env". The regex extracts ../../.env. If that path exists relative to the server process's working directory, Path("../../.env").exists() returns True and read_text() returns the environment file's contents (which may include database URLs, API keys, and other secrets) as original_content of the Change object.
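The attack can be reproduced in isolation with a throwaway directory tree. This is a rough sketch, not the project's code: `read_like_generate_plan` is a hypothetical function that mimics only the unguarded read branch quoted above, and the directory layout and the fake `.env` contents are assumptions made for the demo.

```python
import os
import tempfile
from pathlib import Path


def read_like_generate_plan(file_path):
    """Hypothetical mimic of the unguarded branch: read whatever the hint names."""
    existing_path = Path(file_path)
    if existing_path.exists() and not existing_path.is_dir():
        try:
            return existing_path.read_text()
        except Exception:
            return None
    return None


def demo():
    """Build <tmp>/.env and <tmp>/workspace/project, then read from inside project."""
    previous_cwd = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        project = root / "workspace" / "project"
        project.mkdir(parents=True)
        # Fake secret placed two levels above the project directory.
        (root / ".env").write_text("API_KEY=not-a-real-key\n")
        os.chdir(project)
        try:
            # The hint is relative, so it is resolved against the working directory.
            return read_like_generate_plan("../../.env")
        finally:
            # Restore the working directory before the temp tree is deleted.
            os.chdir(previous_cwd)


leaked = demo()
print(repr(leaked))
```

Because the hint is resolved against the process's working directory, whether a given relative hint succeeds depends on where the server was started; absolute hints such as /etc/passwd do not have that dependency.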

Note: issue #3175 covers a similar issue with _actor_name, and #6448 covers _git_ls_files, but this specific vector in _generate_plan via @path prompt hints has not been filed.

Expected Behavior

Before using an @path hint to read file content, the code must validate that the resolved path is within an allowed base directory (e.g., the project's working directory).

Suggested Fix

from pathlib import Path

def _is_safe_path(self, base: Path, target: Path) -> bool:
    """Return True if target resolves within base."""
    try:
        target.resolve().relative_to(base.resolve())
        return True
    except ValueError:
        return False

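# In _generate_plan, before existing_path.exists(), so that out-of-tree paths
# are never probed, and outside the try/except that wraps read_text(), so the
# error is not swallowed by `except Exception`:
base_dir = Path.cwd()  # or the project root, if it differs from the CWD
if not self._is_safe_path(base_dir, existing_path):
    raise ValueError(f"Path '{file_path}' is outside the project directory")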
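As a quick sanity check of the containment test, the sketch below runs the same `resolve()` plus `relative_to()` logic against a temporary base directory. `is_safe_path` is a free-function copy of the helper suggested above; `check_hints`, the directory layout, and the sample hints are hypothetical. One deliberate difference, which is an assumption rather than part of the suggested fix: each hint is joined to the base explicitly instead of being resolved against the current working directory.

```python
import tempfile
from pathlib import Path


def is_safe_path(base, target):
    """Return True if target resolves to a location inside base."""
    try:
        target.resolve().relative_to(base.resolve())
        return True
    except ValueError:
        return False


def check_hints(base, hints):
    """Map each hint to whether it stays inside base once joined to it."""
    # Joining an absolute hint to base discards base, so it is rejected too.
    return {hint: is_safe_path(base, base / hint) for hint in hints}


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "project"
    (base / "src").mkdir(parents=True)
    results = check_hints(
        base,
        [
            "src/main.py",                 # inside the project
            "../../.env",                  # climbs out of the project
            "/etc/passwd",                 # absolute path
            "src/../../secrets/key.pem",   # enters src, then climbs out
        ],
    )

for hint, ok in results.items():
    print(f"{hint!r}: {'allowed' if ok else 'rejected'}")
```

Since `resolve()` also follows symlinks, a link inside the project that points outside it should be rejected as well, though that case is not exercised here.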

Category

security

TDD Note

After this bug is verified, a Type/Testing issue will be created with a TDD test tagged @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-09 21:27:51 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Unverified
  • Priority: Critical — PATH TRAVERSAL VULNERABILITY: PlanGenerationGraph._generate_plan reads arbitrary files via user-controlled @path hint. This allows reading files outside the project directory, potentially exposing system files, credentials, or other sensitive data.
  • Milestone: v3.2.0 — Security vulnerabilities must be fixed in the earliest milestone
  • MoSCoW: Must Have — This is a security vulnerability that cannot ship

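Security Impact: An actor could craft an @path hint such as @/etc/passwd or an absolute path to a private key (for example @/root/.ssh/id_rsa; the hint regex does not match ~, so @~/.ssh/id_rsa would not be extracted) to exfiltrate sensitive files through the LLM context. This must be fixed with proper path validation/sandboxing before v3.2.0 ships.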


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner

Reference
cleveragents/cleveragents-core#6558