agents/graphs/plan_generation: Path traversal vulnerability in _generate_plan via @path prompt hint allows reading arbitrary files #10359

Open
opened 2026-04-18 09:10:19 +00:00 by HAL9000 · 0 comments
Owner

Metadata

  • Commit: agents/graphs/plan_generation: fix path traversal vulnerability in _generate_plan via @path prompt hint
  • Branch: fix/plan-generation-path-traversal-vulnerability

Background and Context

PlanGenerationGraph._generate_plan() reads arbitrary files from the filesystem based on user-provided @path hints in prompts without validating that the path stays within the project directory. This is a path traversal vulnerability that allows reading sensitive files (e.g., /etc/passwd, SSH keys, .env files) and including their contents in LLM prompts, constituting a critical confidentiality breach and data exfiltration risk.

Expected Behavior

Any @path hint in a user prompt that resolves to a location outside the project/workspace directory must be rejected before any file I/O occurs. The method should return an error state (or raise an appropriate exception) rather than reading the file.

Acceptance Criteria

  • PATH_HINT_RE or _extract_path_from_prompt() is updated to reject paths containing ../ or other traversal sequences
  • _generate_plan() validates the resolved path is within the project root using Path.resolve().relative_to(project_root) before any read_text() call
  • A prompt containing @../../etc/passwd returns an error state or raises ValueError/PermissionError instead of reading the file
  • The TDD test test_plan_generation_rejects_path_traversal_in_prompt (issue #10358) transitions from expected-fail to passing
  • No regression in existing @path hint functionality for valid in-project paths
  • nox passes with coverage ≥ 97%
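
The first criterion could be met with a purely lexical check on the extracted hint, run before any filesystem access. The following is a minimal sketch under stated assumptions: `is_suspicious_hint` is a hypothetical helper name that does not exist in the codebase, and hints are treated as POSIX-style paths. It would complement the resolved-path check in the second criterion rather than replace it, because a lexical check cannot see symlinks.

```python
from pathlib import PurePosixPath


def is_suspicious_hint(hint: str) -> bool:
    """Return True if the hint is absolute or contains a '..' segment.

    Hypothetical helper for illustration; purely lexical, no filesystem access.
    """
    path = PurePosixPath(hint)
    return path.is_absolute() or ".." in path.parts


print(is_suspicious_hint("../../etc/passwd"))   # True
print(is_suspicious_hint("/etc/passwd"))        # True
print(is_suspicious_hint("src/app/config.py"))  # False
```

Checking `path.parts` instead of searching for the substring `../` avoids false positives on file names that merely contain two dots, such as `notes..md`.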

Subtasks

  • Audit PATH_HINT_RE and _extract_path_from_prompt() in src/cleveragents/agents/graphs/plan_generation.py
  • Add Path.resolve().relative_to(project_root) guard before any file read in _generate_plan()
  • Handle the ValueError from relative_to() and return a safe error state
  • Update/add unit tests (see TDD issue #10358)
  • Verify fix against the attack vectors described in this issue
  • Run nox and confirm coverage ≥ 97%

Definition of Done

This issue is closed when:

  1. The path traversal guard is implemented and merged
  2. The TDD test in issue #10358 passes
  3. CI is green with coverage ≥ 97%
  4. No sensitive file contents can be read via @path hints pointing outside the project directory

Bug Report

Summary

PlanGenerationGraph._generate_plan() reads arbitrary files from the filesystem based on user-provided @path hints in prompts without validating that the path stays within the project directory. This is a path traversal vulnerability that allows reading sensitive files (e.g., /etc/passwd, SSH keys, .env files) and including their contents in LLM prompts.

Affected File

src/cleveragents/agents/graphs/plan_generation.py

Code Evidence

The path extraction regex allows ../ sequences:

PATH_HINT_RE = re.compile(r"@(?P<path>[\w./-]+)")  # allows ../ traversal
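
To see what the pattern captures, it can be run on its own against a few prompts. The sketch below copies the regex from the line above; `extract_hint` is a hypothetical stand-in for `_extract_path_from_prompt()`, whose body is not shown in this issue, and assumes it returns the named group of the first match.

```python
import re
from typing import Optional

PATH_HINT_RE = re.compile(r"@(?P<path>[\w./-]+)")


def extract_hint(prompt: str) -> Optional[str]:
    """Assumed extraction logic: first regex match, named group 'path'."""
    match = PATH_HINT_RE.search(prompt)
    return match.group("path") if match else None


print(extract_hint("Improve the config @../../etc/passwd"))  # ../../etc/passwd
print(extract_hint("Summarise @/etc/passwd please"))         # /etc/passwd
print(extract_hint("Refactor @src/app/config.py"))           # src/app/config.py
```

Both relative traversal sequences and absolute paths pass through unchanged, because `.` and `/` are in the character class.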

The extracted path is used to read files without validation:

explicit_path = self._extract_path_from_prompt(state.get("prompt", ""))
# ...
elif explicit_path:
    file_path = explicit_path
    # ...
    existing_path = Path(file_path)
    if existing_path.exists():
        if existing_path.is_dir():
            file_path = str(existing_path / "generated.py")
            # ...
        else:
            operation_type = OperationType.MODIFY
            try:
                original_content = existing_path.read_text()  # READS ARBITRARY FILES
            except Exception:
                original_content = None

Attack Vector

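A user prompt containing @../../etc/passwd or an absolute path such as @/etc/passwd (the form @~/.ssh/id_rsa would not match as written, since ~ is outside the character class [\w./-] and pathlib does not expand it) would: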

  1. Match the PATH_HINT_RE regex
  2. Cause existing_path.read_text() to read the sensitive file
  3. Include the file contents as original_content in the Change object
  4. Pass the sensitive file contents to the LLM in subsequent validation prompts

Reproduction

# Prompt with path traversal
plan.prompt = "Improve the config @../../etc/passwd"

# _extract_path_from_prompt returns "../../etc/passwd"
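# Path("../../etc/passwd").exists() returns True on Linux when the working directory
# is at most two levels below / (deeper directories need more ../ segments)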
# existing_path.read_text() reads /etc/passwd
# original_content = "<contents of /etc/passwd>"
# This gets passed to the LLM in _validate()
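
The same behaviour can be reproduced without touching real system files. The sketch below is an assumption-laden stand-in, not the project's code: `naive_read` is a hypothetical function that mimics the unguarded branch, a temporary directory plays the role of the filesystem, and `secret.txt` holds dummy data placed outside the simulated project root.

```python
import tempfile
from pathlib import Path
from typing import Optional


def naive_read(project_root: Path, hint: str) -> Optional[str]:
    """Mimic the unguarded branch: read whatever the hint points at."""
    # Stands in for Path(hint) when the working directory is project_root.
    target = project_root / hint
    if target.exists() and not target.is_dir():
        return target.read_text()
    return None


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    project = base / "project"
    project.mkdir()
    # Dummy secret that lives outside the project directory.
    (base / "secret.txt").write_text("API_KEY=dummy-value\n")

    leaked = naive_read(project, "../secret.txt")
    print(repr(leaked))  # 'API_KEY=dummy-value\n'
```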

Impact

  • Confidentiality: Sensitive file contents (credentials, keys, configs) are read and sent to external LLM APIs
  • Severity: Critical - arbitrary file read with data exfiltration to LLM provider

Fix

Validate that the resolved path is within the project/workspace directory before reading:

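def _generate_plan(self, state):
    # ...
    explicit_path = self._extract_path_from_prompt(state.get("prompt", ""))
    if explicit_path:
        # Validate that the path is within the project directory
        project_root = Path.cwd().resolve()  # or use project.path
        # Join against the root so relative hints do not depend on the cwd;
        # an absolute hint replaces the root and is rejected below.
        resolved = (project_root / explicit_path).resolve()
        try:
            resolved.relative_to(project_root)  # raises ValueError if outside
        except ValueError:
            return {
                "generated_changes": [],
                "error": f"Path traversal rejected: {explicit_path}",
            }
        # Use `resolved`, not the raw hint, for all subsequent file I/O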
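
The containment check can also be exercised in isolation. The following is a minimal, self-contained sketch, assuming a hypothetical helper named `resolve_within_root` (not part of the codebase) and a temporary directory standing in for the project root. It relies only on `Path.resolve()` and `Path.relative_to()` from the standard library.

```python
import tempfile
from pathlib import Path


def resolve_within_root(hint: str, project_root: Path) -> Path:
    """Return the resolved path, or raise ValueError if it escapes project_root.

    Hypothetical helper; resolve() collapses '..' and follows symlinks.
    """
    root = project_root.resolve()
    candidate = (root / hint).resolve()
    candidate.relative_to(root)  # raises ValueError when outside root
    return candidate


rejected = []
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    (project / "src").mkdir(parents=True)
    (project / "src" / "app.py").write_text("print('hi')\n")

    # A valid in-project hint resolves normally.
    ok = resolve_within_root("src/app.py", project)
    print(ok.name)  # app.py

    # Relative traversal, absolute paths, and nested traversal are all rejected.
    for hint in ("../outside.txt", "/etc/passwd", "src/../../outside.txt"):
        try:
            resolve_within_root(hint, project)
        except ValueError:
            rejected.append(hint)

print(rejected)  # ['../outside.txt', '/etc/passwd', 'src/../../outside.txt']
```

Because `resolve()` follows symlinks, a link inside the project that points outside it would also fail the `relative_to()` check, which a regex-only filter would miss.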

Validation Gate

  • Code evidence: PATH_HINT_RE, _extract_path_from_prompt(), existing_path.read_text() in plan_generation.py
  • Environment verification: Any prompt containing an @../../ hint that resolves to a readable file outside the project triggers the vulnerability
  • Actionability: Add Path.resolve().relative_to(project_root) check
  • Codebase freshness: Verified in current HEAD
  • Severity match: Critical - arbitrary file read + data exfiltration

Blocked By

Depends on TDD issue #10358.


Automated by CleverAgents Bot
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor


Reference
cleveragents/cleveragents-core#10359