BUG-HUNT: [Security] SQLite URL parsing vulnerability allows path traversal in database checking #7118

Open
opened 2026-04-10 07:55:34 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Branch: fix/security-sqlite-url-path-traversal
  • Commit Message: fix(cli): validate SQLite database URL paths to prevent directory traversal
  • Milestone: (none — backlog)
  • Parent Epic: #5203

Background and Context

Discovered during Bug Hunt Cycle 2 Batch 3 by Worker 22 focusing on platform detection utilities.

The _check_database() function in src/cleveragents/cli/commands/system.py (lines ~204–232) uses an overly simplistic string replacement to extract the filesystem path from a SQLite database URL:

db_path_str = db_url.replace("sqlite:///", "")
db_path = Path(db_path_str)

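This approach does not validate that the resolved path stays within expected boundaries. An attacker who can influence the database_url setting (e.g., via a crafted config file, environment variable, or CLI flag) could supply a URL such as sqlite:///../../etc/passwd. The extracted path ../../etc/passwd is resolved against the process's working directory and reaches /etc/passwd when that directory is at most two levels below the filesystem root; an absolute form such as sqlite:////etc/passwd reaches it from any directory. In either case the system attempts to access files entirely outside the intended database directory.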

File: src/cleveragents/cli/commands/system.py
Function: _check_database()
Lines: ~204–232

Current Behavior (Actual)

# db_url = "sqlite:///../../etc/passwd"
db_path_str = db_url.replace("sqlite:///", "")  # → "../../etc/passwd"
db_path = Path(db_path_str)                      # → Path("../../etc/passwd"), relative to the CWD
if db_path.exists():                             # → probes outside the data directory (path traversal)
    writable = os.access(db_path, os.W_OK)

The code accepts any path after sqlite:/// without validation. URLs like sqlite:///../../etc/passwd are resolved against the working directory and can reach /etc/passwd instead of being rejected.

Expected Behavior

The function should:

  1. Use proper URL parsing (urllib.parse.urlparse) to extract the path component.
  2. Resolve the path to an absolute path and validate it stays within the expected data directory.
  3. Reject any path that escapes the intended directory boundary with a clear error.
  4. Handle edge cases: Windows paths, URL-encoded characters (%2F, %2E), and relative paths.
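
To make points 1 and 4 concrete, the following minimal sketch shows what urllib.parse.urlparse returns for a few SQLite URL shapes. The helper name sqlite_url_path is hypothetical, and the sketch assumes the SQLAlchemy-style convention in which three slashes introduce a relative path and four an absolute one; the issue does not state which convention the project follows.

```python
from urllib.parse import unquote, urlparse


def sqlite_url_path(db_url: str) -> str:
    """Return the decoded filesystem path portion of a SQLite URL.

    Hypothetical helper for illustration; not part of cleveragents.
    """
    parsed = urlparse(db_url)
    # parsed.path keeps the slash that follows the empty authority, so remove
    # exactly one leading slash: under the assumed convention three slashes
    # mean a relative path and four mean an absolute one.
    return unquote(parsed.path).removeprefix("/")


for url in (
    "sqlite:///data/app.db",               # relative path
    "sqlite:////var/lib/app/app.db",       # absolute path
    "sqlite:///%2E%2E/%2E%2E/etc/passwd",  # URL-encoded traversal
    "sqlite:///C:/data/app.db",            # Windows drive path
):
    print(f"{url!r:45} -> {sqlite_url_path(url)!r}")
```

The encoded payload decodes to ../../etc/passwd, which is why decoding has to happen before the boundary check rather than after it. Stripping every leading slash (for example with lstrip("/")) would turn the absolute form into a relative one.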

Impact

  • Likelihood: Medium — exploitable if database_url can be influenced by user input (config file, env var, or CLI flag).
  • Impact: An attacker could probe arbitrary filesystem paths (existence, writability) outside the intended database directory. In the worst case, if the application ever creates files at the checked path, this becomes an arbitrary file-write primitive.

Test Evidence

# Proof of concept
from pathlib import Path

db_url = "sqlite:///../../etc/passwd"
db_path_str = db_url.replace("sqlite:///", "")
print(Path(db_path_str).resolve())  # → /etc/passwd when run from a directory at most two levels below /
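
Because the result of resolving a relative payload depends on the working directory, a deterministic variant may be easier to reason about. The sketch below is illustrative only: resolve_from is a hypothetical helper and the directories are made-up examples. It uses posixpath string normalisation, so it ignores symlinks and does not touch the filesystem.

```python
import posixpath


def resolve_from(cwd: str, db_url: str) -> str:
    """Mimic how the current code's path would resolve from a given directory.

    Hypothetical helper; pure string normalisation, symlinks are ignored.
    """
    # Same extraction as the vulnerable code.
    db_path_str = db_url.replace("sqlite:///", "")
    # Joining with an absolute path discards cwd, as Path resolution does.
    return posixpath.normpath(posixpath.join(cwd, db_path_str))


# Relative payload from a shallow directory.
print(resolve_from("/srv/app", "sqlite:///../../etc/passwd"))
# Relative payload from a deeper directory.
print(resolve_from("/home/user/projects/app", "sqlite:///../../etc/passwd"))
# Absolute payload (four slashes): the working directory does not matter.
print(resolve_from("/home/user/projects/app", "sqlite:////etc/passwd"))
```

The first and third calls yield /etc/passwd, while the second yields /home/user/etc/passwd. The relative form therefore needs enough ../ segments for the current depth, whereas the four-slash form escapes regardless.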

Suggested Fix

from pathlib import Path
from typing import Any
from urllib.parse import unquote, urlparse

def _check_database() -> dict[str, Any]:
    from cleveragents.config.settings import get_settings
    settings = get_settings()
    db_url = settings.database_url

    if db_url.startswith("sqlite"):
        parsed = urlparse(db_url)
        # Decode URL-encoded characters, then drop only the single slash that
        # follows the empty authority, so sqlite:////abs/path stays absolute
        # and sqlite:///rel/path stays relative.
        raw_path = unquote(parsed.path).removeprefix("/")
        db_path = Path(raw_path).resolve()

        # Validate that the path stays within the expected data directory
        expected_root = Path(settings.data_dir).resolve()
        try:
            db_path.relative_to(expected_root)
        except ValueError:
            return {
                "name": "Database",
                "status": CheckStatus.ERROR,
                "details": "database path escapes data directory (path traversal rejected)",
            }
        # ... rest of existing checks
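
For testing, the validation could be factored out of _check_database() into a pure function. The following is a minimal sketch under stated assumptions: the name resolve_sqlite_path, its signature, and the choice to raise ValueError are all hypothetical and not existing project API. The treatment of in-memory URLs is likewise an assumption, since the issue does not cover them.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def resolve_sqlite_path(db_url: str, data_dir: Path) -> Path:
    """Return the resolved database path, or raise ValueError if it is unsafe.

    Hypothetical helper; name and signature are assumptions.
    """
    parsed = urlparse(db_url)
    # Accept "sqlite" and driver-qualified schemes of the form "sqlite+<driver>".
    if parsed.scheme.split("+", 1)[0] != "sqlite":
        raise ValueError(f"not a SQLite URL: {parsed.scheme!r}")

    # Decode first, then strip only the slash that follows the empty authority.
    raw_path = unquote(parsed.path).removeprefix("/")
    if raw_path in ("", ":memory:"):
        # Assumption: in-memory databases have no file to check.
        raise ValueError("URL does not name a database file")

    root = data_dir.resolve()
    # Relative paths resolve against the current working directory.
    candidate = Path(raw_path).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise ValueError("database path escapes data directory") from None
    return candidate
```

A caller such as _check_database() could catch the ValueError and map it to a CheckStatus.ERROR result. Since Path.resolve() follows symlinks that exist, a symlink inside the data directory that points outside it should also fail the relative_to check.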

Acceptance Criteria

  • _check_database() uses urllib.parse.urlparse (not string replacement) to extract the SQLite path.
  • Resolved path is validated to remain within the configured data directory.
  • URL-encoded traversal sequences (e.g., %2E%2E%2F) are decoded before validation.
  • A crafted URL like sqlite:///../../etc/passwd returns a CheckStatus.ERROR result rather than probing the filesystem.
  • Existing valid SQLite paths continue to work correctly.
  • BDD scenarios cover: valid path, relative traversal, encoded traversal, absolute path outside root.

Subtasks

  • Replace db_url.replace("sqlite:///", "") with urllib.parse.urlparse in _check_database()
  • Add path boundary validation using Path.relative_to() against the configured data directory
  • Decode URL-encoded characters before path resolution
  • Handle Windows-style paths and edge cases
  • Tests (Behave): Add BDD scenarios for path traversal rejection and valid path acceptance
  • Tests (Robot): Add integration test for _check_database() with malicious URLs
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly (fix(cli): validate SQLite database URL paths to prevent directory traversal), followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly (fix/security-sqlite-url-path-traversal).
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage ≥ 97%.

Backlog note: This issue was discovered during autonomous operation
on milestone Bug Hunt Cycle 2. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.


Automated by CleverAgents Bot
Supervisor: Bug Hunt Cycle 2 | Agent: new-issue-creator

Author
Owner

Verified — Security bug: SQLite URL parsing allows path traversal. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7118