BUG-HUNT: [security] Resource exhaustion DoS in LSP language discovery directory traversal #7161

Open
opened 2026-04-10 08:20:33 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Branch: bugfix/m3.6.0-lsp-discovery-resource-exhaustion-dos
  • Commit Message: fix(lsp): add depth/file/timeout limits to detect_directory_languages() to prevent DoS
  • Milestone: v3.6.0
  • Parent Epic: #824

Background and Context

The detect_directory_languages() method in src/cleveragents/lsp/discovery.py walks its input with os.walk() and enforces no depth limit, file count limit, or execution timeout. A malicious actor who controls the directory contents can craft a tree that keeps the method consuming CPU and memory for as long as the tree is large, which constitutes a Denial-of-Service (DoS) vulnerability via resource exhaustion.

Affected component: src/cleveragents/lsp/discovery.py, detect_directory_languages() method

Severity: Critical — DoS via resource exhaustion
Likelihood: High — trivially exploitable with crafted directories

Vulnerable Pattern

# Unbounded traversal — no depth limit, no file count limit, no timeout
def detect_directory_languages(self, directory: str) -> list[str]:
    languages: set[str] = set()
    try:
        for root, _dirs, files in os.walk(directory):  # Unbounded walk
            for fname in files:  # No file count limit
                fpath = os.path.join(root, fname)
                lang = self.detect_file_language(fpath)  # Processes every file
                if lang != "plaintext":
                    languages.add(lang)
    except OSError:
        logger.warning(...)
    return sorted(languages)

Attack Scenarios

  1. Deep nesting attack: Create directories 10,000 levels deep to exhaust stack/memory
  2. Wide directory attack: Create a directory with 1,000,000 files to exhaust processing time
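  3. Symbolic link loops: Create circular symlinks so that a walk which follows links re-enters the same tree. Note that os.walk() defaults to followlinks=False, so the current code is protected only implicitly; it does not pass followlinks=False explicitly, and any later change to follow links would expose it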
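
The symlink scenario can be checked in isolation. The sketch below is not taken from the repository: `build_looping_tree` and `count_walked_dirs` are hypothetical helpers that create a temporary tree containing a symlink back to its own root and count how many directories `os.walk()` visits. It assumes a platform where the process is allowed to create symlinks.

```python
import os
import tempfile


def build_looping_tree(top: str) -> None:
    """Create top/sub/loop, where 'loop' is a symlink pointing back at top."""
    sub = os.path.join(top, "sub")
    os.mkdir(sub)
    os.symlink(top, os.path.join(sub, "loop"), target_is_directory=True)


def count_walked_dirs(top: str, followlinks: bool = False) -> int:
    """Return the number of directories os.walk() yields for `top`."""
    return sum(1 for _ in os.walk(top, followlinks=followlinks))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        build_looping_tree(top)
        # With followlinks=False the symlink is listed but never entered,
        # so only 'top' and 'top/sub' are visited.
        print(count_walked_dirs(top))
```

With the default `followlinks=False` the walk visits two directories and stops. Passing `followlinks=False` explicitly in the fix documents that this behavior is intentional rather than incidental.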

Expected Behavior

Directory traversal should enforce reasonable, configurable limits on depth, file count, and execution time. Traversal should never block indefinitely or exhaust system resources regardless of the input directory structure.

Actual Behavior

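The function walks the entire tree under the input directory and calls detect_file_language() on every file it finds. Runtime and resource use therefore grow without bound with the size of the tree, and a large enough input can make the call hang or exhaust system resources.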
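
A small reproduction makes the scaling visible. This is a sketch under assumptions, not the project's test code: `make_wide_directory` and `count_detect_calls` are hypothetical helpers, and the counter stands in for the real `detect_file_language()` call so the example runs without the `cleveragents` package.

```python
import os
import tempfile


def make_wide_directory(directory: str, n_files: int) -> None:
    """Create n_files empty files directly inside `directory`."""
    for i in range(n_files):
        with open(os.path.join(directory, f"f{i}.txt"), "w"):
            pass


def count_detect_calls(directory: str) -> int:
    """Mirror the unbounded loop and count the files it would process."""
    calls = 0
    for _root, _dirs, files in os.walk(directory):
        for _fname in files:
            calls += 1  # stands in for self.detect_file_language(fpath)
    return calls


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_wide_directory(tmp, 500)
        print(count_detect_calls(tmp))  # prints 500: one call per file
```

The number of detection calls equals the number of files, so the cost of a single invocation is chosen entirely by whoever controls the directory contents.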

Subtasks

  • Add max_depth parameter (default: 50) to detect_directory_languages() with argument validation
  • Add max_files parameter (default: 10,000) to detect_directory_languages() with argument validation
  • Add timeout parameter (default: 30s) to detect_directory_languages() with argument validation
  • Pass followlinks=False to os.walk() to prevent symlink loop DoS
  • Add depth tracking in the traversal loop and break when max_depth is exceeded
  • Add file count tracking and break when max_files is exceeded
  • Add periodic timeout checks during traversal to enforce the time limit
  • Log a warning when traversal is terminated early due to any limit being reached
  • Remove @tdd_expected_fail tag from all @tdd_issue_7161 scenarios after fix is verified
  • Update docstring to document the new parameters and their defaults
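
The subtasks above could fit together roughly as follows. This is a minimal sketch, not the actual patch: it is written as a free function named `detect_languages_bounded` (hypothetical) that takes the file-level detector as a callable, because the real class around `detect_directory_languages()` is not shown in this issue. The defaults come from the subtasks; the log messages and the exact validation rules are assumptions.

```python
import logging
import os
import time
from typing import Callable

logger = logging.getLogger(__name__)


def detect_languages_bounded(
    directory: str,
    detect_file_language: Callable[[str], str],
    max_depth: int = 50,
    max_files: int = 10_000,
    timeout: float = 30.0,
) -> list[str]:
    """Collect languages under `directory`, stopping early at any limit.

    max_depth: deepest directory level (root is 0) that is still descended from.
    max_files: maximum number of files passed to the detector.
    timeout:   wall-clock budget in seconds for the whole traversal.
    """
    # Argument validation (assumed rules): limits must be sensible numbers.
    if max_depth < 0 or max_files < 1 or timeout <= 0:
        raise ValueError("max_depth must be >= 0; max_files and timeout must be > 0")

    languages: set[str] = set()
    limits_hit: set[str] = set()
    deadline = time.monotonic() + timeout
    seen = 0

    try:
        for root, dirs, files in os.walk(directory, followlinks=False):
            if time.monotonic() >= deadline:
                limits_hit.add("timeout")
                break
            # Depth of `root` relative to the starting directory (root is 0).
            rel = os.path.relpath(root, directory)
            depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
            if depth >= max_depth and dirs:
                dirs[:] = []  # prune in place so os.walk() does not descend
                limits_hit.add("max_depth")
            for fname in files:
                if seen >= max_files:
                    limits_hit.add("max_files")
                    break
                if time.monotonic() >= deadline:
                    limits_hit.add("timeout")
                    break
                seen += 1
                lang = detect_file_language(os.path.join(root, fname))
                if lang != "plaintext":
                    languages.add(lang)
            if limits_hit & {"max_files", "timeout"}:
                break  # hard limits end the whole traversal
    except OSError:
        logger.warning("Directory traversal failed for %s", directory)

    if limits_hit:
        logger.warning(
            "Traversal of %s stopped early (limits reached: %s)",
            directory,
            ", ".join(sorted(limits_hit)),
        )
    return sorted(languages)
```

Two design choices here differ slightly from the wording of the subtasks and are worth reviewing. Reaching `max_depth` prunes the subtree instead of breaking out of the loop, so sibling directories at allowed depths are still scanned. The timeout is checked between files, which means a single slow `detect_file_language()` call is not interrupted. Also note that `os.walk()` lists each directory completely before yielding it, so `max_files` bounds the detection work but not the cost of listing one very wide directory.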

Definition of Done

  • detect_directory_languages() enforces configurable max_depth, max_files, and timeout limits
  • os.walk() is called with followlinks=False
  • A warning is logged whenever traversal is terminated early by any limit
  • All @tdd_issue_7161 scenarios pass with @tdd_expected_fail removed
  • No # type: ignore suppressions introduced
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: new-issue-creator

HAL9000 added this to the v3.6.0 milestone 2026-04-10 08:20:41 +00:00
Author
Owner

Verified — Critical security bug: resource exhaustion DoS in LSP language discovery. MoSCoW: Must-have. Priority: Critical.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7161