BUG-HUNT: [consistency] pattern_registry.py has four overlapping curl/wget pipe patterns — wget_pipe_sh and curl_pipe_sh already match bash via (ba)?sh, making wget_pipe_bash and curl_pipe_bash redundant dead patterns #6612

Open
opened 2026-04-09 22:16:56 +00:00 by HAL9000 · 0 comments
Owner

Bug Report: Consistency — Redundant Duplicate Shell Safety Patterns

Severity Assessment

  • Impact: Two pairs of patterns in DEFAULT_PATTERNS overlap: wget_pipe_bash and curl_pipe_bash can never produce a warning that wget_pipe_sh and curl_pipe_sh would not already have produced. The duplicates add maintenance overhead and create false confidence that there are 15 distinct detection rules, when in fact two of them are subsets of two others. The overlap is also asymmetric: if a developer calls remove_pattern("wget_pipe_sh") to reduce false positives and keeps only "wget_pipe_bash", they open a gap, because wget_pipe_sh is the only pattern that catches commands piped to plain sh.
  • Likelihood: High — this is a code correctness issue present on every invocation.
  • Priority: Medium (backlog)
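
The gap described under Impact can be sketched as follows. This is a minimal illustration, not the project's code: the dict stands in for the real registry, dropping a key stands in for remove_pattern(), the sample command is made up, and re.search is assumed to be how patterns are applied.

```python
import re

# Hypothetical registry: a plain dict standing in for the real pattern registry.
# The two regexes are copied from the issue text.
registry = {
    "wget_pipe_sh": r"\bwget\b.*\|\s*(ba)?sh\b",
    "wget_pipe_bash": r"\bwget\b.*\|\s*bash\b",
}


def flagged(command, patterns):
    """Return the sorted names of all patterns that match the command."""
    return sorted(name for name, rx in patterns.items() if re.search(rx, command))


# Simulate remove_pattern("wget_pipe_sh") by dropping the key.
trimmed = {name: rx for name, rx in registry.items() if name != "wget_pipe_sh"}

sh_command = "wget -qO- https://example.com/setup | sh"  # illustrative input
before = flagged(sh_command, registry)  # ['wget_pipe_sh']
after = flagged(sh_command, trimmed)    # []
```

With both patterns present the plain-sh command is flagged once; with only wget_pipe_bash left it is not flagged at all.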

Location

  • File: src/cleveragents/tui/shell_safety/pattern_registry.py
  • Lines: 92–130 (wget_pipe_sh, curl_pipe_sh, wget_pipe_bash, curl_pipe_bash)

Description

wget_pipe_sh uses the regex r"\bwget\b.*\|\s*(ba)?sh\b". The (ba)?sh group matches both sh and bash. Therefore this pattern already matches wget | bash.

wget_pipe_bash then defines r"\bwget\b.*\|\s*bash\b", which is a strict subset of wget_pipe_sh. Every string that matches wget_pipe_bash will already have been caught by wget_pipe_sh:

# pattern_registry.py
DangerousPattern(
    name="wget_pipe_sh",
    pattern=r"\bwget\b.*\|\s*(ba)?sh\b",   # matches: "wget ... | sh"  AND "wget ... | bash"
    level=ShellDangerLevel.MEDIUM,
    ...
),
...
DangerousPattern(
    name="wget_pipe_bash",
    pattern=r"\bwget\b.*\|\s*bash\b",       # matches: "wget ... | bash" ONLY ← already matched above
    level=ShellDangerLevel.MEDIUM,
    ...
),

The same applies to curl_pipe_sh / curl_pipe_bash:

DangerousPattern(
    name="curl_pipe_sh",
    pattern=r"\bcurl\b.*\|\s*(ba)?sh\b",   # matches both sh and bash
    ...
),
DangerousPattern(
    name="curl_pipe_bash",
    pattern=r"\bcurl\b.*\|\s*bash\b",      # redundant subset
    ...
),
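
The subset relation can be checked directly with Python's re module. The sketch below copies the two wget regexes from the issue; the sample commands are illustrative, and the use of re.search (rather than re.match) is an assumption about how the detector applies its patterns.

```python
import re

# Regexes copied from the issue text.
WGET_PIPE_SH = re.compile(r"\bwget\b.*\|\s*(ba)?sh\b")
WGET_PIPE_BASH = re.compile(r"\bwget\b.*\|\s*bash\b")

# Illustrative commands, not taken from the project's test suite.
samples = [
    "wget -qO- https://example.com/install.sh | sh",
    "wget -qO- https://example.com/install.sh | bash",
    "wget https://example.com/file.tar.gz",
]

# Map each command to (matched by wget_pipe_sh, matched by wget_pipe_bash).
results = {
    cmd: (WGET_PIPE_SH.search(cmd) is not None, WGET_PIPE_BASH.search(cmd) is not None)
    for cmd in samples
}

# Subset property: whenever the bash pattern matches, the sh pattern matches too.
bash_implies_sh = all(sh_hit for sh_hit, bash_hit in results.values() if bash_hit)
```

For these samples the sh pattern matches the first two commands and the bash pattern only the second, so no command is caught by the bash pattern alone.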

Since DangerousPatternDetector.check_first() returns the first matching pattern in insertion order, and wget_pipe_sh / curl_pipe_sh appear before wget_pipe_bash / curl_pipe_bash, the bash-specific patterns will never fire in normal usage.
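
The first-match behaviour can be sketched with a small stand-in detector. Everything here except the pattern names and regexes is hypothetical: the Pattern class, the module-level check() and check_first() functions, and the sample command only imitate the behaviour described in this issue and are not the real DangerousPatternDetector API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical stand-in for DangerousPattern (name and regex only)."""
    name: str
    regex: str


# Insertion order mirrors the issue: the sh patterns precede the bash ones.
PATTERNS = [
    Pattern("wget_pipe_sh", r"\bwget\b.*\|\s*(ba)?sh\b"),
    Pattern("curl_pipe_sh", r"\bcurl\b.*\|\s*(ba)?sh\b"),
    Pattern("wget_pipe_bash", r"\bwget\b.*\|\s*bash\b"),
    Pattern("curl_pipe_bash", r"\bcurl\b.*\|\s*bash\b"),
]


def check(command):
    """Return the names of all matching patterns, in insertion order."""
    return [p.name for p in PATTERNS if re.search(p.regex, command)]


def check_first(command):
    """Return the name of the first matching pattern, or None."""
    found = check(command)
    return found[0] if found else None


command = "curl -fsSL https://example.com/install.sh | bash"  # illustrative input
first = check_first(command)  # 'curl_pipe_sh'
every = check(command)        # ['curl_pipe_sh', 'curl_pipe_bash']
```

Under these assumptions the bash-specific name never comes back from check_first(), and it only ever appears in check() alongside the sh name.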

Expected Behavior

Each pattern in DEFAULT_PATTERNS should detect a distinct class of dangerous commands that no other pattern covers. The pattern descriptions present these as different patterns, but the bash-specific patterns match nothing that the sh patterns do not already match.

Actual Behavior

wget_pipe_bash and curl_pipe_bash are effectively dead code: for any command they match, check_first() returns wget_pipe_sh or curl_pipe_sh first. check() (all matches) returns both names, but the second match adds no information.

Suggested Fix

Option A — Remove the redundant bash-specific patterns:

# Remove wget_pipe_bash and curl_pipe_bash entirely.
# wget_pipe_sh and curl_pipe_sh already cover both sh and bash.

Option B — Rename and refine so that sh and bash are matched by two disjoint patterns, which would allow each to carry its own level (both are shown as MEDIUM here):

DangerousPattern(
    name="pipe_to_sh",
    pattern=r"\b(wget|curl)\b.*\|\s*sh\b",   # strictly sh only
    level=ShellDangerLevel.MEDIUM,
),
DangerousPattern(
    name="pipe_to_bash",
    pattern=r"\b(wget|curl)\b.*\|\s*bash\b",  # strictly bash only
    level=ShellDangerLevel.MEDIUM,
),
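
Whether the Option B patterns really are disjoint can be checked with a short sketch. The regexes are copied from Option B; the classify() helper and the sample commands are hypothetical, and re.search is assumed as the matching mode.

```python
import re

# Regexes copied from Option B in the issue text.
PIPE_TO_SH = re.compile(r"\b(wget|curl)\b.*\|\s*sh\b")
PIPE_TO_BASH = re.compile(r"\b(wget|curl)\b.*\|\s*bash\b")


def classify(command):
    """Hypothetical helper: list which Option B patterns match a command."""
    names = []
    if PIPE_TO_SH.search(command):
        names.append("pipe_to_sh")
    if PIPE_TO_BASH.search(command):
        names.append("pipe_to_bash")
    return names


sh_only = classify("curl -s https://example.com/i | sh")
bash_only = classify("wget -qO- https://example.com/i | bash")
chained = classify("curl -s https://example.com/i | sh | bash")
```

A command with a single pipe is reported under exactly one name. A chained command that pipes into both shells still matches both patterns, which in that case is the intended result rather than a duplicate.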

Category

consistency

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-09 22:25:28 +00:00
Reference
cleveragents/cleveragents-core#6612