BUG-HUNT: performance - register_pattern allows duplicate patterns causing O(n) redundant redaction work #7773

Open
opened 2026-04-12 03:31:09 +00:00 by HAL9000 · 3 comments
Owner

Bug Report: Performance — register_pattern Has No Deduplication Guard

Severity Assessment

  • Impact: If register_pattern is called multiple times with the same pattern (e.g., during repeated initialization, hot-reload, or misconfigured setup), _SECRET_PATTERNS grows without bound. Each call to redact_value then runs one redundant re.sub pass over the input string per duplicate, so its cost grows linearly, O(n), with the number of duplicate registrations.
  • Likelihood: Low-Medium — likely triggered during testing, retry logic, or any code path that re-registers patterns on repeated calls.
  • Priority: Low

Location

  • File: src/cleveragents/shared/redaction.py
  • Function/Class: register_pattern
  • Lines: 218–232

Description

register_pattern compiles the new pattern and appends it to _SECRET_PATTERNS without checking whether an identical pattern already exists. Patterns are stored as compiled re.Pattern objects, which keep their source string in the .pattern attribute, so duplicates can be detected by comparing those strings. No such guard exists, so the same regex string can be registered any number of times.
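As a small illustration of the comparison described above, the sketch below uses only the standard re module. The variable names are made up for this example. It also shows one caveat: the source string does not capture flags. That does not matter for register_pattern as quoted in this issue, since it compiles without flags, but it would matter if a flags parameter were ever added.

```python
import re

# Two independent compilations of the same source string.
a = re.compile(r"my-secret-[A-Za-z0-9]{20,}")
b = re.compile(r"my-secret-[A-Za-z0-9]{20,}")
# Same source string, different flags.
c = re.compile(r"my-secret-[A-Za-z0-9]{20,}", re.IGNORECASE)

# Comparing .pattern detects that a and b come from the same regex string.
same_source = a.pattern == b.pattern

# .pattern alone cannot tell a and c apart, because flags are stored separately.
same_source_different_flags = a.pattern == c.pattern

# A (pattern, flags) key distinguishes them if flags ever become relevant.
same_key = (a.pattern, a.flags) == (c.pattern, c.flags)
```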

Evidence

# redaction.py lines 228–232
def register_pattern(pattern: str) -> None:
    if not pattern:
        raise ValueError("pattern cannot be empty")
    compiled = re.compile(pattern)
    with _patterns_lock:
        _SECRET_PATTERNS.append(compiled)  # No deduplication check!

Demonstration:

for _ in range(100):
    register_pattern(r"my-secret-[A-Za-z0-9]{20,}")

# _SECRET_PATTERNS now has 106 entries (6 built-in + 100 copies of the same pattern)
# Every call to redact_value() runs 106 regex substitutions instead of 7
value = "my-secret-abcdefghijklmnopqrstuvwxyz"
result = redact_value(value)  # 99 of those subs are redundant

Expected Behavior

register_pattern should be idempotent: registering the same pattern string multiple times should have no additional effect after the first registration.
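One way to pin down the idempotency requirement is a check that registers a pattern twice and compares the list of source strings after each call. The helper and both register functions below are hypothetical and exist only for this sketch. A real test would exercise register_pattern and _SECRET_PATTERNS from redaction.py instead.

```python
import re
from typing import Callable, List


def is_idempotent(register: Callable[[str], None], patterns: List, pattern: str) -> bool:
    """Return True if registering `pattern` a second time leaves `patterns` unchanged."""
    register(pattern)
    after_first = [p.pattern for p in patterns]
    register(pattern)
    after_second = [p.pattern for p in patterns]
    return after_first == after_second


# Stand-in mirroring the current behavior: always appends.
naive_patterns = []


def naive_register(pattern: str) -> None:
    naive_patterns.append(re.compile(pattern))


# Stand-in with a guard: appends only if the source string is new.
guarded_patterns = []


def guarded_register(pattern: str) -> None:
    if all(p.pattern != pattern for p in guarded_patterns):
        guarded_patterns.append(re.compile(pattern))


naive_ok = is_idempotent(naive_register, naive_patterns, r"token-\d+")
guarded_ok = is_idempotent(guarded_register, guarded_patterns, r"token-\d+")
```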

Actual Behavior

Each call appends a new compiled pattern regardless of duplicates, causing unbounded list growth and proportionally increasing the cost of every redact_value call.

Suggested Fix

Check for existing pattern strings before appending:

def register_pattern(pattern: str) -> None:
    if not pattern:
        raise ValueError("pattern cannot be empty")
    compiled = re.compile(pattern)
    with _patterns_lock:
        existing = {p.pattern for p in _SECRET_PATTERNS}
        if compiled.pattern not in existing:
            _SECRET_PATTERNS.append(compiled)

Category

performance

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-12 03:45:06 +00:00
Author
Owner

Verified — Performance bug: duplicate patterns cause O(n) redundant redaction work. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7773