BUG-HUNT: [consistency] structlog processor handles nested data inconsistently compared to redact_dict #7222

Open
opened 2026-04-10 09:19:40 +00:00 by HAL9000 · 3 comments
Owner

Background and Context

The secrets_masking_processor() function in src/cleveragents/shared/redaction.py and the redact_dict() function handle nested data structures inconsistently. Specifically, the structlog processor does not handle top-level list values in the event dict, while _redact_dict_inner() (used by redact_dict()) fully recurses into lists containing dicts or strings.

The redaction module is a cross-cutting utility integrated into the structlog processor chain, CLI output formatters, and error detail formatting (see module docstring). Because secrets_masking_processor() sits on every log path, any log event that passes a list value at the top level of the event dict will silently skip redaction for all secrets contained within that list.

Specification reference: The project's fail-fast and argument validation principles (CONTRIBUTING.md §"Error and Exception Handling") require consistent, predictable behaviour across all public functions. Two functions in the same module that perform the same logical operation (secrets redaction) must apply identical logic.

Current Behaviour

_redact_dict_inner() (lines 170–189) handles lists at every nesting level:

def _redact_dict_inner(data: dict[str, Any]) -> dict[str, Any]:
    result: dict[str, Any] = {}
    for key, value in data.items():
        if is_sensitive_key(key):
            result[key] = REDACTED
        elif isinstance(value, dict):
            result[key] = _redact_dict_inner(value)
        elif isinstance(value, str):
            result[key] = redact_value(value)
        elif isinstance(value, list):          # ← list branch present
            result[key] = [
                _redact_dict_inner(item)
                if isinstance(item, dict)
                else (redact_value(item) if isinstance(item, str) else item)
                for item in value
            ]
        else:
            result[key] = value
    return result

secrets_masking_processor() (lines 260–269) omits the list branch entirely:

def secrets_masking_processor(logger, method_name, event_dict):
    if get_show_secrets():
        return event_dict

    result: dict[str, Any] = {}
    for key, value in event_dict.items():
        if is_sensitive_key(key):
            result[key] = REDACTED
        elif isinstance(value, str):
            result[key] = redact_value(value)
        elif isinstance(value, dict):
            result[key] = _redact_dict_inner(value)
        else:
            result[key] = value   # ← lists fall through here, unredacted
    return result

Concrete example of missed redaction:

import structlog
log = structlog.get_logger()

# A log event with a list of dicts containing a secret
log.info(
    "tool_results",
    results=[
        {"api_key": "sk-proj-AbCdEfGhIjKlMnOpQrStUvWxYz1234567890"},
        {"status": "ok"},
    ]
)
# → The api_key value is NOT redacted because the processor
#   falls through to the `else` branch for list values.

Expected Behaviour

secrets_masking_processor() should apply the same list-handling logic as _redact_dict_inner(), so that secrets in list values at the top level of the event dict are redacted:

# After fix — list values are processed identically to _redact_dict_inner
log.info(
    "tool_results",
    results=[
        {"api_key": "***REDACTED***"},   # ← correctly redacted
        {"status": "ok"},
    ]
)

Suggested Fix

Add a list branch to secrets_masking_processor() that mirrors the list handling in _redact_dict_inner():

elif isinstance(value, list):
    result[key] = [
        _redact_dict_inner(item)
        if isinstance(item, dict)
        else (redact_value(item) if isinstance(item, str) else item)
        for item in value
    ]

Alternatively, extract the per-value redaction logic into a shared helper (e.g. _redact_value_any()) and call it from both _redact_dict_inner() and secrets_masking_processor() to prevent future divergence.

Acceptance Criteria

  1. secrets_masking_processor() redacts secrets in list values at the top level of the event dict.
  2. secrets_masking_processor() redacts secrets in dicts nested inside top-level lists.
  3. secrets_masking_processor() redacts secrets in strings nested inside top-level lists.
  4. Behaviour is consistent with redact_dict() / _redact_dict_inner() for all input shapes.
  5. BDD scenarios cover: top-level list of dicts with sensitive keys, top-level list of strings with secret patterns, mixed list (dict + string + int), and empty list.
  6. All nox stages pass; coverage ≥ 97%.

Metadata

  • Branch: bugfix/redaction-structlog-processor-list-handling-consistency
  • Commit Message: fix(redaction): align secrets_masking_processor list handling with _redact_dict_inner
  • Milestone: N/A — Backlog (see note below)
  • Parent Epic: #5502

Backlog note: This issue was discovered during autonomous operation
on milestone v3.2.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Subtasks

  • Reproduce the missed redaction with a BDD scenario tagged @tdd_issue, @tdd_issue_<N>, @tdd_expected_fail that passes a list value containing a sensitive dict through secrets_masking_processor() and asserts the secret is redacted
  • Add list branch to secrets_masking_processor() mirroring _redact_dict_inner() list handling (or extract shared helper)
  • Remove @tdd_expected_fail tags from TDD scenarios after fix is applied
  • Add BDD scenarios for: top-level list of dicts with sensitive keys, list of strings with secret patterns, mixed list, empty list
  • Add Robot Framework integration test verifying end-to-end structlog processor chain redacts list values
  • Verify nox -s typecheck passes (no # type: ignore)
  • Verify nox -s security_scan passes
  • Verify nox -s unit_tests and nox -s coverage_report — coverage ≥ 97%

Definition of Done

  • secrets_masking_processor() correctly redacts secrets in top-level list values
  • secrets_masking_processor() behaviour is consistent with redact_dict() for all input shapes
  • BDD scenarios exist for all acceptance criteria cases
  • No regression in existing redaction behaviour
  • All nox stages pass
  • Coverage ≥ 97%

Automated by CleverAgents Bot
Supervisor: Acting on behalf of: Bug Hunt | Agent: new-issue-creator

## Background and Context The `secrets_masking_processor()` function in `src/cleveragents/shared/redaction.py` and the `redact_dict()` function handle nested data structures inconsistently. Specifically, the structlog processor does **not** handle top-level `list` values in the event dict, while `_redact_dict_inner()` (used by `redact_dict()`) fully recurses into lists containing dicts or strings. The redaction module is a cross-cutting utility integrated into the structlog processor chain, CLI output formatters, and error detail formatting (see module docstring). Because `secrets_masking_processor()` sits on every log path, any log event that passes a list value at the top level of the event dict will silently skip redaction for all secrets contained within that list. **Specification reference**: The project's fail-fast and argument validation principles (CONTRIBUTING.md §"Error and Exception Handling") require consistent, predictable behaviour across all public functions. Two functions in the same module that perform the same logical operation (secrets redaction) must apply identical logic. ## Current Behaviour `_redact_dict_inner()` (lines 170–189) handles lists at every nesting level: ```python def _redact_dict_inner(data: dict[str, Any]) -> dict[str, Any]: result: dict[str, Any] = {} for key, value in data.items(): if is_sensitive_key(key): result[key] = REDACTED elif isinstance(value, dict): result[key] = _redact_dict_inner(value) elif isinstance(value, str): result[key] = redact_value(value) elif isinstance(value, list): # ← list branch present result[key] = [ _redact_dict_inner(item) if isinstance(item, dict) else (redact_value(item) if isinstance(item, str) else item) for item in value ] else: result[key] = value return result ``` `secrets_masking_processor()` (lines 260–269) **omits the `list` branch entirely**: ```python def secrets_masking_processor(logger, method_name, event_dict): if get_show_secrets(): return event_dict result: dict[str, Any] = {} for key, value in event_dict.items(): if is_sensitive_key(key): result[key] = REDACTED elif isinstance(value, str): result[key] = redact_value(value) elif isinstance(value, dict): result[key] = _redact_dict_inner(value) else: result[key] = value # ← lists fall through here, unredacted return result ``` **Concrete example of missed redaction:** ```python import structlog log = structlog.get_logger() # A log event with a list of dicts containing a secret log.info( "tool_results", results=[ {"api_key": "sk-proj-AbCdEfGhIjKlMnOpQrStUvWxYz1234567890"}, {"status": "ok"}, ] ) # → The api_key value is NOT redacted because the processor # falls through to the `else` branch for list values. ``` ## Expected Behaviour `secrets_masking_processor()` should apply the same list-handling logic as `_redact_dict_inner()`, so that secrets in list values at the top level of the event dict are redacted: ```python # After fix — list values are processed identically to _redact_dict_inner log.info( "tool_results", results=[ {"api_key": "***REDACTED***"}, # ← correctly redacted {"status": "ok"}, ] ) ``` ## Suggested Fix Add a `list` branch to `secrets_masking_processor()` that mirrors the list handling in `_redact_dict_inner()`: ```python elif isinstance(value, list): result[key] = [ _redact_dict_inner(item) if isinstance(item, dict) else (redact_value(item) if isinstance(item, str) else item) for item in value ] ``` Alternatively, extract the per-value redaction logic into a shared helper (e.g. `_redact_value_any()`) and call it from both `_redact_dict_inner()` and `secrets_masking_processor()` to prevent future divergence. ## Acceptance Criteria 1. `secrets_masking_processor()` redacts secrets in list values at the top level of the event dict. 2. `secrets_masking_processor()` redacts secrets in dicts nested inside top-level lists. 3. `secrets_masking_processor()` redacts secrets in strings nested inside top-level lists. 4. Behaviour is consistent with `redact_dict()` / `_redact_dict_inner()` for all input shapes. 5. BDD scenarios cover: top-level list of dicts with sensitive keys, top-level list of strings with secret patterns, mixed list (dict + string + int), and empty list. 6. All nox stages pass; coverage ≥ 97%. ## Metadata - **Branch**: `bugfix/redaction-structlog-processor-list-handling-consistency` - **Commit Message**: `fix(redaction): align secrets_masking_processor list handling with _redact_dict_inner` - **Milestone**: N/A — Backlog (see note below) - **Parent Epic**: #5502 > **Backlog note:** This issue was discovered during autonomous operation > on milestone v3.2.0. It does not block milestone completion and has been > placed in the backlog for human review and future milestone assignment. ## Subtasks - [ ] Reproduce the missed redaction with a BDD scenario tagged `@tdd_issue`, `@tdd_issue_<N>`, `@tdd_expected_fail` that passes a list value containing a sensitive dict through `secrets_masking_processor()` and asserts the secret is redacted - [ ] Add `list` branch to `secrets_masking_processor()` mirroring `_redact_dict_inner()` list handling (or extract shared helper) - [ ] Remove `@tdd_expected_fail` tags from TDD scenarios after fix is applied - [ ] Add BDD scenarios for: top-level list of dicts with sensitive keys, list of strings with secret patterns, mixed list, empty list - [ ] Add Robot Framework integration test verifying end-to-end structlog processor chain redacts list values - [ ] Verify `nox -s typecheck` passes (no `# type: ignore`) - [ ] Verify `nox -s security_scan` passes - [ ] Verify `nox -s unit_tests` and `nox -s coverage_report` — coverage ≥ 97% ## Definition of Done - [ ] `secrets_masking_processor()` correctly redacts secrets in top-level list values - [ ] `secrets_masking_processor()` behaviour is consistent with `redact_dict()` for all input shapes - [ ] BDD scenarios exist for all acceptance criteria cases - [ ] No regression in existing redaction behaviour - [ ] All nox stages pass - [ ] Coverage ≥ 97% --- **Automated by CleverAgents Bot** Supervisor: Acting on behalf of: Bug Hunt | Agent: new-issue-creator
Author
Owner

Verified — Consistency bug: structlog processor handles nested data inconsistently. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Consistency bug: structlog processor handles nested data inconsistently. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Consistency bug: structlog processor handles nested data inconsistently. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Consistency bug: structlog processor handles nested data inconsistently. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Consistency bug: structlog processor handles nested data inconsistently. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Consistency bug: structlog processor handles nested data inconsistently. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core#7222
No description provided.