UAT: ReactiveStreamRouter._route_to_llm() applies prompt boundary markers (mechanism 2) but skips sanitize_user_input() (mechanism 1) — prompt injection mechanism 1 bypassed in reactive routing path #3965

Open
opened 2026-04-06 07:57:10 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/security-stream-router-sanitize-user-input
  • Commit Message: fix(security): call sanitize_user_input() in ReactiveStreamRouter before wrapping with boundary markers
  • Milestone: (none — backlog)
  • Parent Epic: #362

Bug Report

What Was Tested

Prompt injection mitigation in the reactive stream router's LLM invocation path.

Expected Behavior (from spec)

Per docs/specification.md §Security Model — Prompt Injection Mitigation:

Mechanism 1 — Input sanitization: User-provided text in action arguments, invariant text, and session prompts is sanitized before inclusion in LLM prompts. HTML entities, control characters, and known injection patterns are escaped or rejected.

The spec requires both mechanism 1 (input sanitization) and mechanism 2 (prompt boundary markers) to be applied before user content reaches the LLM.

Actual Behavior

ReactiveStreamRouter._route_to_llm() in src/cleveragents/reactive/stream_router.py applies mechanism 2 (boundary markers) but skips mechanism 1 (sanitize_user_input()):

# stream_router.py lines 240-259
def _route_to_llm(self, content, context=None):
    prompt = content if isinstance(content, str) else str(content or "")
    ctx = context or {}
    rendered_system = self._render_prompt(self.system_prompt, ctx)

    # Mechanism 2: augment system prompt with boundary instructions
    # and wrap user content with boundary markers
    if rendered_system:
        rendered_system = _SANITIZER.augment_system_prompt(rendered_system)
    prompt = _SANITIZER.wrap_user_content(prompt)  # ← Mechanism 2 only
    # ← Missing: _SANITIZER.sanitize_user_input(prompt) before wrapping!

    messages = []
    if rendered_system:
        messages.append(SystemMessage(content=rendered_system))
    messages.append(HumanMessage(content=prompt))

This means user-provided content flowing through the reactive routing path (e.g., actor run commands) bypasses:

  • HTML entity escaping
  • Control character stripping
  • Known injection pattern detection (ignore all previous instructions, you are now a, [SYSTEM], <|im_start|>, etc.)
  • #3653PlanGenerationGraph has the same issue (mechanism 1 bypassed)
  • #3319 — Covers auto_debug.py and context_analysis.py but not stream_router.py
  • #2552PromptSanitizer not integrated into A2A request handling

Code Location

  • src/cleveragents/reactive/stream_router.py:241-249_route_to_llm() method

Fix Required

Call sanitize_user_input() before wrap_user_content():

# Before wrapping with boundary markers, sanitize the raw user input
sanitization_result = _SANITIZER.sanitize_user_input(prompt)
prompt = sanitization_result.sanitized
# Then apply mechanism 2
prompt = _SANITIZER.wrap_user_content(prompt)

Note: sanitize_user_input() raises PromptInjectionDetected if a known injection pattern is found. The caller must handle this exception appropriately (e.g., return an error to the user).

Subtasks

  • Add sanitize_user_input() call in _route_to_llm() before wrap_user_content()
  • Handle PromptInjectionDetected exception in the routing path
  • Add BDD scenario verifying that injection patterns are rejected in the reactive routing path
  • Verify nox -e unit_tests passes

Definition of Done

  • _route_to_llm() calls sanitize_user_input() before wrapping with boundary markers
  • PromptInjectionDetected is handled and returns an appropriate error to the caller
  • A BDD scenario exists verifying injection patterns are blocked in the reactive path
  • nox -e unit_tests and nox -e typecheck pass
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

## Metadata - **Branch**: `fix/security-stream-router-sanitize-user-input` - **Commit Message**: `fix(security): call sanitize_user_input() in ReactiveStreamRouter before wrapping with boundary markers` - **Milestone**: _(none — backlog)_ - **Parent Epic**: #362 ## Bug Report ### What Was Tested Prompt injection mitigation in the reactive stream router's LLM invocation path. ### Expected Behavior (from spec) Per `docs/specification.md` §Security Model — Prompt Injection Mitigation: > **Mechanism 1 — Input sanitization**: User-provided text in action arguments, invariant text, and session prompts is sanitized before inclusion in LLM prompts. HTML entities, control characters, and known injection patterns are escaped or rejected. The spec requires **both** mechanism 1 (input sanitization) and mechanism 2 (prompt boundary markers) to be applied before user content reaches the LLM. ### Actual Behavior `ReactiveStreamRouter._route_to_llm()` in `src/cleveragents/reactive/stream_router.py` applies mechanism 2 (boundary markers) but **skips mechanism 1** (`sanitize_user_input()`): ```python # stream_router.py lines 240-259 def _route_to_llm(self, content, context=None): prompt = content if isinstance(content, str) else str(content or "") ctx = context or {} rendered_system = self._render_prompt(self.system_prompt, ctx) # Mechanism 2: augment system prompt with boundary instructions # and wrap user content with boundary markers if rendered_system: rendered_system = _SANITIZER.augment_system_prompt(rendered_system) prompt = _SANITIZER.wrap_user_content(prompt) # ← Mechanism 2 only # ← Missing: _SANITIZER.sanitize_user_input(prompt) before wrapping! messages = [] if rendered_system: messages.append(SystemMessage(content=rendered_system)) messages.append(HumanMessage(content=prompt)) ``` This means user-provided content flowing through the reactive routing path (e.g., actor `run` commands) bypasses: - HTML entity escaping - Control character stripping - Known injection pattern detection (`ignore all previous instructions`, `you are now a`, `[SYSTEM]`, `<|im_start|>`, etc.) ### Related Issues - #3653 — `PlanGenerationGraph` has the same issue (mechanism 1 bypassed) - #3319 — Covers `auto_debug.py` and `context_analysis.py` but not `stream_router.py` - #2552 — `PromptSanitizer` not integrated into A2A request handling ### Code Location - `src/cleveragents/reactive/stream_router.py:241-249` — `_route_to_llm()` method ### Fix Required Call `sanitize_user_input()` before `wrap_user_content()`: ```python # Before wrapping with boundary markers, sanitize the raw user input sanitization_result = _SANITIZER.sanitize_user_input(prompt) prompt = sanitization_result.sanitized # Then apply mechanism 2 prompt = _SANITIZER.wrap_user_content(prompt) ``` Note: `sanitize_user_input()` raises `PromptInjectionDetected` if a known injection pattern is found. The caller must handle this exception appropriately (e.g., return an error to the user). ## Subtasks - [ ] Add `sanitize_user_input()` call in `_route_to_llm()` before `wrap_user_content()` - [ ] Handle `PromptInjectionDetected` exception in the routing path - [ ] Add BDD scenario verifying that injection patterns are rejected in the reactive routing path - [ ] Verify `nox -e unit_tests` passes ## Definition of Done - [ ] `_route_to_llm()` calls `sanitize_user_input()` before wrapping with boundary markers - [ ] `PromptInjectionDetected` is handled and returns an appropriate error to the caller - [ ] A BDD scenario exists verifying injection patterns are blocked in the reactive path - [ ] `nox -e unit_tests` and `nox -e typecheck` pass - All nox stages pass - Coverage >= 97% --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-uat-tester
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#3965
No description provided.