BUG-HUNT: [boundary] LLMExecuteActor produces empty ChangeSet with no warning when LLM response contains no FILE: blocks #6407

Open
opened 2026-04-09 21:01:02 +00:00 by HAL9000 · 0 comments
Owner

Bug Report: [boundary] Empty or malformed LLM response silently yields a zero-entry ChangeSet with no user feedback

Severity Assessment

  • Impact: When the LLM response contains no FILE: <path> blocks (e.g., the LLM returns a prose explanation, apologizes, or produces output in an unexpected format), _parse_file_blocks() returns an empty list. The execute phase completes with tool_calls_count=0 and no files written. The caller receives an ExecuteResult that looks successful but has accomplished nothing. There is no warning log entry, no raised exception, and no indication in the result that code generation actually failed.
  • Likelihood: Medium. LLMs regularly produce non-conforming output, especially when context is missing, the prompt is ambiguous, or the model is rate-limited and truncates output.
  • Priority: Medium

Location

  • File: src/cleveragents/application/services/llm_actors.py
  • Function: LLMExecuteActor.execute
  • Lines: 417–452

Description

After invoking the LLM, LLMExecuteActor.execute() calls _parse_file_blocks(content, plan_id) to extract file changes. If the LLM response contains no matching FILE: <path> + fenced-code-block patterns, the method returns an empty entries list. The code continues without any warning, producing:

  • ChangeSet(plan_id=plan_id, entries=[]) — empty, no changes
  • ExecuteResult(tool_calls_count=0, ...) — looks like success

The stream_callback receives execute_complete with tool_calls_count=0, but there is no execute_no_output or execute_parse_failed event, and no log warning about the empty parse result.

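Additionally, the content variable (the raw LLM response string) is logged only at DEBUG level, truncated to a preview (500 characters per the comment in the evidence below), and is not included in the result. When DEBUG logging is disabled, there is no way to diagnose why parsing failed after the fact.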
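To make the failure mode concrete, here is a minimal sketch of a FILE:-block parser. The regular expression and the standalone parse_file_blocks function are assumptions for illustration only; the real _parse_file_blocks() may use a different grammar and returns ChangeSet entries rather than tuples. The point is only that a prose-only response produces an empty list, which the caller cannot tell apart from a legitimate result.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical grammar: a "FILE: <path>" line followed by a fenced code block.
FILE_BLOCK_RE = re.compile(
    rf"^FILE:[ \t]*(?P<path>\S+)[ \t]*\n{FENCE}[^\n]*\n(?P<body>.*?)\n{FENCE}",
    re.MULTILINE | re.DOTALL,
)


def parse_file_blocks(content: str) -> list[tuple[str, str]]:
    """Return (path, body) pairs; an empty list means nothing matched."""
    return [(m.group("path"), m.group("body")) for m in FILE_BLOCK_RE.finditer(content)]


conforming = f"FILE: src/app.py\n{FENCE}python\nprint('hi')\n{FENCE}\n"
prose_only = "I'm sorry, I need more context before I can write this code."

print(parse_file_blocks(conforming))  # one (path, body) pair
print(parse_file_blocks(prose_only))  # empty list, with no error raised
```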

Evidence

# llm_actors.py, lines 395-452
response = llm.invoke([HumanMessage(content=prompt)])
content = response.content if hasattr(response, "content") else str(response)

self._logger.debug(
    "LLM execute response",
    plan_id=plan_id,
    response_preview=content[:_LOG_RESPONSE_CHARS],  # ← debug only, 500 chars
)

# ... stream per-decision progress ...

entries = self._parse_file_blocks(content, plan_id)  # ← returns [] if no FILE: blocks
changeset_id = str(ULID())
changeset = ChangeSet(plan_id=plan_id, entries=entries)  # empty

if sandbox_root is not None and not read_only:
    self._write_to_sandbox(entries, sandbox_root, content)  # no-op

# No check here for len(entries) == 0!

if stream_callback is not None:
    stream_callback(
        "execute_complete",
        {
            "plan_id": plan_id,
            "changeset_id": changeset_id,
            "tool_calls_count": len(entries),  # reports 0, caller sees "success"
        },
    )

self._logger.info(
    "LLM execute completed",
    plan_id=plan_id,
    changeset_id=changeset_id,
    entry_count=len(entries),  # 0, but no WARNING level log
)

return ExecuteResult(
    changeset_id=changeset_id,
    changeset=changeset,      # empty
    tool_calls_count=0,
    sandbox_refs=sandbox_refs,
)

Expected Behavior

When _parse_file_blocks() returns an empty list, the actor should:

  1. Log a WARNING (not just DEBUG) including the raw LLM response content for diagnostics.
  2. Either raise an error (forcing the caller to handle the failure) or emit a stream_callback("execute_no_output", ...) event so users and orchestrators can detect and react to the empty result.
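The "raise an error" option in item 2 could look roughly like the sketch below. NoFileBlocksError and require_entries are hypothetical names that do not exist in the codebase; this shows one possible shape under that assumption, not a decided design.

```python
class NoFileBlocksError(RuntimeError):
    """Hypothetical error: the LLM response produced no ChangeSet entries."""

    def __init__(self, plan_id: str, response_length: int) -> None:
        super().__init__(
            f"plan {plan_id}: LLM response ({response_length} chars) "
            "contained no FILE: blocks"
        )
        self.plan_id = plan_id
        self.response_length = response_length


def require_entries(entries: list, plan_id: str, content: str) -> list:
    """Return entries unchanged, or raise if parsing yielded nothing."""
    if not entries:
        raise NoFileBlocksError(plan_id, len(content))
    return entries


try:
    require_entries([], "plan-1", "Sorry, I cannot do that.")
except NoFileBlocksError as exc:
    print(f"caller must handle: {exc}")
```

Raising forces every caller to handle the empty case explicitly, whereas a stream event leaves the return value unchanged and relies on consumers subscribing to it.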

Actual Behavior

An empty ChangeSet is returned as if the execute phase succeeded. The log only shows an INFO entry with entry_count=0. There is no way to distinguish a genuine "no changes needed" result from a parse failure.

Suggested Fix

Add a guard after parsing:

entries = self._parse_file_blocks(content, plan_id)
if not entries:
    self._logger.warning(
        "execute_no_file_blocks_in_response",
        plan_id=plan_id,
        response_preview=content[:_LOG_RESPONSE_CHARS],
        response_length=len(content),
    )
    if stream_callback is not None:
        stream_callback("execute_no_output", {
            "plan_id": plan_id,
            "reason": "LLM response contained no FILE: blocks",
        })
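The guard above can be exercised in isolation with a sketch like the following. It swaps the actor's keyword-argument logger for the standard library logging module, assumes _LOG_RESPONSE_CHARS is 500 (based on the comment in the evidence), and introduces a hypothetical helper name, warn_if_no_entries, so the behavior can be tested without an LLM.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("llm_actors_sketch")  # stand-in for self._logger
_LOG_RESPONSE_CHARS = 500  # assumed value, from the "500 chars" comment

StreamCallback = Callable[[str, dict[str, Any]], None]


def warn_if_no_entries(
    entries: list,
    content: str,
    plan_id: str,
    stream_callback: Optional[StreamCallback] = None,
) -> bool:
    """Log a warning and emit execute_no_output when entries is empty.

    Returns True if the guard fired, False otherwise.
    """
    if entries:
        return False
    logger.warning(
        "execute_no_file_blocks_in_response plan_id=%s response_length=%d preview=%r",
        plan_id,
        len(content),
        content[:_LOG_RESPONSE_CHARS],
    )
    if stream_callback is not None:
        stream_callback(
            "execute_no_output",
            {"plan_id": plan_id, "reason": "LLM response contained no FILE: blocks"},
        )
    return True


events: list[tuple[str, dict[str, Any]]] = []
fired = warn_if_no_entries(
    [], "Sorry, I cannot help.", "plan-1", lambda name, payload: events.append((name, payload))
)
```

As written in the issue, the guard does not return early, so the existing execute_complete event would still follow execute_no_output. Whether to stop at that point is a design choice the issue leaves open.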

Category

boundary

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-09 21:09:18 +00:00
Reference
cleveragents/cleveragents-core#6407