agents/graphs/auto_debug: Add test for _analyze_error node mutating state in-place instead of returning update dict #10494

Closed
opened 2026-04-18 10:11:06 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit: test(agents/graphs/auto_debug): add expected-fail test for _analyze_error in-place state mutation
  • Branch: test/auto-debug-analyze-error-mutation

Background and Context

AutoDebugAgent._analyze_error() currently mutates the input state dict in-place and returns the full state object, violating the LangGraph node contract. This test captures that violation so it can be tracked and fixed.

Expected Behavior

AutoDebugAgent._analyze_error() should return a dict of state updates (only the changed keys), not mutate the input state in-place and return the full state object.

Acceptance Criteria

  • A test exists that is decorated with @tdd_expected_fail and fails (as expected) against the current implementation
  • The test asserts that result is not state (the returned object is not the same object as the input)
  • The test asserts that state["messages"] was not mutated in-place after calling _analyze_error
  • The test asserts that result is a dict
  • CI reports the test as passing (i.e., the expected failure is confirmed) when run against the current buggy implementation
  • The test flips to a real pass once the bug is fixed

Subtasks

  • Add test_analyze_error_returns_update_dict_not_mutated_state to the appropriate test file under tests/agents/graphs/
  • Decorate with @tdd_issue, @tdd_issue_1, and @tdd_expected_fail
  • Verify the test fails (expected) against current implementation
  • Confirm the test will pass once _analyze_error is fixed to return an update dict

Definition of Done

The issue is closed when the test exists in the codebase, is decorated correctly, and the CI pipeline confirms it is an expected failure against the current implementation (or a real pass after the bug fix is merged).


Test Description

Add a test that verifies AutoDebugAgent._analyze_error() returns a dict of state updates (LangGraph node contract) rather than mutating the input state object in-place and returning it.

Failing Scenario

@tdd_issue
@tdd_issue_1
@tdd_expected_fail
def test_analyze_error_returns_update_dict_not_mutated_state():
    """LangGraph node functions must return a dict of updates, not mutate state in-place."""
    from langchain_community.llms import FakeListLLM
    from cleveragents.agents.graphs.auto_debug import AutoDebugAgent, AutoDebugState

    mock_llm = FakeListLLM(responses=["Error analysis: null pointer dereference"])
    agent = AutoDebugAgent(llm=mock_llm)

    state: AutoDebugState = {
        "error_message": "NullPointerException at line 42",
        "code_context": "x = obj.method()",
        "messages": [],
        "context": {},
        "result": None,
        "error": None,
        "metadata": {},
        "attempted_fixes": [],
        "current_fix": {},
        "fix_validated": False,
    }

    original_messages = state["messages"]
    result = agent._analyze_error(state)

    # LangGraph node functions should return a dict of updates, not the full state
    # The returned dict should only contain the keys that changed
    assert isinstance(result, dict), f"Expected dict, got {type(result)}"
    assert result is not state, "_analyze_error must not return the same state object"

    # The original state's messages list must NOT have been mutated
    assert original_messages == [], (
        f"_analyze_error mutated state['messages'] in-place: {original_messages}. "
        "Node functions must return updates, not mutate state."
    )

Root Cause

In src/cleveragents/agents/graphs/auto_debug.py, the _analyze_error() node function mutates the state in-place and returns the full state object:

def _analyze_error(self, state: AutoDebugState) -> AutoDebugState:
    # ...
    state.setdefault("messages", []).append(  # BUG: mutates state in-place!
        {
            "role": "assistant",
            "content": analysis,
            "type": "error_analysis",
        }
    )
    return state  # BUG: returns full state, not update dict

In LangGraph, node functions should return a dict of state updates (only the keys that changed), not the full state object. When a key such as messages is declared with a reducer that appends, returning the full mutated state can cause the update to be applied twice during the merge, leaving duplicate entries in the messages list.

The same pattern exists in _generate_fix() and _validate_fix().
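The duplication risk can be illustrated without LangGraph itself. The sketch below is a simplified model, not LangGraph's implementation: `merge_update`, `REDUCERS`, `buggy_node`, and `fixed_node` are hypothetical names, and the additive reducer on `messages` is an assumption about how `AutoDebugState` declares that key.

```python
import operator
from typing import Any, Callable

# Hypothetical reducer table: keys listed here are combined with the existing
# value instead of being overwritten. Whether AutoDebugState really declares
# a reducer for "messages" is an assumption made for this illustration.
REDUCERS: dict[str, Callable[[Any, Any], Any]] = {"messages": operator.add}


def merge_update(state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Return a new state dict with `update` applied key by key."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(state[key], value) if reducer and key in state else value
    return merged


def buggy_node(state: dict[str, Any]) -> dict[str, Any]:
    # Mirrors the reported bug: append in place, then return the whole state.
    state["messages"].append({"role": "assistant", "content": "analysis"})
    return state


def fixed_node(state: dict[str, Any]) -> dict[str, Any]:
    # Returns only the new item; the reducer is responsible for appending it.
    return {"messages": [{"role": "assistant", "content": "analysis"}]}


buggy_state: dict[str, Any] = {"messages": []}
buggy_merged = merge_update(buggy_state, buggy_node(buggy_state))

fixed_state: dict[str, Any] = {"messages": []}
fixed_merged = merge_update(fixed_state, fixed_node(fixed_state))
```

In this model `buggy_merged["messages"]` ends up holding the same message twice, while `fixed_merged["messages"]` holds it once and `fixed_state` is left untouched.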

Expected Fix

Return only the changed keys:

from typing import Any  # module-level import needed for the return annotation

def _analyze_error(self, state: AutoDebugState) -> dict[str, Any]:
    # ...
    new_message = {
        "role": "assistant",
        "content": analysis,
        "type": "error_analysis",
    }
    return {
        "messages": state.get("messages", []) + [new_message],
    }
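Which value to return for `messages` depends on how the state schema declares that key, and the issue does not show that declaration. A minimal sketch of the two cases follows, using hypothetical standalone functions (`analyze_error_overwrite`, `analyze_error_append`, `_build_message`) rather than the real `AutoDebugAgent` methods:

```python
from typing import Any


def _build_message(analysis: str) -> dict[str, str]:
    """Build the assistant message shown in the issue's snippets."""
    return {"role": "assistant", "content": analysis, "type": "error_analysis"}


def analyze_error_overwrite(state: dict[str, Any], analysis: str) -> dict[str, Any]:
    """Case 1 (assumed): `messages` has no reducer, so the returned value
    replaces the old one. Return the full new list, built without mutation."""
    return {"messages": state.get("messages", []) + [_build_message(analysis)]}


def analyze_error_append(state: dict[str, Any], analysis: str) -> dict[str, Any]:
    """Case 2 (assumed): `messages` has an additive reducer, so the returned
    value is appended to the old one. Return only the new item."""
    return {"messages": [_build_message(analysis)]}
```

Both variants leave the input `state` unchanged and return a dict containing only the `messages` key, so either would satisfy the assertions in the failing scenario.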

Automated by CleverAgents Bot
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor

Author
Owner

Implementation Attempt — Tier 1: haiku — Success

Implemented the TDD expected-fail test for _analyze_error in-place state mutation bug (#10494).

What was done:

  • Created features/tdd_auto_debug_analyze_error_mutation.feature with a scenario tagged @tdd_issue @tdd_issue_10494 @tdd_expected_fail
  • Created features/steps/tdd_auto_debug_analyze_error_mutation_steps.py with Behave step definitions
  • The test captures the bug: _analyze_error() mutates state in-place and returns the full state object instead of a dict of updates (violating LangGraph node contract)
  • The @tdd_expected_fail tag inverts the test result so CI passes while the bug exists
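The result inversion described in the last bullet can be sketched as a plain Python decorator. This is a hypothetical illustration, not the project's `@tdd_expected_fail` implementation (which is a Behave tag here); in particular, treating an unexpected pass as a failure is an assumption of this sketch.

```python
import functools
from typing import Callable


def expected_fail(test_func: Callable[[], None]) -> Callable[[], None]:
    """Hypothetical wrapper: succeed while the wrapped test fails,
    and raise once the wrapped test starts passing."""

    @functools.wraps(test_func)
    def wrapper() -> None:
        try:
            test_func()
        except AssertionError:
            # The known bug is still present, so the run is reported as OK.
            return
        # Assumption: an unexpected pass is flagged so the marker gets removed.
        raise AssertionError(
            f"{test_func.__name__} unexpectedly passed; remove the expected-fail marker"
        )

    return wrapper
```

With a wrapper like this, CI stays green while the bug exists and turns red as soon as the fix lands without the marker being removed.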

Quality gates: lint ✓, typecheck ✓, unit_tests ✓ (TDD expected-fail scenario passes as expected)

PR: #10707 (https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/10707)


Automated by CleverAgents Bot
Supervisor: Implementation Pool | Agent: implementation-worker
