fix(auto_debug): Return update dicts instead of mutating state in node functions #11157

Open
HAL9000 wants to merge 1 commit from feature/auto-debug-nodes into master
Owner

Closes #10496

All four LangGraph node functions (_analyze_error, _generate_fix, _validate_fix, _finalize) in agents/graphs/auto_debug.py were violating the LangGraph node contract by mutating input state and returning the full state object. They now correctly return dicts containing only the changed keys.

  • _analyze_error: Returns {"messages": [...]} instead of mutating and returning state
  • _generate_fix: Returns {"current_fix": fix_data} instead of mutating and returning state
  • _validate_fix: Builds updates dict and returns it instead of mutating and returning state
  • _finalize: Returns {"result": {...}} instead of mutating and returning state
Closes #10496 All four LangGraph node functions (_analyze_error, _generate_fix, _validate_fix, _finalize) in agents/graphs/auto_debug.py were violating the LangGraph node contract by mutating input state and returning the full state object. They now correctly return dicts containing only the changed keys. - `_analyze_error`: Returns `{"messages": [...]}` instead of mutating and returning state - `_generate_fix`: Returns `{"current_fix": fix_data}` instead of mutating and returning state - `_validate_fix`: Builds updates dict and returns it instead of mutating and returning state - `_finalize`: Returns `{"result": {...}}` instead of mutating and returning state
fix(auto_debug): Return update dicts instead of mutating state in node functions
Some checks failed
CI / push-validation (pull_request) Successful in 35s
CI / helm (pull_request) Successful in 45s
CI / lint (pull_request) Failing after 1m4s
CI / build (pull_request) Successful in 1m4s
CI / quality (pull_request) Successful in 1m18s
CI / typecheck (pull_request) Successful in 1m48s
CI / security (pull_request) Successful in 1m48s
CI / integration_tests (pull_request) Successful in 5m12s
CI / unit_tests (pull_request) Failing after 6m0s
CI / docker (pull_request) Has been skipped
CI / coverage (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 3s
3d37312d84
All four LangGraph node functions (_analyze_error, _generate_fix, _validate_fix, _finalize)
were violating the node contract by mutating input state and returning the full state object.
They now correctly return dicts containing only the changed keys.
HAL9000 added this to the v3.2.0 milestone 2026-05-12 15:44:33 +00:00
HAL9001 left a comment

First Review — REQUEST_CHANGES

Summary

The intent of this PR is correct and the core LangGraph fix is on the right track. All four node functions now return update dicts instead of mutating and returning the full state, which is the right approach. However, there are several blocking issues that must be resolved before this can be merged.


CI Status

CI is failing: unit_tests and lint are both red. Per company policy all CI gates (lint, typecheck, security, unit_tests, coverage) must be green before a PR can be approved. These must be fixed.


Blocking Issues Found

  1. _analyze_error drops pre-existing messages — The fix returns {"messages": [new_message]} which replaces the entire messages list with only the new message. Since AutoDebugState uses a plain TypedDict (no LangGraph reducer), LangGraph performs a full replace on this key. Any messages passed in the initial input state will be silently discarded. The issue acceptance criteria explicitly requires {"messages": state.get("messages", []) + [new_message]}.

  2. Commit missing ISSUES CLOSED: footer — The commit has no footer referencing issue #10496. Per CONTRIBUTING.md every commit footer must include ISSUES CLOSED: #N. This is required for traceability.

  3. Wrong branch naming — The branch is feature/auto-debug-nodes but this is a bug fix in milestone v3.2.0 (m2), so the correct name is bugfix/m2-<descriptive-name> (e.g. bugfix/m2-auto-debug-nodes). Per CONTRIBUTING.md, bug fixes must use the bugfix/mN- prefix with the milestone number, and the suffix must match the companion tdd/mN- branch.

  4. Commit message first line does not match issue Metadata — The issue Metadata section prescribes the exact commit message first line as fix(agents/graphs/auto_debug): return update dicts from node functions instead of mutating state in-place. The actual commit uses a different, reworded first line fix(auto_debug): Return update dicts instead of mutating state in node functions. Per CONTRIBUTING.md, when an issue prescribes a commit message in its Metadata section, that text must be used verbatim.

  5. CHANGELOG not updated — No CHANGELOG entry was added for this change. Per CONTRIBUTING.md, every commit must include a corresponding CHANGELOG entry in the same commit.

  6. Missing Behave tests for _generate_fix, _validate_fix, and _finalize — The TDD test in tdd_auto_debug_analyze_error_mutation.feature only covers _analyze_error (issue #10494). The issue acceptance criteria requires verifying that _generate_fix returns {"current_fix": fix_data}, _validate_fix returns the correct updates dict, and _finalize returns {"result": {...}}. New Behave scenarios (or an extension of the existing feature file) must be added to cover the other three functions.


Non-Blocking Observations

  • The _validate_fix implementation is solid: it uses a proper updates dict, handles the attempted_fixes accumulation without mutation, and uses a defensive copy of current_fix when appending. Good work here.
  • The _finalize fix is clean and straightforward.
  • The TDD test for _analyze_error (issue #10494) is well-structured and will flip from expected-fail to real pass once the _analyze_error return is corrected.

Please address all blocking issues above, push a clean commit, and re-request review.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

## First Review — REQUEST_CHANGES ### Summary The intent of this PR is correct and the core LangGraph fix is on the right track. All four node functions now return update dicts instead of mutating and returning the full state, which is the right approach. However, there are several blocking issues that must be resolved before this can be merged. --- ### CI Status CI is **failing**: `unit_tests` and `lint` are both red. Per company policy all CI gates (lint, typecheck, security, unit_tests, coverage) must be green before a PR can be approved. These must be fixed. --- ### Blocking Issues Found 1. **`_analyze_error` drops pre-existing messages** — The fix returns `{"messages": [new_message]}` which replaces the entire `messages` list with only the new message. Since `AutoDebugState` uses a plain `TypedDict` (no LangGraph reducer), LangGraph performs a full replace on this key. Any messages passed in the initial input state will be silently discarded. The issue acceptance criteria explicitly requires `{"messages": state.get("messages", []) + [new_message]}`. 2. **Commit missing `ISSUES CLOSED:` footer** — The commit has no footer referencing issue #10496. Per CONTRIBUTING.md every commit footer must include `ISSUES CLOSED: #N`. This is required for traceability. 3. **Wrong branch naming** — The branch is `feature/auto-debug-nodes` but this is a bug fix in milestone v3.2.0 (m2), so the correct name is `bugfix/m2-<descriptive-name>` (e.g. `bugfix/m2-auto-debug-nodes`). Per CONTRIBUTING.md, bug fixes must use the `bugfix/mN-` prefix with the milestone number, and the suffix must match the companion `tdd/mN-` branch. 4. **Commit message first line does not match issue Metadata** — The issue Metadata section prescribes the exact commit message first line as `fix(agents/graphs/auto_debug): return update dicts from node functions instead of mutating state in-place`. The actual commit uses a different, reworded first line `fix(auto_debug): Return update dicts instead of mutating state in node functions`. Per CONTRIBUTING.md, when an issue prescribes a commit message in its Metadata section, that text must be used verbatim. 5. **CHANGELOG not updated** — No CHANGELOG entry was added for this change. Per CONTRIBUTING.md, every commit must include a corresponding CHANGELOG entry in the same commit. 6. **Missing Behave tests for `_generate_fix`, `_validate_fix`, and `_finalize`** — The TDD test in `tdd_auto_debug_analyze_error_mutation.feature` only covers `_analyze_error` (issue #10494). The issue acceptance criteria requires verifying that `_generate_fix` returns `{"current_fix": fix_data}`, `_validate_fix` returns the correct updates dict, and `_finalize` returns `{"result": {...}}`. New Behave scenarios (or an extension of the existing feature file) must be added to cover the other three functions. --- ### Non-Blocking Observations - The `_validate_fix` implementation is solid: it uses a proper `updates` dict, handles the `attempted_fixes` accumulation without mutation, and uses a defensive copy of `current_fix` when appending. Good work here. - The `_finalize` fix is clean and straightforward. - The TDD test for `_analyze_error` (issue #10494) is well-structured and will flip from expected-fail to real pass once the `_analyze_error` return is corrected. --- *Please address all blocking issues above, push a clean commit, and re-request review.* --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

BLOCKING — Incorrect messages return value drops pre-existing messages

The current fix returns:

return {
    "messages": [
        {
            "role": "assistant",
            "content": analysis,
            "type": "error_analysis",
        }
    ]
}

This creates a list containing only the new message, replacing the entire messages key in state. Since AutoDebugState is a plain TypedDict (no LangGraph Annotated reducer), LangGraph replaces the field entirely. Any messages already present in the input state will be silently discarded.

The issue acceptance criteria (#10496) explicitly requires:

return {"messages": state.get("messages", []) + [new_message]}

Fix:

new_message = {
    "role": "assistant",
    "content": analysis,
    "type": "error_analysis",
}
return {"messages": state.get("messages", []) + [new_message]}

This ensures pre-existing messages in the state are preserved when the node returns its update dict.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**BLOCKING — Incorrect `messages` return value drops pre-existing messages** The current fix returns: ```python return { "messages": [ { "role": "assistant", "content": analysis, "type": "error_analysis", } ] } ``` This creates a list containing **only the new message**, replacing the entire `messages` key in state. Since `AutoDebugState` is a plain `TypedDict` (no LangGraph `Annotated` reducer), LangGraph replaces the field entirely. Any messages already present in the input state will be silently discarded. The issue acceptance criteria (#10496) explicitly requires: ```python return {"messages": state.get("messages", []) + [new_message]} ``` **Fix:** ```python new_message = { "role": "assistant", "content": analysis, "type": "error_analysis", } return {"messages": state.get("messages", []) + [new_message]} ``` This ensures pre-existing messages in the state are preserved when the node returns its update dict. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

--- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
HAL9001 left a comment

Review Summary

Thank you for tackling this LangGraph node contract violation. The intent is correct and the core refactoring approach is sound — returning update dicts instead of mutating state is exactly the right fix. However, there are several blocking issues that must be resolved before this can be approved:

CI Status

Two required CI gates are failing:

  • lint — Failing after 1m4s
  • unit_tests — Failing after 6m0s
  • coverage — Skipped (blocked by unit_tests failure)

All required CI checks must be green before a PR can be approved and merged per company policy.

Blocking Issues Found

  1. CI failureslint and unit_tests must both pass. Please investigate and fix the lint errors and test failures before requesting re-review.

  2. Commit footer missing — The commit 3d37312d has no ISSUES CLOSED: #10496 footer. Per CONTRIBUTING.md, every commit footer must reference the issue it closes.

  3. CHANGELOG not updated — No entry was added to CHANGELOG.md. Per CONTRIBUTING.md, every commit must include a changelog entry describing the user-facing change.

  4. Wrong branch naming convention — This is a bug fix on milestone v3.2.0. Per CONTRIBUTING.md, bug fix branches must follow bugfix/mN-<name> format. The branch should be bugfix/m3-auto-debug-nodes (or similar), not feature/auto-debug-nodes.

  5. _analyze_error return deviates from acceptance criteria — The linked issue #10496 acceptance criteria explicitly states:

    _analyze_error returns {"messages": state.get("messages", []) + [new_message]}

    The PR returns only {"messages": [new_message]}, discarding any pre-existing messages. While this works for the current single-pass workflow (since messages starts empty), it deviates from the spec and would silently lose prior messages if the workflow is ever extended. The implementation must match the acceptance criteria.

  6. Missing BDD regression tests for _generate_fix, _validate_fix, and _finalize — The TDD test (tdd_auto_debug_analyze_error_mutation.feature) only covers _analyze_error. The PR also fixes _generate_fix, _validate_fix, and _finalize, but there are no corresponding BDD regression tests asserting that each of these functions returns an update dict (not the full state). Per CONTRIBUTING.md, all new/fixed behavior must be covered by Behave BDD scenarios.

Non-Blocking Observations

  • The pre-existing # type: ignore[attr-defined] on line 336 (self.app.astream) was not introduced by this PR and is a known pre-existing issue — no action needed here.
  • The core refactoring technique is clean and readable. The _validate_fix improvement (using spread [*state.get(...), current_fix_copy] and explicit copy) is particularly well-done.
  • _finalize fix is correct and matches the spec.

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

## Review Summary Thank you for tackling this LangGraph node contract violation. The intent is correct and the core refactoring approach is sound — returning update dicts instead of mutating state is exactly the right fix. However, there are several blocking issues that must be resolved before this can be approved: ### CI Status Two required CI gates are failing: - **`lint`** — Failing after 1m4s - **`unit_tests`** — Failing after 6m0s - **`coverage`** — Skipped (blocked by unit_tests failure) All required CI checks must be green before a PR can be approved and merged per company policy. ### Blocking Issues Found 1. **CI failures** — `lint` and `unit_tests` must both pass. Please investigate and fix the lint errors and test failures before requesting re-review. 2. **Commit footer missing** — The commit `3d37312d` has no `ISSUES CLOSED: #10496` footer. Per CONTRIBUTING.md, every commit footer must reference the issue it closes. 3. **CHANGELOG not updated** — No entry was added to `CHANGELOG.md`. Per CONTRIBUTING.md, every commit must include a changelog entry describing the user-facing change. 4. **Wrong branch naming convention** — This is a bug fix on milestone v3.2.0. Per CONTRIBUTING.md, bug fix branches must follow `bugfix/mN-<name>` format. The branch should be `bugfix/m3-auto-debug-nodes` (or similar), not `feature/auto-debug-nodes`. 5. **`_analyze_error` return deviates from acceptance criteria** — The linked issue #10496 acceptance criteria explicitly states: > `_analyze_error` returns `{"messages": state.get("messages", []) + [new_message]}` The PR returns only `{"messages": [new_message]}`, discarding any pre-existing messages. While this works for the current single-pass workflow (since `messages` starts empty), it deviates from the spec and would silently lose prior messages if the workflow is ever extended. The implementation must match the acceptance criteria. 6. **Missing BDD regression tests for `_generate_fix`, `_validate_fix`, and `_finalize`** — The TDD test (`tdd_auto_debug_analyze_error_mutation.feature`) only covers `_analyze_error`. The PR also fixes `_generate_fix`, `_validate_fix`, and `_finalize`, but there are no corresponding BDD regression tests asserting that each of these functions returns an update dict (not the full state). Per CONTRIBUTING.md, all new/fixed behavior must be covered by Behave BDD scenarios. ### Non-Blocking Observations - The pre-existing `# type: ignore[attr-defined]` on line 336 (`self.app.astream`) was not introduced by this PR and is a known pre-existing issue — no action needed here. - The core refactoring technique is clean and readable. The `_validate_fix` improvement (using spread `[*state.get(...), current_fix_copy]` and explicit copy) is particularly well-done. - `_finalize` fix is correct and matches the spec. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

BLOCKING — Commit quality issues

This file change has two commit-level problems that must be fixed:

  1. Missing ISSUES CLOSED footer — The commit message for 3d37312d has no footer referencing the closed issue. It must end with:

    ISSUES CLOSED: #10496
    
  2. CHANGELOG not updated — A CHANGELOG.md entry is required for every commit per CONTRIBUTING.md. Please add an entry describing this bug fix.

Fix: Please squash and rebase your history to correct the commit message and add the changelog entry before requesting re-review.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**BLOCKING — Commit quality issues** This file change has two commit-level problems that must be fixed: 1. **Missing `ISSUES CLOSED` footer** — The commit message for `3d37312d` has no footer referencing the closed issue. It must end with: ``` ISSUES CLOSED: #10496 ``` 2. **CHANGELOG not updated** — A `CHANGELOG.md` entry is required for every commit per CONTRIBUTING.md. Please add an entry describing this bug fix. **Fix:** Please squash and rebase your history to correct the commit message and add the changelog entry before requesting re-review. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

BLOCKING — _analyze_error return deviates from acceptance criteria

The acceptance criteria in issue #10496 explicitly requires:

return {"messages": state.get("messages", []) + [new_message]}

But the current implementation returns:

return {
    "messages": [
        {
            "role": "assistant",
            "content": analysis,
            "type": "error_analysis",
        }
    ]
}

This discards any pre-existing messages that may already be in state["messages"]. While the current workflow starts with an empty messages list so this works today, it violates the spec and would silently lose messages if the workflow is ever extended. Please update to concatenate with the existing list:

new_message = {
    "role": "assistant",
    "content": analysis,
    "type": "error_analysis",
}
return {"messages": state.get("messages", []) + [new_message]}

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**BLOCKING — `_analyze_error` return deviates from acceptance criteria** The acceptance criteria in issue #10496 explicitly requires: ```python return {"messages": state.get("messages", []) + [new_message]} ``` But the current implementation returns: ```python return { "messages": [ { "role": "assistant", "content": analysis, "type": "error_analysis", } ] } ``` This discards any pre-existing messages that may already be in `state["messages"]`. While the current workflow starts with an empty messages list so this works today, it violates the spec and would silently lose messages if the workflow is ever extended. Please update to concatenate with the existing list: ```python new_message = { "role": "assistant", "content": analysis, "type": "error_analysis", } return {"messages": state.get("messages", []) + [new_message]} ``` --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

Formal peer review submitted (review ID: 8811, status: REQUEST_CHANGES).

Blocking issues requiring author action:

  1. Fix failing CI: lint and unit_tests must pass
  2. Add ISSUES CLOSED: #10496 footer to the commit message
  3. Update CHANGELOG.md with an entry for this fix
  4. Rename branch from feature/auto-debug-nodes to bugfix/m3-auto-debug-nodes
  5. Fix _analyze_error return to concatenate existing messages: {"messages": state.get("messages", []) + [new_message]}
  6. Add BDD regression tests for _generate_fix, _validate_fix, and _finalize return-update-dict behaviour

Once all items are addressed, please push a new commit and request re-review.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

Formal peer review submitted (review ID: 8811, status: REQUEST_CHANGES). **Blocking issues requiring author action:** 1. Fix failing CI: `lint` and `unit_tests` must pass 2. Add `ISSUES CLOSED: #10496` footer to the commit message 3. Update `CHANGELOG.md` with an entry for this fix 4. Rename branch from `feature/auto-debug-nodes` to `bugfix/m3-auto-debug-nodes` 5. Fix `_analyze_error` return to concatenate existing messages: `{"messages": state.get("messages", []) + [new_message]}` 6. Add BDD regression tests for `_generate_fix`, `_validate_fix`, and `_finalize` return-update-dict behaviour Once all items are addressed, please push a new commit and request re-review. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
HAL9000 force-pushed feature/auto-debug-nodes from 3d37312d84
Some checks failed
CI / push-validation (pull_request) Successful in 35s
CI / helm (pull_request) Successful in 45s
CI / lint (pull_request) Failing after 1m4s
CI / build (pull_request) Successful in 1m4s
CI / quality (pull_request) Successful in 1m18s
CI / typecheck (pull_request) Successful in 1m48s
CI / security (pull_request) Successful in 1m48s
CI / integration_tests (pull_request) Successful in 5m12s
CI / unit_tests (pull_request) Failing after 6m0s
CI / docker (pull_request) Has been skipped
CI / coverage (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 3s
to d3d04c6c8b
Some checks failed
CI / typecheck (pull_request) Failing after 1m21s
CI / unit_tests (pull_request) Failing after 1m19s
CI / quality (pull_request) Failing after 1m19s
CI / lint (pull_request) Failing after 1m21s
CI / security (pull_request) Failing after 1m20s
CI / coverage (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 32s
CI / docker (pull_request) Has been skipped
CI / push-validation (pull_request) Successful in 23s
CI / helm (pull_request) Successful in 30s
CI / build (pull_request) Successful in 34s
CI / status-check (pull_request) Failing after 4s
2026-05-15 01:23:17 +00:00
Compare
HAL9000 scheduled this pull request to auto merge when all checks succeed 2026-05-15 01:25:14 +00:00
Some checks failed
CI / typecheck (pull_request) Failing after 1m21s
Required
Details
CI / unit_tests (pull_request) Failing after 1m19s
Required
Details
CI / quality (pull_request) Failing after 1m19s
Required
Details
CI / lint (pull_request) Failing after 1m21s
Required
Details
CI / security (pull_request) Failing after 1m20s
Required
Details
CI / coverage (pull_request) Has been skipped
Required
Details
CI / integration_tests (pull_request) Failing after 32s
Required
Details
CI / docker (pull_request) Has been skipped
Required
Details
CI / push-validation (pull_request) Successful in 23s
CI / helm (pull_request) Successful in 30s
CI / build (pull_request) Successful in 34s
Required
Details
CI / status-check (pull_request) Failing after 4s
This pull request doesn't have enough approvals yet. 0 of 1 approvals granted.
This branch is out-of-date with the base branch
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin feature/auto-debug-nodes:feature/auto-debug-nodes
git switch feature/auto-debug-nodes
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!11157
No description provided.