fix(auto_debug): Return update dicts instead of mutating state in node functions #11157
Reference
cleveragents/cleveragents-core!11157
Closes #10496
All four LangGraph node functions (_analyze_error, _generate_fix, _validate_fix, _finalize) in agents/graphs/auto_debug.py were violating the LangGraph node contract by mutating input state and returning the full state object. They now correctly return dicts containing only the changed keys.
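The difference between the two node shapes can be sketched as follows. This is a minimal illustration only: the `AutoDebugState` fields shown are a hypothetical subset, and the `"patch"` payload is a placeholder rather than the real fix data.

```python
from typing import Any, TypedDict


class AutoDebugState(TypedDict, total=False):
    # Hypothetical subset of the real state schema, for illustration only.
    messages: list[str]
    current_fix: dict[str, Any]


def generate_fix_mutating(state: AutoDebugState) -> AutoDebugState:
    # Anti-pattern: writes into the input and hands back the whole state.
    state["current_fix"] = {"patch": "..."}
    return state


def generate_fix_update(state: AutoDebugState) -> dict[str, Any]:
    # Contract-conforming shape: input untouched, only the changed key returned.
    fix_data = {"patch": "..."}
    return {"current_fix": fix_data}


before: AutoDebugState = {"messages": ["boom"]}
update = generate_fix_update(before)
```

After the call, `before` still holds only `messages`, and `update` carries just the `current_fix` key for the graph runtime to merge.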
- _analyze_error: Returns {"messages": [...]} instead of mutating and returning state
- _generate_fix: Returns {"current_fix": fix_data} instead of mutating and returning state
- _validate_fix: Builds updates dict and returns it instead of mutating and returning state
- _finalize: Returns {"result": {...}} instead of mutating and returning state
First Review: REQUEST_CHANGES
Summary
The intent of this PR is correct and the core LangGraph fix is on the right track. All four node functions now return update dicts instead of mutating and returning the full state, which is the right approach. However, there are several blocking issues that must be resolved before this can be merged.
CI Status
CI is failing: unit_tests and lint are both red. Per company policy all CI gates (lint, typecheck, security, unit_tests, coverage) must be green before a PR can be approved. These must be fixed.
Blocking Issues Found
- _analyze_error drops pre-existing messages. The fix returns {"messages": [new_message]}, which replaces the entire messages list with only the new message. Since AutoDebugState uses a plain TypedDict (no LangGraph reducer), LangGraph performs a full replace on this key. Any messages passed in the initial input state will be silently discarded. The issue acceptance criteria explicitly require {"messages": state.get("messages", []) + [new_message]}.
- Commit missing ISSUES CLOSED: footer. The commit has no footer referencing issue #10496. Per CONTRIBUTING.md every commit footer must include ISSUES CLOSED: #N. This is required for traceability.
- Wrong branch naming. The branch is feature/auto-debug-nodes but this is a bug fix in milestone v3.2.0 (m2), so the correct name is bugfix/m2-<descriptive-name> (e.g. bugfix/m2-auto-debug-nodes). Per CONTRIBUTING.md, bug fixes must use the bugfix/mN- prefix with the milestone number, and the suffix must match the companion tdd/mN- branch.
- Commit message first line does not match issue Metadata. The issue Metadata section prescribes the exact commit message first line as fix(agents/graphs/auto_debug): return update dicts from node functions instead of mutating state in-place. The actual commit uses a different, reworded first line: fix(auto_debug): Return update dicts instead of mutating state in node functions. Per CONTRIBUTING.md, when an issue prescribes a commit message in its Metadata section, that text must be used verbatim.
- CHANGELOG not updated. No CHANGELOG entry was added for this change. Per CONTRIBUTING.md, every commit must include a corresponding CHANGELOG entry in the same commit.
- Missing Behave tests for _generate_fix, _validate_fix, and _finalize. The TDD test in tdd_auto_debug_analyze_error_mutation.feature only covers _analyze_error (issue #10494). The issue acceptance criteria require verifying that _generate_fix returns {"current_fix": fix_data}, _validate_fix returns the correct updates dict, and _finalize returns {"result": {...}}. New Behave scenarios (or an extension of the existing feature file) must be added to cover the other three functions.
Non-Blocking Observations
- _validate_fix implementation is solid: it uses a proper updates dict, handles the attempted_fixes accumulation without mutation, and uses a defensive copy of current_fix when appending. Good work here.
- _finalize fix is clean and straightforward.
- The TDD test for _analyze_error (issue #10494) is well-structured and will flip from expected-fail to real pass once the _analyze_error return is corrected.
Please address all blocking issues above, push a clean commit, and re-request review.
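For readers who have not opened the diff, the _validate_fix pattern described in the observations (an updates dict, non-mutating accumulation, defensive copy) might look roughly like the sketch below. `validate_fix_sketch` is a hypothetical stand-in; the real validation logic is not quoted in this thread.

```python
import copy
from typing import Any


def validate_fix_sketch(state: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for _validate_fix; validation itself is omitted."""
    updates: dict[str, Any] = {}
    current_fix = state.get("current_fix")
    if current_fix is not None:
        # Defensive copy so later edits to current_fix cannot alter the history.
        current_fix_copy = copy.deepcopy(current_fix)
        # Build a new list instead of appending to the one held by the input state.
        updates["attempted_fixes"] = [
            *state.get("attempted_fixes", []),
            current_fix_copy,
        ]
    return updates


state = {"current_fix": {"patch": "p"}, "attempted_fixes": [{"patch": "old"}]}
updates = validate_fix_sketch(state)
```

The input's `attempted_fixes` list keeps its single entry, while `updates["attempted_fixes"]` is a fresh two-element list.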
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING: Incorrect messages return value drops pre-existing messages
The current fix returns {"messages": [new_message]}.
This creates a list containing only the new message, replacing the entire messages key in state. Since AutoDebugState is a plain TypedDict (no LangGraph Annotated reducer), LangGraph replaces the field entirely. Any messages already present in the input state will be silently discarded.
The issue acceptance criteria (#10496) explicitly require {"messages": state.get("messages", []) + [new_message]}.
Fix: return {"messages": state.get("messages", []) + [new_message]} from _analyze_error.
This ensures pre-existing messages in the state are preserved when the node returns its update dict.
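The effect can be reproduced without LangGraph by simulating last-value-wins merging with a plain dict merge. In this sketch `apply_update` is a hypothetical helper that only imitates how a key with no reducer is overwritten; it is not a LangGraph API, and the message strings are made up.

```python
from typing import Any


def apply_update(state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for last-value-wins merging of keys with no reducer."""
    return {**state, **update}


state = {"messages": ["earlier message"]}
new_message = "analysis: ZeroDivisionError in handler"

# Returning only the new message overwrites the existing list.
dropped = apply_update(state, {"messages": [new_message]})

# Concatenating with the existing list preserves prior messages.
kept = apply_update(
    state, {"messages": state.get("messages", []) + [new_message]}
)
```

Here `dropped["messages"]` contains only the new message, while `kept["messages"]` contains both, and the input `state` is unchanged in either case.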
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Review Summary
Thank you for tackling this LangGraph node contract violation. The intent is correct and the core refactoring approach is sound — returning update dicts instead of mutating state is exactly the right fix. However, there are several blocking issues that must be resolved before this can be approved:
CI Status
Two required CI gates are failing:
- lint: Failing after 1m4s
- unit_tests: Failing after 6m0s
- coverage: Skipped (blocked by unit_tests failure)
All required CI checks must be green before a PR can be approved and merged per company policy.
Blocking Issues Found
- CI failures. lint and unit_tests must both pass. Please investigate and fix the lint errors and test failures before requesting re-review.
- Commit footer missing. The commit 3d37312d has no ISSUES CLOSED: #10496 footer. Per CONTRIBUTING.md, every commit footer must reference the issue it closes.
- CHANGELOG not updated. No entry was added to CHANGELOG.md. Per CONTRIBUTING.md, every commit must include a changelog entry describing the user-facing change.
- Wrong branch naming convention. This is a bug fix on milestone v3.2.0. Per CONTRIBUTING.md, bug fix branches must follow bugfix/mN-<name> format. The branch should be bugfix/m3-auto-debug-nodes (or similar), not feature/auto-debug-nodes.
- _analyze_error return deviates from acceptance criteria. The linked issue #10496 acceptance criteria explicitly state: {"messages": state.get("messages", []) + [new_message]}. The PR returns only {"messages": [new_message]}, discarding any pre-existing messages. While this works for the current single-pass workflow (since messages starts empty), it deviates from the spec and would silently lose prior messages if the workflow is ever extended. The implementation must match the acceptance criteria.
- Missing BDD regression tests for _generate_fix, _validate_fix, and _finalize. The TDD test (tdd_auto_debug_analyze_error_mutation.feature) only covers _analyze_error. The PR also fixes _generate_fix, _validate_fix, and _finalize, but there are no corresponding BDD regression tests asserting that each of these functions returns an update dict (not the full state). Per CONTRIBUTING.md, all new/fixed behavior must be covered by Behave BDD scenarios.
Non-Blocking Observations
- The # type: ignore[attr-defined] on line 336 (self.app.astream) was not introduced by this PR and is a known pre-existing issue; no action needed here.
- The _validate_fix improvement (using spread [*state.get(...), current_fix_copy] and explicit copy) is particularly well-done.
- The _finalize fix is correct and matches the spec.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING: Commit quality issues
This file change has two commit-level problems that must be fixed:
- Missing ISSUES CLOSED footer. The commit message for 3d37312d has no footer referencing the closed issue. It must end with: ISSUES CLOSED: #10496
- CHANGELOG not updated. A CHANGELOG.md entry is required for every commit per CONTRIBUTING.md. Please add an entry describing this bug fix.
Fix: Please squash and rebase your history to correct the commit message and add the changelog entry before requesting re-review.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING: _analyze_error return deviates from acceptance criteria
The acceptance criteria in issue #10496 explicitly require {"messages": state.get("messages", []) + [new_message]}.
But the current implementation returns {"messages": [new_message]}.
This discards any pre-existing messages that may already be in state["messages"]. While the current workflow starts with an empty messages list so this works today, it violates the spec and would silently lose messages if the workflow is ever extended. Please update to concatenate with the existing list: {"messages": state.get("messages", []) + [new_message]}
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Formal peer review submitted (review ID: 8811, status: REQUEST_CHANGES).
Blocking issues requiring author action:
- Fix CI: lint and unit_tests must pass
- Add ISSUES CLOSED: #10496 footer to the commit message
- Update CHANGELOG.md with an entry for this fix
- Rename branch feature/auto-debug-nodes to bugfix/m3-auto-debug-nodes
- Fix _analyze_error return to concatenate existing messages: {"messages": state.get("messages", []) + [new_message]}
- Add BDD tests for _generate_fix, _validate_fix, and _finalize return-update-dict behaviour
Once all items are addressed, please push a new commit and request re-review.
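The return-update-dict checks requested for the three remaining nodes could share one assertion helper that the Behave step definitions call. The sketch below is plain Python rather than Behave step code; `assert_returns_update_dict` and `fake_finalize` are hypothetical names, and the `"fixed"` field is invented for the example.

```python
import copy
from typing import Any, Callable


def assert_returns_update_dict(
    node: Callable[[dict[str, Any]], dict[str, Any]],
    state: dict[str, Any],
    expected_keys: set[str],
) -> dict[str, Any]:
    """Check that a node leaves its input alone and returns only changed keys."""
    snapshot = copy.deepcopy(state)
    result = node(state)
    assert isinstance(result, dict), "node must return a dict"
    assert result is not state, "node must not return the input state object"
    assert state == snapshot, "node must not mutate the input state"
    assert set(result) == expected_keys, "node must return only the changed keys"
    return result


def fake_finalize(state: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical stand-in for _finalize, used only to exercise the helper.
    return {"result": {"fixed": bool(state.get("current_fix"))}}


result = assert_returns_update_dict(
    fake_finalize, {"current_fix": {"patch": "p"}}, {"result"}
)
```

A node that mutates and returns its input fails the identity check, which is the regression these scenarios are meant to catch.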
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
3d37312d84 d3d04c6c8b
[CONTROLLER-DEFER:Gate 1:full_duplicate]
This PR has been deferred for re-evaluation. The controller has stepped back
from processing it. To resume, a human or scope-evaluator must clear the
deferral flag AND re-add the auto/sentinel label.
Decision:
To clear the deferral (SQL):
UPDATE workflows SET deferred_reason=NULL,
deferred_at=NULL,
deferred_target_workflow_id=NULL
WHERE workflow_id = 509;
Audit ID: 176870
Automated by the CleverAgents controller pipeline.
Identity: HAL9000 (pipeline action)
📋 Estimate: tier 1.
Single-file change (+25/-25) mechanically refactoring 4 LangGraph node functions to return update dicts instead of mutating state. The core pattern fix is narrow. CI has a trivial lint failure (E501, one long line) and test failures in unrelated subsystems (actor_run_signature, plan_service_coverage, tdd_memory_service_entity_persistence) — none touch auto_debug.py. The implementer needs to fix the lint error and assess whether the test failures are pre-existing flakiness or regressions. Per codebase calibration, tier 0 has poor track record even on apparently simple single-file changes; tier 1 is the appropriate default.
(attempt #9, tier 1)
🔧 Implementer attempt: resolved.
Pushed 2 commits: 10b3417, 185d89c.
Files touched: CHANGELOG.md, features/steps/actor_run_signature_resolve_steps.py, features/steps/actor_run_signature_security_steps.py, features/steps/memory_service_coverage_steps.py, features/steps/tdd_auto_debug_node_contracts_steps.py, features/tdd_auto_debug_analyze_error_mutation.feature, features/tdd_auto_debug_node_contracts.feature, src/cleveragents/agents/graphs/auto_debug.py, src/cleveragents/application/services/memory_service.py, src/cleveragents/cli/commands/actor.py, src/cleveragents/cli/commands/actor_run.py.
185d89c093 7c24270afc
(attempt #11, tier 1)
🔧 Implementer attempt: ci-not-ready.
✅ Approved
Reviewed at commit 7c24270.
Confidence: high.
Claimed by merge_drive.py (pid 2329255) until 2026-06-14T18:23:30.979471+00:00.
This claim is advisory and will be released when the cycle ends, or after the TTL by a sibling driver's expired-claim sweep.
Approved by the controller reviewer stage (workflow 509).