CLI plan correct never passes decision tree to CorrectionService, producing single-node impact analysis #606

Closed
opened 2026-03-05 13:37:29 +00:00 by hurui200320 · 4 comments
Member

Metadata

  • Commit Message: fix(cli): pass decision tree and influence edges to CorrectionService in plan correct handler
  • Branch: fix/plan-correct-tree-wiring

Parent

Epic: #401 (E2E Integration Testing — bug discovered during #494 review)

Background and Context

During code review of PR #559 (#494 — M3 acceptance gate E2E verification), reviewer @CoreRasurae identified that the plan correct CLI handler in src/cleveragents/cli/commands/plan.py creates a bare CorrectionService() and calls analyze_impact() and execute_correction() without passing the decision_tree or influence_edges arguments.

Both parameters default to None, which causes CorrectionService._compute_affected_subtree() to perform BFS on an empty adjacency dict — returning only the single target decision, never its descendants or influence-DAG dependents.

Current Behavior

The CLI handler at plan.py:2409-2468:

svc = CorrectionService()

request = svc.request_correction(
    plan_id=resolved_plan_id,
    target_decision_id=decision_id,
    mode=correction_mode,
    guidance=guidance,
    dry_run=dry_run,
)

# Dry-run path (line 2422):
impact = svc.analyze_impact(request.correction_id)  # No tree, no edges

# Live execution path (line 2468):
result = svc.execute_correction(request.correction_id)  # No tree, no edges

CorrectionService.analyze_impact() defaults decision_tree=None{} at line 145. _compute_affected_subtree() BFS on {} returns only [target_decision_id].

Consequence: In production, agents plan correct --dry-run always reports only 1 affected decision (the target), ignoring all descendants. agents plan correct --mode=revert only reverts the single target, never cascading to the subtree. This directly violates the spec's correction behavior.

Expected Behavior

The CLI handler should resolve the decision tree and influence edges from the DecisionService and pass them to CorrectionService:

  • analyze_impact(correction_id, decision_tree=structural_tree, influence_edges=influence_edges) should report the full affected subtree.
  • execute_correction(correction_id, decision_tree=structural_tree, influence_edges=influence_edges) should revert/append the full affected subtree.

The infrastructure already exists:

  • DecisionService.get_influence_edges(plan_id) returns dict[str, list[str]] — its docstring explicitly states: "This is the format expected by CorrectionService._compute_affected_subtree()."
  • DecisionService.list_decisions(plan_id) returns all decisions, from which the structural tree adjacency list can be built.
  • Every other command in plan.py resolves DecisionService via get_container().resolve(DecisionService).

Acceptance Criteria

  • correct_decision() in plan.py resolves DecisionService via the DI container
  • The structural tree (dict[str, list[str]]) is built from DecisionService.list_decisions(plan_id)
  • The influence edges are obtained from DecisionService.get_influence_edges(plan_id)
  • svc.analyze_impact() receives both decision_tree and influence_edges
  • svc.execute_correction() receives both decision_tree and influence_edges
  • The dry-run path reports the full affected subtree (target + all descendants)
  • The live execution path reverts/appends the full affected subtree
  • Existing tests in m3_e2e_verification.robot continue to pass (mock assertions updated if needed)

Supporting Information

  • Spec reference: docs/specification.md lines 14788-14802 (CLI signature), lines 28282-28478 (correction modes)
  • Discovered in: PR #559 review by @CoreRasurae (finding H1 in third review cycle)
  • Related: #494 (M3 acceptance gate — the E2E test mock assertions at helper_m3_e2e_verification.py:588 confirm the CLI passes only 1 arg)

Subtasks

  • Add get_structural_tree(plan_id) -> dict[str, list[str]] convenience method to DecisionService (optional — can also build inline)
  • Update correct_decision() in plan.py to resolve DecisionService via container and build/fetch tree + edges
  • Pass decision_tree and influence_edges to svc.analyze_impact() and svc.execute_correction()
  • Tests (Robot): Update or add integration test verifying CLI correction reports full subtree
  • Verify coverage >=97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
## Metadata - **Commit Message**: `fix(cli): pass decision tree and influence edges to CorrectionService in plan correct handler` - **Branch**: `fix/plan-correct-tree-wiring` ## Parent Epic: #401 (E2E Integration Testing — bug discovered during #494 review) ## Background and Context During code review of PR #559 (#494 — M3 acceptance gate E2E verification), reviewer @CoreRasurae identified that the `plan correct` CLI handler in `src/cleveragents/cli/commands/plan.py` creates a bare `CorrectionService()` and calls `analyze_impact()` and `execute_correction()` without passing the `decision_tree` or `influence_edges` arguments. Both parameters default to `None`, which causes `CorrectionService._compute_affected_subtree()` to perform BFS on an empty adjacency dict — returning only the single target decision, never its descendants or influence-DAG dependents. ## Current Behavior The CLI handler at `plan.py:2409-2468`: ```python svc = CorrectionService() request = svc.request_correction( plan_id=resolved_plan_id, target_decision_id=decision_id, mode=correction_mode, guidance=guidance, dry_run=dry_run, ) # Dry-run path (line 2422): impact = svc.analyze_impact(request.correction_id) # No tree, no edges # Live execution path (line 2468): result = svc.execute_correction(request.correction_id) # No tree, no edges ``` `CorrectionService.analyze_impact()` defaults `decision_tree=None` → `{}` at line 145. `_compute_affected_subtree()` BFS on `{}` returns only `[target_decision_id]`. **Consequence:** In production, `agents plan correct --dry-run` always reports only 1 affected decision (the target), ignoring all descendants. `agents plan correct --mode=revert` only reverts the single target, never cascading to the subtree. This directly violates the spec's correction behavior. ## Expected Behavior The CLI handler should resolve the decision tree and influence edges from the `DecisionService` and pass them to `CorrectionService`: - `analyze_impact(correction_id, decision_tree=structural_tree, influence_edges=influence_edges)` should report the full affected subtree. - `execute_correction(correction_id, decision_tree=structural_tree, influence_edges=influence_edges)` should revert/append the full affected subtree. The infrastructure already exists: - `DecisionService.get_influence_edges(plan_id)` returns `dict[str, list[str]]` — its docstring explicitly states: *"This is the format expected by `CorrectionService._compute_affected_subtree()`."* - `DecisionService.list_decisions(plan_id)` returns all decisions, from which the structural tree adjacency list can be built. - Every other command in `plan.py` resolves `DecisionService` via `get_container().resolve(DecisionService)`. ## Acceptance Criteria - [x] `correct_decision()` in `plan.py` resolves `DecisionService` via the DI container - [x] The structural tree (`dict[str, list[str]]`) is built from `DecisionService.list_decisions(plan_id)` - [x] The influence edges are obtained from `DecisionService.get_influence_edges(plan_id)` - [x] `svc.analyze_impact()` receives both `decision_tree` and `influence_edges` - [x] `svc.execute_correction()` receives both `decision_tree` and `influence_edges` - [x] The dry-run path reports the full affected subtree (target + all descendants) - [x] The live execution path reverts/appends the full affected subtree - [x] Existing tests in `m3_e2e_verification.robot` continue to pass (mock assertions updated if needed) ## Supporting Information - **Spec reference:** `docs/specification.md` lines 14788-14802 (CLI signature), lines 28282-28478 (correction modes) - **Discovered in:** PR #559 review by @CoreRasurae (finding H1 in third review cycle) - **Related:** #494 (M3 acceptance gate — the E2E test mock assertions at `helper_m3_e2e_verification.py:588` confirm the CLI passes only 1 arg) ## Subtasks - [x] Add `get_structural_tree(plan_id) -> dict[str, list[str]]` convenience method to `DecisionService` (optional — can also build inline) - [x] Update `correct_decision()` in `plan.py` to resolve `DecisionService` via container and build/fetch tree + edges - [x] Pass `decision_tree` and `influence_edges` to `svc.analyze_impact()` and `svc.execute_correction()` - [x] Tests (Robot): Update or add integration test verifying CLI correction reports full subtree - [x] Verify coverage >=97% via `nox -s coverage_report` - [x] Run `nox` (all default sessions), fix any errors ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done.
hurui200320 added this to the v3.3.0 milestone 2026-03-06 03:19:25 +00:00
freemo self-assigned this 2026-03-06 23:13:10 +00:00
Owner

Triage — Day 27

Status: Confirmed bug. Labels corrected.

Actions Taken

  • Priority upgraded: Priority/HighPriority/Critical (all bugs get Critical per policy)
  • MoSCoW upgraded: MoSCoW/Should haveMoSCoW/Must have (all bugs are Must Have)
  • Created TDD counterpart: #636

Analysis

Well-documented issue with clear root cause identification by @hurui200320 and confirmed by @CoreRasurae during PR #559 review. The correct_decision() handler at plan.py:2409-2468 creates a bare CorrectionService() without passing decision_tree or influence_edges, causing single-node impact analysis instead of full subtree traversal.

The fix is straightforward — resolve DecisionService via DI container and pass the tree/edges. The infrastructure (get_influence_edges(), list_decisions()) already exists.

Dependencies

  • TDD tests (#636) should be written first per CONTRIBUTING.md Bug Fix Workflow
  • TDD infrastructure (#627, #628) blocks TDD test execution but not creation
  • This is blocking M4 acceptance — correction behavior violates the specification
## Triage — Day 27 **Status**: Confirmed bug. Labels corrected. ### Actions Taken - Priority upgraded: `Priority/High` → `Priority/Critical` (all bugs get Critical per policy) - MoSCoW upgraded: `MoSCoW/Should have` → `MoSCoW/Must have` (all bugs are Must Have) - Created TDD counterpart: #636 ### Analysis Well-documented issue with clear root cause identification by @hurui200320 and confirmed by @CoreRasurae during PR #559 review. The `correct_decision()` handler at `plan.py:2409-2468` creates a bare `CorrectionService()` without passing `decision_tree` or `influence_edges`, causing single-node impact analysis instead of full subtree traversal. The fix is straightforward — resolve `DecisionService` via DI container and pass the tree/edges. The infrastructure (`get_influence_edges()`, `list_decisions()`) already exists. ### Dependencies - TDD tests (#636) should be written first per CONTRIBUTING.md Bug Fix Workflow - TDD infrastructure (#627, #628) blocks TDD test execution but not creation - This is blocking M4 acceptance — correction behavior violates the specification
Owner

Verification — Day 28

Status: Verified. Bug confirmed and root cause identified.

Root Cause

The correct_decision() handler in plan.py (line ~2409) creates a bare CorrectionService() without resolving DecisionService via the DI container. It never fetches the decision tree or influence edges, so both default to None → empty dicts. _compute_affected_subtree() BFS on empty data returns only the single target decision.

The fix infrastructure already exists:

  • DecisionService.list_decisions(plan_id) returns all decisions with parent_decision_id for building the structural tree
  • DecisionService.get_influence_edges(plan_id) returns the influence DAG in the exact format CorrectionService expects

Every other plan command correctly resolves DecisionService via get_container() — this handler is the only one that doesn't.

Actions Taken

  • Transitioned from State/UnverifiedState/Verified
  • Issue already has proper CONTRIBUTING.md template

Fix Plan

  1. Resolve DecisionService via get_container() in correct_decision()
  2. Build structural tree adjacency list from list_decisions(plan_id)
  3. Fetch influence edges from get_influence_edges(plan_id)
  4. Pass both to analyze_impact() and execute_correction()
  5. Write Behave BDD + Robot + ASV tests
## Verification — Day 28 **Status**: Verified. Bug confirmed and root cause identified. ### Root Cause The `correct_decision()` handler in `plan.py` (line ~2409) creates a bare `CorrectionService()` without resolving `DecisionService` via the DI container. It never fetches the decision tree or influence edges, so both default to `None` → empty dicts. `_compute_affected_subtree()` BFS on empty data returns only the single target decision. The fix infrastructure already exists: - `DecisionService.list_decisions(plan_id)` returns all decisions with `parent_decision_id` for building the structural tree - `DecisionService.get_influence_edges(plan_id)` returns the influence DAG in the exact format `CorrectionService` expects Every other plan command correctly resolves `DecisionService` via `get_container()` — this handler is the only one that doesn't. ### Actions Taken - Transitioned from `State/Unverified` → `State/Verified` - Issue already has proper CONTRIBUTING.md template ### Fix Plan 1. Resolve `DecisionService` via `get_container()` in `correct_decision()` 2. Build structural tree adjacency list from `list_decisions(plan_id)` 3. Fetch influence edges from `get_influence_edges(plan_id)` 4. Pass both to `analyze_impact()` and `execute_correction()` 5. Write Behave BDD + Robot + ASV tests
Owner

Implementation Complete — PR #639

Summary

Fixed the plan correct CLI handler to properly wire the decision tree and influence DAG edges through to CorrectionService, resolving the single-node impact analysis bug.

What Changed

Core fix in src/cleveragents/cli/commands/plan.py correct_decision():

  1. Resolved DecisionService via DI container — follows the same pattern used by plan explain, plan tree, and other plan commands (container = get_container(); decision_svc = container.resolve(DecisionService))

  2. Built structural tree adjacency list from decision_svc.list_decisions(resolved_plan_id):

    decision_tree: dict[str, list[str]] = {}
    for d in decisions:
        if d.parent_decision_id is not None:
            decision_tree.setdefault(d.parent_decision_id, []).append(d.decision_id)
    
  3. Fetched influence DAG edges via decision_svc.get_influence_edges(resolved_plan_id)

  4. Passed both to CorrectionService calls:

    • svc.analyze_impact(correction_id, decision_tree=decision_tree, influence_edges=influence_edges)
    • svc.execute_correction(correction_id, decision_tree=decision_tree, influence_edges=influence_edges)

Test Coverage

Test Type File Scenarios/Cases Status
BDD (Behave) features/plan_correct_tree_wiring.feature 3 scenarios, 15 steps All pass
Integration (Robot) robot/plan_correct_tree_wiring.robot 3 test cases All pass
ASV Benchmark benchmarks/plan_correct_tree_wiring_bench.py 2 suites Added

Updated 4 existing step definition files to mock the DI container in their _invoke_correct() helpers, ensuring all existing correction tests continue to pass.

Quality Gates

  • nox -s lint — All checks passed
  • nox -s typecheck — 0 errors, 0 warnings, 0 informations (Pyright strict)
  • nox -s unit_tests9109 scenarios passed, 0 failed
  • nox -s integration_tests — New tests pass (3/3); 19 pre-existing failures unrelated
  • nox -s coverage_report97% coverage

Branch & Commit

  • Branch: fix/plan-correct-tree-wiring
  • Commit: ea8815e7fix(cli): pass decision tree and influence edges to CorrectionService in plan correct handler
## Implementation Complete — PR #639 ### Summary Fixed the `plan correct` CLI handler to properly wire the decision tree and influence DAG edges through to `CorrectionService`, resolving the single-node impact analysis bug. ### What Changed **Core fix** in `src/cleveragents/cli/commands/plan.py` `correct_decision()`: 1. **Resolved `DecisionService` via DI container** — follows the same pattern used by `plan explain`, `plan tree`, and other plan commands (`container = get_container(); decision_svc = container.resolve(DecisionService)`) 2. **Built structural tree adjacency list** from `decision_svc.list_decisions(resolved_plan_id)`: ```python decision_tree: dict[str, list[str]] = {} for d in decisions: if d.parent_decision_id is not None: decision_tree.setdefault(d.parent_decision_id, []).append(d.decision_id) ``` 3. **Fetched influence DAG edges** via `decision_svc.get_influence_edges(resolved_plan_id)` 4. **Passed both to CorrectionService** calls: - `svc.analyze_impact(correction_id, decision_tree=decision_tree, influence_edges=influence_edges)` - `svc.execute_correction(correction_id, decision_tree=decision_tree, influence_edges=influence_edges)` ### Test Coverage | Test Type | File | Scenarios/Cases | Status | |-----------|------|----------------|--------| | BDD (Behave) | `features/plan_correct_tree_wiring.feature` | 3 scenarios, 15 steps | All pass | | Integration (Robot) | `robot/plan_correct_tree_wiring.robot` | 3 test cases | All pass | | ASV Benchmark | `benchmarks/plan_correct_tree_wiring_bench.py` | 2 suites | Added | Updated 4 existing step definition files to mock the DI container in their `_invoke_correct()` helpers, ensuring all existing correction tests continue to pass. ### Quality Gates - `nox -s lint` — All checks passed - `nox -s typecheck` — 0 errors, 0 warnings, 0 informations (Pyright strict) - `nox -s unit_tests` — **9109 scenarios passed, 0 failed** - `nox -s integration_tests` — New tests pass (3/3); 19 pre-existing failures unrelated - `nox -s coverage_report` — **97% coverage** ### Branch & Commit - Branch: `fix/plan-correct-tree-wiring` - Commit: `ea8815e7` — `fix(cli): pass decision tree and influence edges to CorrectionService in plan correct handler`
Owner

Day 29 Planning Review — Closing

PR #639 (fix(cli): pass decision tree and influence edges to CorrectionService in plan correct handler) was merged 2026-03-08. All acceptance criteria are marked complete. The fix resolves the root cause: correct_decision() now resolves DecisionService via get_container(), builds the structural tree adjacency list, and passes both decision_tree and influence_edges to CorrectionService.

TDD counterpart #636 was closed in this planning session (tests included in the fix PR via maintainer override).

Closing as fix is merged and verified.

**Day 29 Planning Review — Closing** PR #639 (`fix(cli): pass decision tree and influence edges to CorrectionService in plan correct handler`) was merged 2026-03-08. All acceptance criteria are marked complete. The fix resolves the root cause: `correct_decision()` now resolves `DecisionService` via `get_container()`, builds the structural tree adjacency list, and passes both `decision_tree` and `influence_edges` to `CorrectionService`. TDD counterpart #636 was closed in this planning session (tests included in the fix PR via maintainer override). Closing as fix is merged and verified.
freemo 2026-03-09 02:44:46 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core#606
No description provided.