test: add TDD bug-capture test for #968 — plan explain plan_id handling #1052

2026-03-18T09:20:58Z

hurui200320 commented

2026-03-18 09:20:58 +00:00

Summary

Add TDD bug-capture tests for bug #968: the plan explain CLI command fails with rc=1 when given a plan_id because explain_decision_cmd treats its argument as a decision_id and svc.get_decision(plan_id) raises DecisionNotFoundError.

These tests capture the exact failure condition described in #968 and are tagged with @tdd_expected_fail, @tdd_bug, and @tdd_bug_968. The @tdd_expected_fail tag inverts the test result so that CI passes while the bug is unfixed. Once bug #968 is fixed, the @tdd_expected_fail tag will be removed and the tests will serve as permanent regression tests.

Closes #978

Changes

Behave Tests (`features/tdd_plan_explain_plan_id.feature`)

Scenario 1: Plan explain succeeds when given a plan_id with decisions — Mocks DecisionService so get_decision(plan_id) raises DecisionNotFoundError (the bug) and list_decisions(plan_id) returns decisions. Invokes plan explain <plan_id> via CliRunner and asserts rc=0 with decision details visible. Verifies list_decisions was called with the correct plan_id.
Scenario 2: Plan explain with plan_id shows root decision question — Same setup, asserts the root decision question appears in the output.

Both scenarios currently fail (rc=1, proving the bug exists), but the @tdd_expected_fail tag makes CI pass.

Robot Tests (`robot/tdd_plan_explain_plan_id.robot`)

TDD Plan Explain Succeeds With Plan ID — Integration test using a helper script that records decisions via DecisionService, invokes plan explain <plan_id> via subprocess, and asserts rc=0.
TDD Plan Explain With Plan ID Shows Root Question — Same setup, asserts the root decision question appears in the output.

Both tests carry tdd_expected_fail, tdd_bug, and tdd_bug_968 tags.

Step Definitions (`features/steps/tdd_plan_explain_plan_id_steps.py`)

Step implementations for the Behave feature, following the established patterns from plan_explain_cli_coverage_steps.py. Includes list_decisions.assert_called_once_with(plan_id) verification to ensure the fix exercises the fallback lookup path.

Helper Script (`robot/helper_tdd_plan_explain_plan_id.py`)

Python helper for Robot tests, following the established pattern from helper_tdd_checkpoint_real_rollback.py. Uses a shared _run_plan_explain() helper to avoid subprocess invocation duplication, sets NO_COLOR=1 in subprocess environment to prevent ANSI escape codes.

CHANGELOG

Added entry under ## Unreleased describing the TDD bug-capture tests for bug #968.

Review Fixes Applied (Iteration 1)

Imports moved to module level (Major) — Three imports in _setup_plan_with_decisions() moved to module level with # noqa: E402, consistent with all other robot/helper_tdd_*.py files.
Mock accurately models real behavior (Minor) — svc.get_decision.return_value = None changed to svc.get_decision.side_effect = DecisionNotFoundError(...) to match the real DecisionService.get_decision() which raises DecisionNotFoundError, not returns None.
Strengthened assertion (Minor) — assert "decision_id" in output or "question" in output changed to AND for a more precise assertion that verifies both fields are present.
Removed # type: ignore[arg-type] (Minor) — Changed kwargs: dict[str, object] to kwargs: dict and removed the # type: ignore[arg-type] directive, per CONTRIBUTING.md type safety rules.
Handled subprocess.TimeoutExpired (Minor) — Both subprocess.run() calls wrapped in try/except subprocess.TimeoutExpired with descriptive _fail() messages.
Resolved timeout race condition (Minor) — Inner subprocess.run timeout reduced from 60s to 45s so the helper catches timeout before Robot's 60s outer timeout kills it.
Defensive setup check (Nit, addressed) — Added list_decisions() verification after record_decision() to distinguish setup failures from the actual bug.

Review Fixes Applied (Iteration 2)

Added CHANGELOG.md entry (Major) — Added entry under ## Unreleased describing the TDD bug-capture tests for bug #968, per CONTRIBUTING.md §"Pull Request Process".
Fixed Gherkin step text and feature comments (Minor) — Renamed step from "returns None" to "raises DecisionNotFoundError" in feature file and step decorator. Updated feature comment from "returns None" to "raises DecisionNotFoundError". Step text now accurately describes the mock behavior.
Fixed Robot helper output assertion logic (Minor) — Changed if "decision" not in combined.lower() and "question" not in combined.lower() to or so both keywords must be present, matching the Behave test's AND-based validation.
Added NO_COLOR=1 to subprocess environment (Minor) — Added _make_subprocess_env() helper that copies os.environ and sets NO_COLOR=1, following the established pattern from robot/helper_e2e_common.py.
Extracted shared subprocess helper (Nit, addressed) — Deduplicated subprocess invocation blocks into _run_plan_explain(plan_id) function, reducing code duplication across the two test subcommands.
Added list_decisions mock verification (Nit, addressed) — Added svc.list_decisions.assert_called_once_with(plan_id) in the Behave step that checks decision details, making the test contract clearer.
Rebased on current master (Nit, addressed) — Branch rebased on origin/master per CONTRIBUTING.md §"Branch Hygiene".

Quality Gates

✅ nox -e lint — All checks passed
✅ nox -e typecheck — 0 errors
✅ nox -e unit_tests — 391 features, 11175 scenarios, 0 failed
✅ nox -e integration_tests — All tests passed (including Tdd Plan Explain Plan Id)
✅ nox -e e2e_tests — 16 tests, 0 failed
✅ nox -e coverage_report — 97% coverage (≥97% threshold met)

## Summary Add TDD bug-capture tests for bug #968: the `plan explain` CLI command fails with rc=1 when given a plan_id because `explain_decision_cmd` treats its argument as a decision_id and `svc.get_decision(plan_id)` raises `DecisionNotFoundError`. These tests capture the exact failure condition described in #968 and are tagged with `@tdd_expected_fail`, `@tdd_bug`, and `@tdd_bug_968`. The `@tdd_expected_fail` tag inverts the test result so that CI passes while the bug is unfixed. Once bug #968 is fixed, the `@tdd_expected_fail` tag will be removed and the tests will serve as permanent regression tests. Closes #978 ## Changes ### Behave Tests (`features/tdd_plan_explain_plan_id.feature`) - **Scenario 1: Plan explain succeeds when given a plan_id with decisions** — Mocks `DecisionService` so `get_decision(plan_id)` raises `DecisionNotFoundError` (the bug) and `list_decisions(plan_id)` returns decisions. Invokes `plan explain <plan_id>` via CliRunner and asserts rc=0 with decision details visible. Verifies `list_decisions` was called with the correct `plan_id`. - **Scenario 2: Plan explain with plan_id shows root decision question** — Same setup, asserts the root decision question appears in the output. Both scenarios currently fail (rc=1, proving the bug exists), but the `@tdd_expected_fail` tag makes CI pass. ### Robot Tests (`robot/tdd_plan_explain_plan_id.robot`) - **TDD Plan Explain Succeeds With Plan ID** — Integration test using a helper script that records decisions via `DecisionService`, invokes `plan explain <plan_id>` via subprocess, and asserts rc=0. - **TDD Plan Explain With Plan ID Shows Root Question** — Same setup, asserts the root decision question appears in the output. Both tests carry `tdd_expected_fail`, `tdd_bug`, and `tdd_bug_968` tags. ### Step Definitions (`features/steps/tdd_plan_explain_plan_id_steps.py`) Step implementations for the Behave feature, following the established patterns from `plan_explain_cli_coverage_steps.py`. Includes `list_decisions.assert_called_once_with(plan_id)` verification to ensure the fix exercises the fallback lookup path. ### Helper Script (`robot/helper_tdd_plan_explain_plan_id.py`) Python helper for Robot tests, following the established pattern from `helper_tdd_checkpoint_real_rollback.py`. Uses a shared `_run_plan_explain()` helper to avoid subprocess invocation duplication, sets `NO_COLOR=1` in subprocess environment to prevent ANSI escape codes. ### CHANGELOG Added entry under `## Unreleased` describing the TDD bug-capture tests for bug #968. ## Review Fixes Applied (Iteration 1) 1. **Imports moved to module level** (Major) — Three imports in `_setup_plan_with_decisions()` moved to module level with `# noqa: E402`, consistent with all other `robot/helper_tdd_*.py` files. 2. **Mock accurately models real behavior** (Minor) — `svc.get_decision.return_value = None` changed to `svc.get_decision.side_effect = DecisionNotFoundError(...)` to match the real `DecisionService.get_decision()` which raises `DecisionNotFoundError`, not returns `None`. 3. **Strengthened assertion** (Minor) — `assert "decision_id" in output or "question" in output` changed to AND for a more precise assertion that verifies both fields are present. 4. **Removed `# type: ignore[arg-type]`** (Minor) — Changed `kwargs: dict[str, object]` to `kwargs: dict` and removed the `# type: ignore[arg-type]` directive, per CONTRIBUTING.md type safety rules. 5. **Handled `subprocess.TimeoutExpired`** (Minor) — Both `subprocess.run()` calls wrapped in `try/except subprocess.TimeoutExpired` with descriptive `_fail()` messages. 6. **Resolved timeout race condition** (Minor) — Inner `subprocess.run` timeout reduced from 60s to 45s so the helper catches timeout before Robot's 60s outer timeout kills it. 7. **Defensive setup check** (Nit, addressed) — Added `list_decisions()` verification after `record_decision()` to distinguish setup failures from the actual bug. ## Review Fixes Applied (Iteration 2) 1. **Added CHANGELOG.md entry** (Major) — Added entry under `## Unreleased` describing the TDD bug-capture tests for bug #968, per CONTRIBUTING.md §"Pull Request Process". 2. **Fixed Gherkin step text and feature comments** (Minor) — Renamed step from "returns None" to "raises DecisionNotFoundError" in feature file and step decorator. Updated feature comment from "returns None" to "raises DecisionNotFoundError". Step text now accurately describes the mock behavior. 3. **Fixed Robot helper output assertion logic** (Minor) — Changed `if "decision" not in combined.lower() and "question" not in combined.lower()` to `or` so both keywords must be present, matching the Behave test's AND-based validation. 4. **Added `NO_COLOR=1` to subprocess environment** (Minor) — Added `_make_subprocess_env()` helper that copies `os.environ` and sets `NO_COLOR=1`, following the established pattern from `robot/helper_e2e_common.py`. 5. **Extracted shared subprocess helper** (Nit, addressed) — Deduplicated subprocess invocation blocks into `_run_plan_explain(plan_id)` function, reducing code duplication across the two test subcommands. 6. **Added `list_decisions` mock verification** (Nit, addressed) — Added `svc.list_decisions.assert_called_once_with(plan_id)` in the Behave step that checks decision details, making the test contract clearer. 7. **Rebased on current master** (Nit, addressed) — Branch rebased on `origin/master` per CONTRIBUTING.md §"Branch Hygiene". ## Quality Gates - ✅ `nox -e lint` — All checks passed - ✅ `nox -e typecheck` — 0 errors - ✅ `nox -e unit_tests` — 391 features, 11175 scenarios, 0 failed - ✅ `nox -e integration_tests` — All tests passed (including Tdd Plan Explain Plan Id) - ✅ `nox -e e2e_tests` — 16 tests, 0 failed - ✅ `nox -e coverage_report` — 97% coverage (≥97% threshold met)

hurui200320 added this to the v3.2.0 milestone 2026-03-18 09:21:06 +00:00

hurui200320 added the

Type

Testing

label 2026-03-18 09:21:06 +00:00

hurui200320 added a new dependency 2026-03-18 09:23:47 +00:00

#978 TDD: Write failing test for #968 — plan explain expects decision_id but test passes plan_id

hurui200320 force-pushed tdd/m3-plan-explain-plan-id from ea584830cb to e17f8b4f48

2026-03-18 11:41:09 +00:00

Compare

hurui200320 force-pushed tdd/m3-plan-explain-plan-id from e17f8b4f48 to 211618e1b3

2026-03-18 12:13:20 +00:00

Compare

freemo approved these changes 2026-03-19 04:55:16 +00:00

Dismissed

freemo left a comment

Code Review — PR #1052 `test: TDD bug-capture test for #968 — plan explain plan_id handling`

Clean TDD test with correct tag triplet (@tdd_expected_fail @tdd_bug @tdd_bug_968). Good coverage of both CliRunner and subprocess invocation paths. The mock accurately models real behavior — get_decision raises DecisionNotFoundError (not returns None), matching the actual DecisionService. The list_decisions.assert_called_once_with(plan_id) verification ensures the fallback lookup path is exercised.

Approved. No issues found.

## Code Review — PR #1052 `test: TDD bug-capture test for #968 — plan explain plan_id handling` Clean TDD test with correct tag triplet (`@tdd_expected_fail @tdd_bug @tdd_bug_968`). Good coverage of both CliRunner and subprocess invocation paths. The mock accurately models real behavior — `get_decision` raises `DecisionNotFoundError` (not returns None), matching the actual `DecisionService`. The `list_decisions.assert_called_once_with(plan_id)` verification ensures the fallback lookup path is exercised. **Approved.** No issues found.

freemo requested review from brent.edwards 2026-03-19 05:16:18 +00:00

freemo requested review from freemo 2026-03-19 05:16:18 +00:00

hurui200320 force-pushed tdd/m3-plan-explain-plan-id from 211618e1b3 to 058d486593

2026-03-19 07:13:27 +00:00

Compare

hurui200320 dismissed freemo's review 2026-03-19 07:13:27 +00:00

Reason:

New commits pushed, approval review dismissed automatically according to repository settings

hurui200320 scheduled this pull request to auto merge when all checks succeed 2026-03-19 07:18:37 +00:00

hurui200320 merged commit c741329f70 into master

2026-03-19 07:21:12 +00:00

hurui200320 deleted branch tdd/m3-plan-explain-plan-id