test: add TDD bug-capture test for #968 — plan explain plan_id handling #1052

Merged
hurui200320 merged 1 commit from tdd/m3-plan-explain-plan-id into master 2026-03-19 07:21:12 +00:00
Member

Summary

Add TDD bug-capture tests for bug #968: the plan explain CLI command fails with rc=1 when given a plan_id because explain_decision_cmd treats its argument as a decision_id and svc.get_decision(plan_id) raises DecisionNotFoundError.

These tests capture the exact failure condition described in #968 and are tagged with @tdd_expected_fail, @tdd_bug, and @tdd_bug_968. The @tdd_expected_fail tag inverts the test result so that CI passes while the bug is unfixed. Once bug #968 is fixed, the @tdd_expected_fail tag will be removed and the tests will serve as permanent regression tests.

Closes #978

Changes

Behave Tests (features/tdd_plan_explain_plan_id.feature)

  • Scenario 1: Plan explain succeeds when given a plan_id with decisions — Mocks DecisionService so get_decision(plan_id) raises DecisionNotFoundError (the bug) and list_decisions(plan_id) returns decisions. Invokes plan explain <plan_id> via CliRunner and asserts rc=0 with decision details visible. Verifies list_decisions was called with the correct plan_id.
  • Scenario 2: Plan explain with plan_id shows root decision question — Same setup, asserts the root decision question appears in the output.

Both scenarios currently fail (rc=1, proving the bug exists), but the @tdd_expected_fail tag makes CI pass.

Robot Tests (robot/tdd_plan_explain_plan_id.robot)

  • TDD Plan Explain Succeeds With Plan ID — Integration test using a helper script that records decisions via DecisionService, invokes plan explain <plan_id> via subprocess, and asserts rc=0.
  • TDD Plan Explain With Plan ID Shows Root Question — Same setup, asserts the root decision question appears in the output.

Both tests carry tdd_expected_fail, tdd_bug, and tdd_bug_968 tags.

Step Definitions (features/steps/tdd_plan_explain_plan_id_steps.py)

Step implementations for the Behave feature, following the established patterns from plan_explain_cli_coverage_steps.py. Includes list_decisions.assert_called_once_with(plan_id) verification to ensure the fix exercises the fallback lookup path.

Helper Script (robot/helper_tdd_plan_explain_plan_id.py)

Python helper for Robot tests, following the established pattern from helper_tdd_checkpoint_real_rollback.py. Uses a shared _run_plan_explain() helper to avoid subprocess invocation duplication, sets NO_COLOR=1 in subprocess environment to prevent ANSI escape codes.

CHANGELOG

Added entry under ## Unreleased describing the TDD bug-capture tests for bug #968.

Review Fixes Applied (Iteration 1)

  1. Imports moved to module level (Major) — Three imports in _setup_plan_with_decisions() moved to module level with # noqa: E402, consistent with all other robot/helper_tdd_*.py files.
  2. Mock accurately models real behavior (Minor) — svc.get_decision.return_value = None changed to svc.get_decision.side_effect = DecisionNotFoundError(...) to match the real DecisionService.get_decision() which raises DecisionNotFoundError, not returns None.
  3. Strengthened assertion (Minor) — assert "decision_id" in output or "question" in output changed to AND for a more precise assertion that verifies both fields are present.
  4. Removed # type: ignore[arg-type] (Minor) — Changed kwargs: dict[str, object] to kwargs: dict and removed the # type: ignore[arg-type] directive, per CONTRIBUTING.md type safety rules.
  5. Handled subprocess.TimeoutExpired (Minor) — Both subprocess.run() calls wrapped in try/except subprocess.TimeoutExpired with descriptive _fail() messages.
  6. Resolved timeout race condition (Minor) — Inner subprocess.run timeout reduced from 60s to 45s so the helper catches timeout before Robot's 60s outer timeout kills it.
  7. Defensive setup check (Nit, addressed) — Added list_decisions() verification after record_decision() to distinguish setup failures from the actual bug.

Review Fixes Applied (Iteration 2)

  1. Added CHANGELOG.md entry (Major) — Added entry under ## Unreleased describing the TDD bug-capture tests for bug #968, per CONTRIBUTING.md §"Pull Request Process".
  2. Fixed Gherkin step text and feature comments (Minor) — Renamed step from "returns None" to "raises DecisionNotFoundError" in feature file and step decorator. Updated feature comment from "returns None" to "raises DecisionNotFoundError". Step text now accurately describes the mock behavior.
  3. Fixed Robot helper output assertion logic (Minor) — Changed if "decision" not in combined.lower() and "question" not in combined.lower() to or so both keywords must be present, matching the Behave test's AND-based validation.
  4. Added NO_COLOR=1 to subprocess environment (Minor) — Added _make_subprocess_env() helper that copies os.environ and sets NO_COLOR=1, following the established pattern from robot/helper_e2e_common.py.
  5. Extracted shared subprocess helper (Nit, addressed) — Deduplicated subprocess invocation blocks into _run_plan_explain(plan_id) function, reducing code duplication across the two test subcommands.
  6. Added list_decisions mock verification (Nit, addressed) — Added svc.list_decisions.assert_called_once_with(plan_id) in the Behave step that checks decision details, making the test contract clearer.
  7. Rebased on current master (Nit, addressed) — Branch rebased on origin/master per CONTRIBUTING.md §"Branch Hygiene".

Quality Gates

  • nox -e lint — All checks passed
  • nox -e typecheck — 0 errors
  • nox -e unit_tests — 391 features, 11175 scenarios, 0 failed
  • nox -e integration_tests — All tests passed (including Tdd Plan Explain Plan Id)
  • nox -e e2e_tests — 16 tests, 0 failed
  • nox -e coverage_report — 97% coverage (≥97% threshold met)
## Summary Add TDD bug-capture tests for bug #968: the `plan explain` CLI command fails with rc=1 when given a plan_id because `explain_decision_cmd` treats its argument as a decision_id and `svc.get_decision(plan_id)` raises `DecisionNotFoundError`. These tests capture the exact failure condition described in #968 and are tagged with `@tdd_expected_fail`, `@tdd_bug`, and `@tdd_bug_968`. The `@tdd_expected_fail` tag inverts the test result so that CI passes while the bug is unfixed. Once bug #968 is fixed, the `@tdd_expected_fail` tag will be removed and the tests will serve as permanent regression tests. Closes #978 ## Changes ### Behave Tests (`features/tdd_plan_explain_plan_id.feature`) - **Scenario 1: Plan explain succeeds when given a plan_id with decisions** — Mocks `DecisionService` so `get_decision(plan_id)` raises `DecisionNotFoundError` (the bug) and `list_decisions(plan_id)` returns decisions. Invokes `plan explain <plan_id>` via CliRunner and asserts rc=0 with decision details visible. Verifies `list_decisions` was called with the correct `plan_id`. - **Scenario 2: Plan explain with plan_id shows root decision question** — Same setup, asserts the root decision question appears in the output. Both scenarios currently fail (rc=1, proving the bug exists), but the `@tdd_expected_fail` tag makes CI pass. ### Robot Tests (`robot/tdd_plan_explain_plan_id.robot`) - **TDD Plan Explain Succeeds With Plan ID** — Integration test using a helper script that records decisions via `DecisionService`, invokes `plan explain <plan_id>` via subprocess, and asserts rc=0. - **TDD Plan Explain With Plan ID Shows Root Question** — Same setup, asserts the root decision question appears in the output. Both tests carry `tdd_expected_fail`, `tdd_bug`, and `tdd_bug_968` tags. ### Step Definitions (`features/steps/tdd_plan_explain_plan_id_steps.py`) Step implementations for the Behave feature, following the established patterns from `plan_explain_cli_coverage_steps.py`. Includes `list_decisions.assert_called_once_with(plan_id)` verification to ensure the fix exercises the fallback lookup path. ### Helper Script (`robot/helper_tdd_plan_explain_plan_id.py`) Python helper for Robot tests, following the established pattern from `helper_tdd_checkpoint_real_rollback.py`. Uses a shared `_run_plan_explain()` helper to avoid subprocess invocation duplication, sets `NO_COLOR=1` in subprocess environment to prevent ANSI escape codes. ### CHANGELOG Added entry under `## Unreleased` describing the TDD bug-capture tests for bug #968. ## Review Fixes Applied (Iteration 1) 1. **Imports moved to module level** (Major) — Three imports in `_setup_plan_with_decisions()` moved to module level with `# noqa: E402`, consistent with all other `robot/helper_tdd_*.py` files. 2. **Mock accurately models real behavior** (Minor) — `svc.get_decision.return_value = None` changed to `svc.get_decision.side_effect = DecisionNotFoundError(...)` to match the real `DecisionService.get_decision()` which raises `DecisionNotFoundError`, not returns `None`. 3. **Strengthened assertion** (Minor) — `assert "decision_id" in output or "question" in output` changed to AND for a more precise assertion that verifies both fields are present. 4. **Removed `# type: ignore[arg-type]`** (Minor) — Changed `kwargs: dict[str, object]` to `kwargs: dict` and removed the `# type: ignore[arg-type]` directive, per CONTRIBUTING.md type safety rules. 5. **Handled `subprocess.TimeoutExpired`** (Minor) — Both `subprocess.run()` calls wrapped in `try/except subprocess.TimeoutExpired` with descriptive `_fail()` messages. 6. **Resolved timeout race condition** (Minor) — Inner `subprocess.run` timeout reduced from 60s to 45s so the helper catches timeout before Robot's 60s outer timeout kills it. 7. **Defensive setup check** (Nit, addressed) — Added `list_decisions()` verification after `record_decision()` to distinguish setup failures from the actual bug. ## Review Fixes Applied (Iteration 2) 1. **Added CHANGELOG.md entry** (Major) — Added entry under `## Unreleased` describing the TDD bug-capture tests for bug #968, per CONTRIBUTING.md §"Pull Request Process". 2. **Fixed Gherkin step text and feature comments** (Minor) — Renamed step from "returns None" to "raises DecisionNotFoundError" in feature file and step decorator. Updated feature comment from "returns None" to "raises DecisionNotFoundError". Step text now accurately describes the mock behavior. 3. **Fixed Robot helper output assertion logic** (Minor) — Changed `if "decision" not in combined.lower() and "question" not in combined.lower()` to `or` so both keywords must be present, matching the Behave test's AND-based validation. 4. **Added `NO_COLOR=1` to subprocess environment** (Minor) — Added `_make_subprocess_env()` helper that copies `os.environ` and sets `NO_COLOR=1`, following the established pattern from `robot/helper_e2e_common.py`. 5. **Extracted shared subprocess helper** (Nit, addressed) — Deduplicated subprocess invocation blocks into `_run_plan_explain(plan_id)` function, reducing code duplication across the two test subcommands. 6. **Added `list_decisions` mock verification** (Nit, addressed) — Added `svc.list_decisions.assert_called_once_with(plan_id)` in the Behave step that checks decision details, making the test contract clearer. 7. **Rebased on current master** (Nit, addressed) — Branch rebased on `origin/master` per CONTRIBUTING.md §"Branch Hygiene". ## Quality Gates - ✅ `nox -e lint` — All checks passed - ✅ `nox -e typecheck` — 0 errors - ✅ `nox -e unit_tests` — 391 features, 11175 scenarios, 0 failed - ✅ `nox -e integration_tests` — All tests passed (including Tdd Plan Explain Plan Id) - ✅ `nox -e e2e_tests` — 16 tests, 0 failed - ✅ `nox -e coverage_report` — 97% coverage (≥97% threshold met)
hurui200320 added this to the v3.2.0 milestone 2026-03-18 09:21:06 +00:00
hurui200320 force-pushed tdd/m3-plan-explain-plan-id from ea584830cb
All checks were successful
CI / lint (pull_request) Successful in 16s
CI / typecheck (pull_request) Successful in 48s
CI / quality (pull_request) Successful in 25s
CI / security (pull_request) Successful in 49s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 14s
CI / e2e_tests (pull_request) Successful in 3m54s
CI / unit_tests (pull_request) Successful in 5m16s
CI / integration_tests (pull_request) Successful in 5m43s
CI / docker (pull_request) Successful in 55s
CI / coverage (pull_request) Successful in 6m44s
CI / benchmark-regression (pull_request) Successful in 38m10s
to e17f8b4f48
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 17s
CI / lint (pull_request) Successful in 24s
CI / quality (pull_request) Successful in 29s
CI / security (pull_request) Successful in 57s
CI / typecheck (pull_request) Successful in 1m4s
CI / unit_tests (pull_request) Successful in 3m0s
CI / integration_tests (pull_request) Successful in 3m38s
CI / docker (pull_request) Successful in 1m0s
CI / e2e_tests (pull_request) Successful in 4m14s
CI / coverage (pull_request) Successful in 6m37s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-18 11:41:09 +00:00
Compare
hurui200320 force-pushed tdd/m3-plan-explain-plan-id from e17f8b4f48
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 17s
CI / lint (pull_request) Successful in 24s
CI / quality (pull_request) Successful in 29s
CI / security (pull_request) Successful in 57s
CI / typecheck (pull_request) Successful in 1m4s
CI / unit_tests (pull_request) Successful in 3m0s
CI / integration_tests (pull_request) Successful in 3m38s
CI / docker (pull_request) Successful in 1m0s
CI / e2e_tests (pull_request) Successful in 4m14s
CI / coverage (pull_request) Successful in 6m37s
CI / benchmark-regression (pull_request) Has been cancelled
to 211618e1b3
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 24s
CI / quality (pull_request) Successful in 28s
CI / typecheck (pull_request) Successful in 39s
CI / security (pull_request) Successful in 56s
CI / unit_tests (pull_request) Successful in 3m25s
CI / integration_tests (pull_request) Successful in 3m42s
CI / e2e_tests (pull_request) Successful in 4m14s
CI / docker (pull_request) Successful in 54s
CI / coverage (pull_request) Successful in 6m45s
CI / benchmark-regression (pull_request) Successful in 38m39s
2026-03-18 12:13:20 +00:00
Compare
freemo approved these changes 2026-03-19 04:55:16 +00:00
Dismissed
freemo left a comment

Code Review — PR #1052 test: TDD bug-capture test for #968 — plan explain plan_id handling

Clean TDD test with correct tag triplet (@tdd_expected_fail @tdd_bug @tdd_bug_968). Good coverage of both CliRunner and subprocess invocation paths. The mock accurately models real behavior — get_decision raises DecisionNotFoundError (not returns None), matching the actual DecisionService. The list_decisions.assert_called_once_with(plan_id) verification ensures the fallback lookup path is exercised.

Approved. No issues found.

## Code Review — PR #1052 `test: TDD bug-capture test for #968 — plan explain plan_id handling` Clean TDD test with correct tag triplet (`@tdd_expected_fail @tdd_bug @tdd_bug_968`). Good coverage of both CliRunner and subprocess invocation paths. The mock accurately models real behavior — `get_decision` raises `DecisionNotFoundError` (not returns None), matching the actual `DecisionService`. The `list_decisions.assert_called_once_with(plan_id)` verification ensures the fallback lookup path is exercised. **Approved.** No issues found.
hurui200320 force-pushed tdd/m3-plan-explain-plan-id from 211618e1b3
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 24s
CI / quality (pull_request) Successful in 28s
CI / typecheck (pull_request) Successful in 39s
CI / security (pull_request) Successful in 56s
CI / unit_tests (pull_request) Successful in 3m25s
CI / integration_tests (pull_request) Successful in 3m42s
CI / e2e_tests (pull_request) Successful in 4m14s
CI / docker (pull_request) Successful in 54s
CI / coverage (pull_request) Successful in 6m45s
CI / benchmark-regression (pull_request) Successful in 38m39s
to 058d486593
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 17s
CI / quality (pull_request) Successful in 41s
CI / typecheck (pull_request) Successful in 46s
CI / security (pull_request) Successful in 53s
CI / unit_tests (pull_request) Successful in 3m24s
CI / integration_tests (pull_request) Successful in 3m50s
CI / docker (pull_request) Successful in 58s
CI / e2e_tests (pull_request) Successful in 5m12s
CI / coverage (pull_request) Successful in 6m55s
CI / benchmark-regression (pull_request) Successful in 38m8s
2026-03-19 07:13:27 +00:00
Compare
hurui200320 dismissed freemo's review 2026-03-19 07:13:27 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

hurui200320 scheduled this pull request to auto merge when all checks succeed 2026-03-19 07:18:37 +00:00
hurui200320 deleted branch tdd/m3-plan-explain-plan-id 2026-03-19 07:21:12 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core!1052
No description provided.