test: add TDD bug-capture test for #967 — plan execute phase processing #1050

2026-03-18T08:37:12Z

hurui200320 commented

2026-03-18 08:37:12 +00:00

Summary

This PR adds TDD bug-capture tests for bug #967 — plan execute only transitions state without running strategize or execute phase processing.

Motivation

Bug #967 reports that the plan execute CLI command originally called only service.execute_plan(plan_id), which performs a state transition (Strategize/COMPLETE → Execute/QUEUED) and nothing else. When a plan was still in Strategize/QUEUED (its state immediately after plan use), the command failed because execute_plan() requires Strategize/COMPLETE. The CLI should instead detect the plan's current phase and run PlanExecutor.run_strategize() before transitioning.
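The difference between the buggy and the fixed command flow can be sketched as follows. This is a minimal illustration, not the project's code: `Phase`, `Plan`, `StubService`, `StubExecutor`, `buggy_execute`, and `fixed_execute` are hypothetical stand-ins, and only the method names `execute_plan()`, `run_strategize()`, and `run_execute()` and the names `ProcessingState` and `PlanNotReadyError` come from this PR.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):  # hypothetical enum, for illustration only
    STRATEGIZE = "strategize"
    EXECUTE = "execute"


class ProcessingState(Enum):  # name taken from the PR; members assumed
    QUEUED = "queued"
    COMPLETE = "complete"


class PlanNotReadyError(Exception):
    """Raised when a transition is requested from the wrong phase/state."""


@dataclass
class Plan:  # hypothetical, simplified plan record
    plan_id: str
    phase: Phase
    state: ProcessingState


class StubService:
    """Hypothetical stand-in for the lifecycle service."""

    def __init__(self, plan):
        self.plan = plan

    def get_plan(self, plan_id):
        return self.plan

    def execute_plan(self, plan_id):
        # State transition only: Strategize/COMPLETE -> Execute/QUEUED.
        current = (self.plan.phase, self.plan.state)
        if current != (Phase.STRATEGIZE, ProcessingState.COMPLETE):
            raise PlanNotReadyError("plan must be in Strategize/COMPLETE")
        self.plan.phase = Phase.EXECUTE
        self.plan.state = ProcessingState.QUEUED


class StubExecutor:
    """Hypothetical stand-in for PlanExecutor; each run just marks COMPLETE."""

    def __init__(self, service):
        self.service = service

    def run_strategize(self, plan_id):
        self.service.plan.state = ProcessingState.COMPLETE

    def run_execute(self, plan_id):
        self.service.plan.state = ProcessingState.COMPLETE


def buggy_execute(service, plan_id):
    # Pre-fix behaviour as described in #967: transition only.
    service.execute_plan(plan_id)


def fixed_execute(service, executor, plan_id):
    # Detect the current phase, strategize if needed, then transition and run.
    plan = service.get_plan(plan_id)
    if plan.phase is Phase.STRATEGIZE and plan.state is ProcessingState.QUEUED:
        executor.run_strategize(plan_id)
    service.execute_plan(plan_id)
    executor.run_execute(plan_id)
```

With a plan in Strategize/QUEUED, `buggy_execute` raises `PlanNotReadyError`, while `fixed_execute` leaves the plan in the Execute phase.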

Per the project's TDD Bug Fix Workflow (CONTRIBUTING.md), the first step in fixing any bug is to write a test that captures the buggy behavior. Because the fix for #967 is already present in the codebase (the CLI handler already orchestrates properly), the @tdd_expected_fail tags have been removed, and these tests now serve as permanent regression guards: they fail if the fix is ever reverted.

Design Approach

All @tdd_expected_fail scenarios were rewritten to exercise the CLI orchestration layer — the actual code path affected by bug #967. This satisfies AC4: "The test is specific enough that it will pass normally (without the tag) only when the bug is genuinely fixed."

Scenarios 1, 2, 4 (previously @tdd_expected_fail): Use Typer's CliRunner with mocked services to invoke the plan execute CLI command handler directly. This tests the orchestration logic in plan.py — the exact code that was buggy.
Scenario 3 (positive control): Uses real PlanLifecycleService (in-memory) and PlanExecutor (stub actors) to demonstrate that proper service-level orchestration works.
Robot tests: Replicate the CLI orchestration logic using real services to verify at the integration level.
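The mocked-service approach used for scenarios 1, 2, and 4 can be illustrated with a standard-library-only sketch. The PR drives the real handler through Typer's CliRunner. To stay self-contained, this sketch instead calls a hypothetical `execute_command` function directly, and the string values for phase and state are assumptions.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock, call


def execute_command(service, executor, plan_id):
    """Hypothetical stand-in for the plan execute handler in plan.py."""
    plan = service.get_plan(plan_id)
    if (plan.phase, plan.state) == ("strategize", "queued"):
        executor.run_strategize(plan_id)
    service.execute_plan(plan_id)
    executor.run_execute(plan_id)


def test_queued_plan_runs_full_lifecycle():
    # A single parent mock records calls on its children in order,
    # so the orchestration sequence can be asserted in one comparison.
    deps = MagicMock()
    deps.service.get_plan.return_value = SimpleNamespace(
        phase="strategize", state="queued"
    )

    execute_command(deps.service, deps.executor, "p1")

    assert deps.mock_calls == [
        call.service.get_plan("p1"),
        call.executor.run_strategize("p1"),
        call.service.execute_plan("p1"),
        call.executor.run_execute("p1"),
    ]
    return deps
```

Asserting on the ordered call list, rather than on the final state alone, is what ties the test to the orchestration layer: a handler that only calls `execute_plan()` produces a different call sequence and fails the comparison.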

Changes

Behave Unit Tests

features/tdd_plan_execute_phase_processing.feature — 4 scenarios
features/steps/tdd_plan_execute_phase_processing_steps.py — Step definitions using CliRunner and mocked services

Scenarios:

1. CLI execute command handles plan in Strategize/QUEUED state: Invokes plan execute via CliRunner on a QUEUED plan. Verifies the CLI succeeds and the plan reaches Execute phase.
2. CLI execute command orchestrates full lifecycle for QUEUED plan: Verifies run_strategize() and run_execute() are both called by the CLI handler.
3. Positive control — proper orchestration transitions QUEUED plan to Execute: Demonstrates that run_strategize() → execute_plan() works correctly at the service level. Always passes.
4. CLI auto-discovery finds plans in Strategize/QUEUED state: Invokes plan execute with no plan_id. Verifies the auto-discovery filter includes QUEUED plans.
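The auto-discovery check in the last scenario can be pictured with a small filter. This is a sketch under stated assumptions: `PlanSummary`, `RUNNABLE`, and `discover_runnable_plans` are hypothetical names, and the exact set of phase/state pairs the real filter accepts is not given in this PR beyond the requirement that Strategize/QUEUED plans are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanSummary:  # hypothetical, simplified plan record
    plan_id: str
    phase: str
    state: str


# Assumed set of (phase, state) pairs that plan execute can pick up.
# The PR only confirms that Strategize/QUEUED must be part of it.
RUNNABLE = {
    ("strategize", "queued"),
    ("strategize", "complete"),
}


def discover_runnable_plans(plans):
    """Return the plans that plan execute could select when no plan_id is given."""
    return [p for p in plans if (p.phase, p.state) in RUNNABLE]
```

A regression test in this style builds a list containing a Strategize/QUEUED plan and asserts that it survives the filter.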

Robot Integration Tests

robot/tdd_plan_execute_phase_processing.robot — 4 test cases matching the Behave scenarios
robot/helper_tdd_plan_execute_phase_processing.py — Helper script replicating CLI orchestration logic

CHANGELOG

CHANGELOG.md — Added entry under "## Unreleased" describing the new tests.

Quality Gates

nox -e lint: ✅ passed
nox -e typecheck: ✅ passed (0 errors)
nox -e unit_tests: ✅ passed (391 features, 11177 scenarios, 0 failures)
nox -e integration_tests: ✅ passed (1572 tests, 0 failures)
nox -e e2e_tests: ✅ passed (16 tests, 0 failures)
nox -e coverage_report: ✅ 97% coverage

Review Cycle 2 Fixes

Critical #1: Rewrote @tdd_expected_fail scenarios to exercise the CLI orchestration layer via CliRunner instead of testing service/executor APIs that are correct by design. Removed @tdd_expected_fail tags since the bug fix is already in the codebase.
Major #2: Added CHANGELOG entry.
Major #3: Rebased onto current master.
Minor #4: Added ProcessingState.QUEUED assertion to positive control scenario.
Minor #5: Narrowed exception handler from bare except Exception to except (PlanError, PlanNotReadyError).
Minor #7: Changed Robot suite to use Setup Test Environment With Database Isolation.
Minor #8: Added on_timeout=kill to all Robot Run Process calls.
Minor #12: Added docstring to _fail() helper.
Minor #13: Removed redundant Settings() instantiation.
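The effect of Minor #5 can be shown with a short sketch. The exception names `PlanError` and `PlanNotReadyError` come from the fix list above. Their class hierarchy, the `run_step` helper, and the exit-code convention are assumptions made for illustration, and the PR does not show where the real handler lives.

```python
class PlanError(Exception):
    """Assumed base type for plan failures (hierarchy not shown in the PR)."""


class PlanNotReadyError(Exception):
    """Assumed type for transitions requested from the wrong state."""


def run_step(step):
    """Run a callable and map expected plan errors to exit code 1.

    Only the two plan-specific exceptions are caught. Anything else,
    such as a programming error, propagates instead of being hidden.
    """
    try:
        step()
    except (PlanError, PlanNotReadyError) as exc:
        print(f"plan execute failed: {exc}")
        return 1
    return 0
```

With a bare `except Exception`, an unrelated `TypeError` or `AttributeError` inside the step would be reported as an ordinary plan failure. The narrowed handler lets such errors surface in the test run.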

Known Limitations

The @tdd_expected_fail tags were removed because the bug fix for #967 is already in the codebase. If this PR were merged before the #967 fix, the tags would have to be re-added. However, the CHANGELOG and the existing CLI code confirm that the fix is already on master.

Closes #977


hurui200320 added the Type/Testing label 2026-03-18 08:37:19 +00:00

hurui200320 added this to the v3.2.0 milestone 2026-03-18 08:37:21 +00:00

hurui200320 added a new dependency 2026-03-18 08:38:26 +00:00

#977 TDD: Write failing test for #967 — plan execute only transitions state, no phase processing

hurui200320 force-pushed tdd/m3-plan-execute-phase-processing from 49d403a004 to 99d7004480

2026-03-18 11:48:53 +00:00


hurui200320 force-pushed tdd/m3-plan-execute-phase-processing from 99d7004480 to a641dae5bf

2026-03-18 13:43:33 +00:00


hurui200320 referenced this pull request

2026-03-18 13:53:55 +00:00

TDD: Write failing test for #967 — plan execute only transitions state, no phase processing #977

freemo approved these changes 2026-03-19 04:55:08 +00:00

freemo left a comment

Code Review — PR #1050 `test: TDD bug-capture test for #967 — plan execute phase processing`

Comprehensive 4-scenario test suite. Removing @tdd_expected_fail is well justified, since the bug fix for #967 is already on master and these tests now serve as permanent regression guards. The step file is well structured, with helper functions for constructing Plan domain objects. Good documentation of 13 fixes from 2 review cycles.

Approved. No issues found.


freemo requested review from brent.edwards 2026-03-19 05:16:11 +00:00

freemo requested review from freemo 2026-03-19 05:16:11 +00:00

hurui200320 added 1 commit 2026-03-19 06:33:06 +00:00

Merge branch 'master' into tdd/m3-plan-execute-phase-processing

CI / lint (pull_request) Successful in 22s

CI / benchmark-publish (pull_request) Has been skipped

CI / quality (pull_request) Successful in 32s

CI / build (pull_request) Successful in 25s

CI / security (pull_request) Successful in 53s

CI / typecheck (pull_request) Successful in 1m4s

CI / unit_tests (pull_request) Successful in 3m42s

CI / docker (pull_request) Successful in 9s

CI / e2e_tests (pull_request) Successful in 3m53s

CI / integration_tests (pull_request) Successful in 5m10s

CI / coverage (pull_request) Successful in 7m12s

CI / benchmark-regression (pull_request) Successful in 40m6s

468c84ec16

hurui200320 scheduled this pull request to auto merge when all checks succeed 2026-03-19 06:33:35 +00:00

hurui200320 merged commit 774dfedc6b into master

2026-03-19 06:41:27 +00:00

hurui200320 deleted branch tdd/m3-plan-execute-phase-processing

2026-03-19 06:41:27 +00:00

hurui200320 referenced this issue from a commit

2026-03-19 06:41:28 +00:00

test: add TDD bug-capture test for #967 — plan execute phase processing (#1050)

CoreRasurae referenced this pull request

2026-03-19 21:55:35 +00:00

refactor(cli): align actor run signature with spec positional args #1072

CoreRasurae referenced this pull request

2026-03-19 22:01:27 +00:00

feat(tool): add move_file, create_directory, and get_file_info built-in tools #1070


Blocks

#977 TDD: Write failing test for #967 — plan execute only transitions state, no phase processing

cleveragents/cleveragents-core

Reference: cleveragents/cleveragents-core#1050
