UAT: [plan-lifecycle] Parallel fail_fast leaves queued subplans in complete state #7394

Open
opened 2026-04-10 18:54:22 +00:00 by HAL9000 · 1 comment
Owner

What was tested

  • agents plan execute fail_fast handling in parallel mode (behave scenario Parallel fail_fast marks unstarted futures as CANCELLED not ERRORED)

Expected behavior (per plan lifecycle spec)

  • When fail_fast is enabled and the first parallel subplan errors, any subplans that have not started yet must transition to the CANCELLED state so that downstream tooling can distinguish skipped work from successful completions.
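The expected transition rule can be written as a small pure function. This is a minimal sketch, not the project's implementation: `SubplanStatus` and `expected_statuses` are hypothetical names, and it models only the `max_parallel=1` case, where subplans start one at a time in order.

```python
from enum import Enum


class SubplanStatus(Enum):
    # Hypothetical status enum; only the state names come from the report.
    COMPLETE = "complete"
    ERRORED = "errored"
    CANCELLED = "cancelled"


def expected_statuses(outcomes, fail_fast=True):
    """Return the status each subplan should end in.

    outcomes: list of bools in start order, True if the subplan would
    succeed when run. Assumes max_parallel=1 (one subplan at a time).
    """
    statuses = []
    failed = False
    for ok in outcomes:
        if failed and fail_fast:
            # Never started because an earlier subplan errored.
            statuses.append(SubplanStatus.CANCELLED)
        elif ok:
            statuses.append(SubplanStatus.COMPLETE)
        else:
            statuses.append(SubplanStatus.ERRORED)
            failed = True
    return statuses
```

For the scenario in this report, `expected_statuses([False, True, True])` yields errored, cancelled, cancelled; none of the unstarted subplans may end in `complete`.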

Actual behavior

  • Remaining subplans end up in the complete state instead of cancelled. The automated test fails with:
    ASSERT FAILED: Expected CANCELLED for 01HGZ6FE0AQDYTR4BXVQZ6EB00, got complete
    
    This leaves plan inspectors believing the work finished successfully even though it never began.

Steps to reproduce

  1. Create a plan that schedules three subplans in parallel execution mode with max_parallel=1 and fail_fast enabled (see features/subplan_execution.feature scenario for reference).
  2. Configure the first subplan to raise a ValidationError: schema mismatch when executed.
  3. Execute the plan via agents plan execute <PLAN_ID>.
  4. Inspect the execution statuses of the second and third subplans (e.g., via the plan lifecycle service or CLI status output).
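The same scenario can be simulated outside the CLI with the standard library. This is a sketch under assumptions: it uses `concurrent.futures` rather than the project's executor, and `run_parallel_fail_fast` and the local `ValidationError` class are hypothetical stand-ins. It shows that three tasks with one worker and fail-fast cancellation end as errored, cancelled, cancelled.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ValidationError(Exception):
    """Stand-in for the error raised by the first subplan."""


def run_parallel_fail_fast(tasks, max_parallel=1):
    """Run tasks; once one raises, cancel every future not yet started.

    Returns one status string per task, in submission order.
    """
    gate = threading.Event()  # holds workers until callbacks are registered

    def gated(task):
        gate.wait()
        return task()

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(gated, t) for t in tasks]

        def cancel_rest(done):
            # Done-callbacks also fire for cancelled futures; skip those.
            if not done.cancelled() and done.exception() is not None:
                for other in futures:
                    other.cancel()  # no effect on running/finished futures

        for f in futures:
            f.add_done_callback(cancel_rest)
        gate.set()
    # Leaving the with-block waits for all started work to finish.

    statuses = []
    for f in futures:
        if f.cancelled():
            statuses.append("cancelled")
        elif f.exception() is not None:
            statuses.append("errored")
        else:
            statuses.append("complete")
    return statuses


def failing_subplan():
    raise ValidationError("schema mismatch")


def passing_subplan():
    return "ok"


statuses = run_parallel_fail_fast(
    [failing_subplan, passing_subplan, passing_subplan], max_parallel=1
)
```

The cancellation runs in a done-callback because the worker thread invokes it before picking up the next queued task, so the queued futures are cancelled before they can start.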

Observed result

  • Subplans 2 and 3 both report complete even though neither was ever started; both should report cancelled.

Impact

  • Consumers relying on status codes cannot distinguish skipped work from successful completion when fail_fast triggers, violating the plan lifecycle specification and breaking downstream automation.

Automated by CleverAgents Bot
Supervisor: UAT Test Pool | Agent: uat-test-pool-supervisor

HAL9000 added this to the v3.3.0 milestone 2026-04-10 20:26:54 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified — UAT confirmed: parallel fail_fast leaves subplans in complete instead of cancelled
  • Priority: Priority/Critical — incorrect state transitions break downstream automation and violate the plan lifecycle spec
  • Milestone: v3.3.0 — Parallel subplan execution and status tracking is a core acceptance criterion for v3.3.0
  • Type: Type/Bug
  • MoSCoW: Must Have — "Parent plan tracks all subplan statuses" is a v3.3.0 acceptance criterion

The fix: when fail_fast triggers, unstarted subplans must transition to CANCELLED state, not complete.
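One way unstarted work can be misreported is a status mapping that treats "done" as "complete". The sketch below illustrates that pitfall with `concurrent.futures.Future`, whose `done()` also returns True for cancelled futures. It is not a diagnosis of the actual root cause: whether the plan executor uses this library is an assumption, and `subplan_status`, `naive_status`, and the `pending` state are hypothetical.

```python
from concurrent.futures import Future


def naive_status(future):
    # Pitfall: done() is True for cancelled futures as well as finished ones.
    return "complete" if future.done() else "pending"


def subplan_status(future):
    # Check cancellation first so skipped work is never reported as complete.
    if future.cancelled():
        return "cancelled"
    if not future.done():
        return "pending"
    if future.exception() is not None:
        return "errored"
    return "complete"


# A future that was cancelled before it ever started running.
skipped = Future()
was_cancelled = skipped.cancel()

naive = naive_status(skipped)
correct = subplan_status(skipped)
```

Here `naive` is "complete" while `correct` is "cancelled", which matches the mismatch described in this issue.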


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7394