feat(plan): implement parallel subplan execution with configurable concurrency limits #9335

Open
opened 2026-04-14 15:05:50 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Branch: feat/plan-parallel-subplan-execution-scaling
  • Commit Message: feat(plan): implement parallel subplan execution with configurable concurrency limits
  • Milestone: v3.5.0

Background and Context

The v3.5.0 milestone (M6: Autonomy Hardening) requires parallel execution to scale to 10+ concurrent subplans as part of the full autonomy acceptance flow. Currently, subplan execution is either sequential or lacks the concurrency infrastructure needed to safely run many subplans simultaneously without deadlocks, resource exhaustion, or loss of execution state visibility.

To support large-scale autonomous tasks (e.g., porting a substantial codebase), the plan execution engine must be able to dispatch multiple subplans concurrently, enforce configurable concurrency limits, track per-subplan state, and handle failures gracefully — all while reporting progress in real time.

This issue implements the parallel subplan execution subsystem, including the semaphore-based concurrency limiter, execution state machine, failure handling modes, and a benchmark test verifying correctness at 10+ concurrent subplans.

Expected Behavior

When a parent plan spawns multiple subplans, the execution engine:

  1. Dispatches subplans concurrently up to the configured max_parallel limit (default: 4, max: 32).
  2. Uses an asyncio.Semaphore (or ThreadPoolExecutor with bounded workers) to prevent exceeding the concurrency cap.
  3. Tracks each subplan through a well-defined state machine: pending → running → completed | failed.
  4. In fail-fast mode, cancels all remaining running/pending subplans as soon as one fails.
  5. In continue-on-error mode, allows remaining subplans to finish and surfaces all failures at the end.
  6. Blocks the parent plan until all subplans have reached a terminal state (completed or failed).
  7. Reports progress continuously: which subplans are running, which have completed, and which have failed.
  8. Handles 10+ concurrent subplans without deadlocks, starvation, or resource exhaustion.
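
The dispatch and state-tracking behavior listed above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `run_bounded`, `SubplanState`, and the `on_event` callback are hypothetical names, and subplans are assumed to be zero-argument coroutine functions keyed by name.

```python
import asyncio
from enum import Enum


class SubplanState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


async def run_bounded(subplans, max_parallel=4, on_event=print):
    """Run subplans concurrently, with at most max_parallel running at once.

    subplans: dict mapping a name to a zero-argument coroutine function
    (an assumption made for this sketch).
    """
    semaphore = asyncio.Semaphore(max_parallel)
    states = {name: SubplanState.PENDING for name in subplans}

    def transition(name, new_state):
        # Record the new state and emit a progress event for it.
        states[name] = new_state
        on_event((name, new_state.value))

    async def worker(name, factory):
        # The semaphore is held for the whole run and released
        # once the subplan reaches a terminal state.
        async with semaphore:
            transition(name, SubplanState.RUNNING)
            try:
                await factory()
            except Exception:
                transition(name, SubplanState.FAILED)
            else:
                transition(name, SubplanState.COMPLETED)

    # gather() returns only after every worker has finished, so the
    # caller blocks until all subplans are in a terminal state.
    await asyncio.gather(*(worker(n, f) for n, f in subplans.items()))
    return states
```

Here failures are only recorded, which is closer to continue-on-error mode; fail-fast cancellation is not shown in this sketch.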

Acceptance Criteria

  • The max_parallel setting is configurable in the automation profile (default: 4, max: 32); values outside the allowed range are rejected with a clear error.
  • Subplan execution uses asyncio.Semaphore (or equivalent) to enforce the concurrency limit at runtime.
  • Each subplan has a tracked execution state: pending, running, completed, failed.
  • Fail-fast mode: first subplan failure cancels all remaining pending/running subplans and propagates the error to the parent plan.
  • Continue-on-error mode: all subplans run to completion; all failures are collected and reported together.
  • Parent plan does not proceed past the parallel execution step until all subplans have reached a terminal state.
  • Progress reporting emits structured events (or log lines) indicating subplan state transitions in real time.
  • A benchmark/integration test spawns 10+ concurrent subplans and asserts all complete correctly with no deadlocks or resource leaks.
  • No regression in existing sequential subplan execution paths.
  • Test coverage ≥ 97% for all new and modified modules.
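
As an illustration of the range check in the first criterion, a config object might validate itself on construction. This is a sketch under assumptions: the lower bound of 1 and the default failure mode are not stated in this issue, and `FailureMode` is a hypothetical enum name. `ParallelExecutionConfig` is the dataclass name used in the Subtasks section.

```python
from dataclasses import dataclass
from enum import Enum

MAX_PARALLEL_LIMIT = 32  # upper bound stated in this issue
MIN_PARALLEL = 1         # assumption: the issue does not state a lower bound


class FailureMode(Enum):
    FAIL_FAST = "fail_fast"
    CONTINUE_ON_ERROR = "continue_on_error"


@dataclass(frozen=True)
class ParallelExecutionConfig:
    max_parallel: int = 4
    # Assumption: the issue does not say which mode is the default.
    failure_mode: FailureMode = FailureMode.FAIL_FAST

    def __post_init__(self):
        value = self.max_parallel
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"max_parallel must be an integer, got {value!r}")
        if not MIN_PARALLEL <= value <= MAX_PARALLEL_LIMIT:
            raise ValueError(
                f"max_parallel must be between {MIN_PARALLEL} and "
                f"{MAX_PARALLEL_LIMIT}, got {value}"
            )
```

Validating in `__post_init__` means an out-of-range value fails when the profile is loaded rather than when the first subplan is dispatched.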

Subtasks

  • Design: Define ParallelExecutionConfig dataclass with max_parallel (int, default 4, max 32) and failure_mode (fail_fast | continue_on_error) fields.
  • Automation Profile: Add max_parallel and failure_mode fields to the automation profile schema; implement validation and precedence resolution (plan > action > global).
  • State Machine: Implement SubplanExecutionState enum (PENDING, RUNNING, COMPLETED, FAILED) and a SubplanTracker class that holds state for each subplan.
  • Concurrency Limiter: Implement AsyncSemaphoreLimiter wrapping asyncio.Semaphore; acquire before dispatch, release on terminal state.
  • Parallel Dispatcher: Implement ParallelSubplanDispatcher that accepts a list of subplans, dispatches them respecting the semaphore, and awaits all results.
  • Fail-Fast Handler: Implement cancellation logic: on first failure, cancel all pending and running tasks and propagate SubplanExecutionError to the parent.
  • Continue-on-Error Handler: Collect all SubplanResult objects (success or failure); raise AggregateSubplanError if any failed.
  • Progress Reporter: Emit structured progress events on each state transition; integrate with existing plan event bus / logging layer.
  • Parent Plan Integration: Update the parent plan's execute phase to use ParallelSubplanDispatcher when subplans are present.
  • Unit Tests: Cover SubplanTracker, AsyncSemaphoreLimiter, ParallelSubplanDispatcher, fail-fast, and continue-on-error paths.
  • Benchmark / Integration Test: Write a test that spawns ≥ 10 mock subplans concurrently, asserts all complete, verifies no deadlock, and checks max concurrency was respected.
  • Documentation: Update docs/specification.md with the max_parallel and failure_mode automation profile fields and parallel execution semantics.
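
The two failure-handling subtasks above could be sketched as follows. This is one possible shape, not the planned implementation: the `dispatch` function, its `fail_fast` flag, and the `failures` attribute are assumptions for illustration, and the two exception classes are defined locally only so that the example is self-contained.

```python
import asyncio


class SubplanExecutionError(Exception):
    """Fail-fast mode: wraps the first subplan failure."""


class AggregateSubplanError(Exception):
    """Continue-on-error mode: carries every failure, keyed by subplan name."""

    def __init__(self, failures):
        super().__init__(f"{len(failures)} subplan(s) failed")
        self.failures = failures


async def dispatch(subplans, max_parallel=4, fail_fast=True):
    """subplans: dict of name -> zero-argument coroutine function (assumed)."""
    if not subplans:
        return {}
    semaphore = asyncio.Semaphore(max_parallel)

    async def worker(factory):
        async with semaphore:
            return await factory()

    tasks = {asyncio.create_task(worker(f)): n for n, f in subplans.items()}

    if fail_fast:
        # Wake up as soon as any task raises, or when all have finished.
        done, pending = await asyncio.wait(
            set(tasks), return_when=asyncio.FIRST_EXCEPTION
        )
        failed = [t for t in done if t.exception() is not None]
        if failed:
            for task in pending:
                task.cancel()
            # Wait for the cancellations to finish before propagating.
            await asyncio.gather(*pending, return_exceptions=True)
            first = failed[0]
            raise SubplanExecutionError(
                f"subplan {tasks[first]!r} failed"
            ) from first.exception()
        return {tasks[t]: t.result() for t in done}

    # Continue-on-error: let everything finish, then report all failures.
    names = list(tasks.values())
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    failures = {n: o for n, o in zip(names, outcomes) if isinstance(o, BaseException)}
    if failures:
        raise AggregateSubplanError(failures)
    return dict(zip(names, outcomes))
```

In fail-fast mode the cancelled tasks are awaited before the error propagates, so the parent does not resume while subplans are still winding down.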

Definition of Done

The issue is closed when:

  1. All acceptance criteria checkboxes are checked.
  2. ParallelSubplanDispatcher is merged to main from the feat/plan-parallel-subplan-execution-scaling branch.
  3. The benchmark test passes: 10+ concurrent subplans complete correctly with no deadlocks or resource exhaustion.
  4. nox passes with test coverage ≥ 97% including the new parallel execution modules.
  5. The automation profile schema is updated and validated in CI.
  6. No regressions in existing sequential subplan execution tests.
  7. PR is reviewed, approved, and merged.
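
The benchmark in item 3 might look something like the sketch below. It uses mock subplans and a bare `asyncio.Semaphore` rather than the real dispatcher; the `benchmark` function and its parameters are hypothetical, and the timeout is an assumed way to turn a deadlock into a test failure instead of a hang.

```python
import asyncio


async def benchmark(num_subplans=12, max_parallel=4, timeout=5.0):
    """Run mock subplans under a semaphore; return (completed, peak concurrency)."""
    semaphore = asyncio.Semaphore(max_parallel)
    running = 0
    peak = 0
    completed = 0

    async def mock_subplan():
        nonlocal running, peak, completed
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for real subplan work
            running -= 1
            completed += 1

    # If the subplans deadlock, wait_for raises a timeout error
    # instead of letting the test hang indefinitely.
    await asyncio.wait_for(
        asyncio.gather(*(mock_subplan() for _ in range(num_subplans))),
        timeout=timeout,
    )
    return completed, peak
```

A test can then assert that `completed` equals the number of subplans spawned and that `peak` never exceeds `max_parallel`.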

Automated by CleverAgents Bot
Agent: new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-14 15:11:51 +00:00
Author
Owner

Triage: Verified [AUTO-OWNR-1]

Valid feature: Parallel subplan execution scaling to 10+ concurrent subplans is explicitly listed in the v3.5.0 milestone acceptance criteria: "Parallel execution scales to 10+ concurrent subplans." This issue provides a comprehensive specification for the ParallelSubplanDispatcher, AsyncSemaphoreLimiter, SubplanTracker, fail-fast and continue-on-error modes, and a benchmark test.

Assigning to v3.5.0 (Autonomy Hardening) as this is explicitly required by the milestone. Priority High — core M6 deliverable.

MoSCoW: Must Have — parallel subplan execution is explicitly required by the v3.5.0 milestone acceptance criteria.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#9335