[BUG] max_parallel enforced by SubplanExecutionService instead of PlanLifecycleService; excess subplans rejected instead of queued #9159

Open
opened 2026-04-14 08:51:59 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit Message: fix(subplan): move max_parallel enforcement to PlanLifecycleService with queuing
  • Branch: fix/subplan-max-parallel-lifecycle-enforcement

Background and Context

The specification (docs/specification.md §Subplan Architecture, line 46882) states:

Parallel limit enforcement: max_parallel is enforced by the PlanLifecycleService; excess subplans queue until a slot opens.

The current implementation enforces max_parallel in two incorrect locations:

  1. SubplanService.validate_spawn() (src/cleveragents/application/services/subplan_service.py, validate_spawn method, error check #3): When execution_mode == PARALLEL and len(spawn_entries) > config.max_parallel, the spawn is rejected with a SpawnValidationError. This means the parent plan cannot spawn more subplans than max_parallel at all — there is no queuing.

  2. SubplanExecutionService._execute_parallel() (src/cleveragents/application/services/subplan_execution_service.py, _execute_parallel method): Uses ThreadPoolExecutor(max_workers=min(self._config.max_parallel, len(statuses))) to limit concurrency. This caps the number of worker threads inside the execution service; it is not the lifecycle-level queue that the spec assigns to PlanLifecycleService.

Neither location is PlanLifecycleService, and neither implements the queuing behavior the spec requires ("excess subplans queue until a slot opens").

Current Behavior

  • SubplanService.validate_spawn() raises SpawnValidationError when len(spawn_entries) > max_parallel in PARALLEL mode, so the whole spawn request is rejected rather than the excess being queued.
  • SubplanExecutionService._execute_parallel() caps concurrency via ThreadPoolExecutor but does not queue excess subplans.
  • PlanLifecycleService has no max_parallel enforcement logic at all.
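
The rejection described above can be reproduced with a minimal sketch. Everything here is a stand-in written for illustration: the ExecutionMode enum (including its SEQUENTIAL member), the SubplanConfig dataclass, and the function signature are assumptions, not the actual code in subplan_service.py.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class ExecutionMode(Enum):
    # Hypothetical enum; only PARALLEL is named in this issue.
    SEQUENTIAL = "sequential"
    PARALLEL = "parallel"


class SpawnValidationError(Exception):
    """Stand-in for the real exception of the same name."""


@dataclass
class SubplanConfig:
    # Hypothetical config holder; the real class may carry more fields.
    max_parallel: int


def validate_spawn(
    spawn_entries: Sequence[str],
    execution_mode: ExecutionMode,
    config: SubplanConfig,
) -> bool:
    # Sketch of the reported check: in PARALLEL mode, a request with more
    # entries than max_parallel fails as a whole instead of being queued.
    if (
        execution_mode is ExecutionMode.PARALLEL
        and len(spawn_entries) > config.max_parallel
    ):
        raise SpawnValidationError(
            f"{len(spawn_entries)} subplans requested, "
            f"max_parallel is {config.max_parallel}"
        )
    return True
```

With max_parallel set to 2, a three-entry PARALLEL spawn raises SpawnValidationError in this sketch, which is the behavior the issue asks to remove.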

Expected Behavior

Per spec:

  • PlanLifecycleService should enforce max_parallel by queuing excess subplans until a slot opens.
  • Spawning more subplans than max_parallel should be allowed; the excess should wait in a queue.
  • SubplanService.validate_spawn() should not reject spawns that exceed max_parallel.
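
One possible shape for the expected queuing, offered only as a sketch: the class name SubplanSlotQueue and its submit/release methods are hypothetical and do not exist in the codebase. The idea is that PlanLifecycleService could own something like this, admitting up to max_parallel subplans and holding the rest in FIFO order until a slot opens.

```python
import threading
from collections import deque
from typing import Optional


class SubplanSlotQueue:
    """Hypothetical helper: admit at most max_parallel subplans, queue the rest."""

    def __init__(self, max_parallel: int) -> None:
        if max_parallel < 1:
            raise ValueError("max_parallel must be >= 1")
        self._max_parallel = max_parallel
        self._running = set()    # ids of subplans currently holding a slot
        self._waiting = deque()  # ids of queued subplans, oldest first
        self._lock = threading.Lock()

    def submit(self, subplan_id: str) -> bool:
        """Return True if the subplan may start now, False if it was queued."""
        with self._lock:
            if len(self._running) < self._max_parallel:
                self._running.add(subplan_id)
                return True
            self._waiting.append(subplan_id)
            return False

    def release(self, subplan_id: str) -> Optional[str]:
        """Free the slot held by subplan_id.

        Returns the id of the queued subplan promoted into the freed slot,
        or None when nothing is waiting.
        """
        with self._lock:
            self._running.discard(subplan_id)
            if self._waiting and len(self._running) < self._max_parallel:
                promoted = self._waiting.popleft()
                self._running.add(promoted)
                return promoted
            return None
```

In this sketch the caller starts a subplan when submit() returns True, and starts the id returned by release() whenever a running subplan finishes. Whether the real implementation should be FIFO, or should live inside PlanLifecycleService directly, is a design decision this issue leaves open.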

Acceptance Criteria

  • PlanLifecycleService enforces max_parallel with a queuing mechanism for excess subplans
  • SubplanService.validate_spawn() no longer rejects spawns that exceed max_parallel
  • Excess subplans are queued and executed as slots become available
  • Tests (Behave): Add scenarios verifying queuing behavior when spawn count exceeds max_parallel
  • Tests (Behave): Verify PlanLifecycleService enforces the limit
  • nox -s unit_tests passes
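
The queuing criteria could be checked along the following lines. This is a plain-Python sketch, not a Behave scenario, and run_with_slot_limit is a hypothetical harness that uses a semaphore as a stand-in for whatever mechanism PlanLifecycleService ends up using. It measures two things a real scenario would likely assert: every spawned subplan eventually runs, and the number running at once never exceeds max_parallel.

```python
import threading
import time
from typing import Tuple


def run_with_slot_limit(task_count: int, max_parallel: int) -> Tuple[int, int]:
    """Run task_count dummy subplans; return (peak concurrency, completed count)."""
    slots = threading.Semaphore(max_parallel)  # stand-in for the slot queue
    lock = threading.Lock()
    state = {"active": 0, "peak": 0, "done": 0}

    def worker() -> None:
        with slots:  # excess workers block here until a slot opens
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # simulate subplan work
            with lock:
                state["active"] -= 1
                state["done"] += 1

    threads = [threading.Thread(target=worker) for _ in range(task_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"], state["done"]


if __name__ == "__main__":
    peak, done = run_with_slot_limit(task_count=6, max_parallel=2)
    # All six subplans complete, and no more than two ever ran at once.
    assert done == 6
    assert peak <= 2
```

A Behave version would presumably express the same two assertions as Then steps against the real service.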

Supporting Information

  • Spec reference: docs/specification.md line 46882 (§Key Architectural Constraints for v3.3.0)
  • Spec reference: docs/specification.md line 18509 (§Child Plan Execution Modes — Parallel)
  • Current enforcement in SubplanService.validate_spawn(): src/cleveragents/application/services/subplan_service.py (validate_spawn method, check #3)
  • Current enforcement in SubplanExecutionService._execute_parallel(): src/cleveragents/application/services/subplan_execution_service.py (_execute_parallel method)
  • PlanLifecycleService has no max_parallel enforcement: src/cleveragents/application/services/plan_lifecycle_service.py

Subtasks

  • Remove max_parallel validation from SubplanService.validate_spawn() (or change it to a warning)
  • Add queuing mechanism to PlanLifecycleService for subplan slot management
  • Wire PlanLifecycleService queuing into SubplanExecutionService or PlanExecutor
  • Tests (Behave): Add scenarios for queuing behavior
  • Tests (Behave): Verify PlanLifecycleService enforces the limit
  • Run nox -s unit_tests, fix any failures
  • Verify coverage ≥97% via nox -s coverage_report

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.

Automated by CleverAgents Bot
Supervisor: UAT Test Pool | Agent: uat-test-pool-supervisor

HAL9000 added this to the v3.3.0 milestone 2026-04-14 09:05:05 +00:00
Author
Owner

Triage: Verified [AUTO-OWNR-1]

Valid bug: max_parallel is enforced by SubplanExecutionService instead of PlanLifecycleService, causing excess subplans to be rejected instead of queued. This violates the spec's queuing behavior for parallel subplan execution.

Assigning to v3.3.0 (Corrections + Subplans + Checkpoints). Priority Medium — subplans are rejected rather than queued, which is incorrect behavior.

MoSCoW: Should Have — correct queuing behavior is important for parallel subplan execution.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#9159