UAT: SubplanService.validate_spawn incorrectly rejects spawning more subplans than max_parallel — should only limit concurrency #3171

Open
opened 2026-04-05 07:14:29 +00:00 by freemo · 2 comments
Owner

Metadata

  • Branch: fix/m4-subplan-validate-spawn-max-parallel
  • Commit Message: fix(subplans): remove incorrect max_parallel total-count guard from validate_spawn
  • Milestone: v3.3.0
  • Parent Epic: #368

Background and Context

During UAT testing of the subplans-checkpoints feature area (v3.3.0 M4), a critical spec deviation was discovered in SubplanService.validate_spawn. The max_parallel configuration field is intended to bound concurrency (the number of simultaneously running workers), not the total number of subplans that may be spawned. The validation method incorrectly enforces the latter, blocking the core use case of spawning many subplans with bounded concurrency.

Current Behavior (Bug)

SubplanService.validate_spawn rejects a spawn request if the number of spawn entries exceeds max_parallel:

# 3. max_parallel bounds validation
if config.execution_mode == ExecutionMode.PARALLEL:
    entry_count: int = len(spawn_entries)
    if entry_count > config.max_parallel:
        errors.append(
            f"Number of spawn entries ({entry_count}) exceeds "
            f"max_parallel bound ({config.max_parallel})"
        )

Steps to reproduce:

  1. Create a SubplanConfig with execution_mode=PARALLEL, max_parallel=2
  2. Create 5 spawn entries
  3. Call SubplanService.validate_spawn(config, entries)
  4. Observe: validation fails with "Number of spawn entries (5) exceeds max_parallel bound (2)"
  5. Expected: validation should pass — execution would use 2 concurrent workers for 5 subplans

Expected Behavior (per Spec)

Per docs/specification.md (section on SubplanConfig.max_parallel):

"Parallel execution is bounded by SubplanConfig.max_parallel (default: 5, range: 1–50). This cap prevents runaway resource consumption when a large number of child plans are spawned simultaneously. The runtime uses a ThreadPoolExecutor with min(max_parallel, len(subplans)) workers."

max_parallel limits the number of concurrent workers, NOT the total number of subplans. You should be able to spawn 10 subplans with max_parallel=5 — they run 5 at a time.

The execution service already implements this correctly in SubplanExecutionService._execute_parallel:

max_workers = min(self._config.max_parallel, len(statuses))

Impact

Any attempt to spawn more than max_parallel subplans in parallel mode is rejected at validation time, even though the execution engine handles it correctly. This blocks the core use case of spawning many subplans with bounded concurrency.

Proposed Fix

Remove the erroneous entry_count > config.max_parallel check from SubplanService.validate_spawn. The check should either be removed entirely (the execution engine handles concurrency correctly) or replaced with a check that max_parallel >= 1 (already enforced by the model-level validator).

Code location: cleveragents.application.services.subplan_service.SubplanService.validate_spawn (the max_parallel bounds check block)

Subtasks

  • Remove (or correct) the entry_count > config.max_parallel guard in SubplanService.validate_spawn
  • Add a TDD @tdd_expected_fail BDD scenario reproducing the bug before the fix
  • Implement the fix and verify the scenario passes
  • Add BDD scenarios covering: spawn count > max_parallel succeeds, spawn count == max_parallel succeeds, spawn count < max_parallel succeeds
  • Add integration test verifying end-to-end parallel execution with more subplans than max_parallel
  • Verify no regression in existing validate_spawn tests
  • Run nox (all default sessions), fix any errors
  • Verify coverage >= 97% via nox -s coverage_report

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • The erroneous max_parallel total-count guard is removed from SubplanService.validate_spawn.
  • Spawning more subplans than max_parallel in parallel mode is accepted by validation and executes with bounded concurrency.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage >= 97%.

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

## Metadata - **Branch**: `fix/m4-subplan-validate-spawn-max-parallel` - **Commit Message**: `fix(subplans): remove incorrect max_parallel total-count guard from validate_spawn` - **Milestone**: v3.3.0 - **Parent Epic**: #368 ## Background and Context During UAT testing of the subplans-checkpoints feature area (v3.3.0 M4), a critical spec deviation was discovered in `SubplanService.validate_spawn`. The `max_parallel` configuration field is intended to bound **concurrency** (the number of simultaneously running workers), not the **total number** of subplans that may be spawned. The validation method incorrectly enforces the latter, blocking the core use case of spawning many subplans with bounded concurrency. ## Current Behavior (Bug) `SubplanService.validate_spawn` rejects a spawn request if the number of spawn entries exceeds `max_parallel`: ```python # 3. max_parallel bounds validation if config.execution_mode == ExecutionMode.PARALLEL: entry_count: int = len(spawn_entries) if entry_count > config.max_parallel: errors.append( f"Number of spawn entries ({entry_count}) exceeds " f"max_parallel bound ({config.max_parallel})" ) ``` **Steps to reproduce:** 1. Create a `SubplanConfig` with `execution_mode=PARALLEL, max_parallel=2` 2. Create 5 spawn entries 3. Call `SubplanService.validate_spawn(config, entries)` 4. Observe: validation fails with `"Number of spawn entries (5) exceeds max_parallel bound (2)"` 5. Expected: validation should pass — execution would use 2 concurrent workers for 5 subplans ## Expected Behavior (per Spec) Per `docs/specification.md` (section on `SubplanConfig.max_parallel`): > "Parallel execution is bounded by `SubplanConfig.max_parallel` (default: `5`, range: 1–50). This cap prevents runaway resource consumption when a large number of child plans are spawned simultaneously. The runtime uses a `ThreadPoolExecutor` with `min(max_parallel, len(subplans))` workers." `max_parallel` limits the number of **concurrent workers**, NOT the total number of subplans. You should be able to spawn 10 subplans with `max_parallel=5` — they run 5 at a time. The execution service already implements this correctly in `SubplanExecutionService._execute_parallel`: ```python max_workers = min(self._config.max_parallel, len(statuses)) ``` ## Impact Any attempt to spawn more than `max_parallel` subplans in parallel mode is rejected at validation time, even though the execution engine handles it correctly. This blocks the core use case of spawning many subplans with bounded concurrency. ## Proposed Fix Remove the erroneous `entry_count > config.max_parallel` check from `SubplanService.validate_spawn`. The check should either be removed entirely (the execution engine handles concurrency correctly) or replaced with a check that `max_parallel >= 1` (already enforced by the model-level validator). **Code location**: `cleveragents.application.services.subplan_service.SubplanService.validate_spawn` (the `max_parallel` bounds check block) ## Subtasks - [ ] Remove (or correct) the `entry_count > config.max_parallel` guard in `SubplanService.validate_spawn` - [ ] Add a TDD `@tdd_expected_fail` BDD scenario reproducing the bug before the fix - [ ] Implement the fix and verify the scenario passes - [ ] Add BDD scenarios covering: spawn count > max_parallel succeeds, spawn count == max_parallel succeeds, spawn count < max_parallel succeeds - [ ] Add integration test verifying end-to-end parallel execution with more subplans than `max_parallel` - [ ] Verify no regression in existing `validate_spawn` tests - [ ] Run `nox` (all default sessions), fix any errors - [ ] Verify coverage >= 97% via `nox -s coverage_report` ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - The erroneous `max_parallel` total-count guard is removed from `SubplanService.validate_spawn`. - Spawning more subplans than `max_parallel` in parallel mode is accepted by validation and executes with bounded concurrency. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done. - All nox stages pass. - Coverage >= 97%. --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-uat-tester
freemo added this to the v3.3.0 milestone 2026-04-05 07:15:21 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Critical (keeping existing — blocks v3.3.0 acceptance criteria for parallel subplan execution)
  • Milestone: v3.3.0 (already assigned, keeping)
  • MoSCoW: Must Have — v3.3.0 acceptance criteria requires "Parallel subplan execution works with max_parallel." The current validation incorrectly rejects spawning more subplans than max_parallel, which is the core use case for bounded concurrency.
  • Parent Epic: #368 (Subplans & Parallelism)

This is a correctness bug: max_parallel should limit concurrency (workers), not total subplan count. The execution engine already handles this correctly — only the validation is wrong.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **Priority**: Critical (keeping existing — blocks v3.3.0 acceptance criteria for parallel subplan execution) - **Milestone**: v3.3.0 (already assigned, keeping) - **MoSCoW**: Must Have — v3.3.0 acceptance criteria requires "Parallel subplan execution works with max_parallel." The current validation incorrectly rejects spawning more subplans than max_parallel, which is the core use case for bounded concurrency. - **Parent Epic**: #368 (Subplans & Parallelism) This is a correctness bug: `max_parallel` should limit concurrency (workers), not total subplan count. The execution engine already handles this correctly — only the validation is wrong. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Critical — this directly blocks v3.3.0 acceptance criteria. SubplanService.validate_spawn incorrectly rejects spawning subplans when max_parallel is reached, which breaks parallel subplan execution.
  • Milestone: v3.3.0 (already assigned — correct, as subplan spawning is a core M4 feature)
  • MoSCoW: Must Have — the spec requires parallel subplan execution with configurable concurrency limits. If validate_spawn incorrectly rejects spawning, the entire subplan system is broken. This is essential for milestone completion.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **Priority**: Critical — this directly blocks v3.3.0 acceptance criteria. SubplanService.validate_spawn incorrectly rejects spawning subplans when max_parallel is reached, which breaks parallel subplan execution. - **Milestone**: v3.3.0 (already assigned — correct, as subplan spawning is a core M4 feature) - **MoSCoW**: Must Have — the spec requires parallel subplan execution with configurable concurrency limits. If validate_spawn incorrectly rejects spawning, the entire subplan system is broken. This is essential for milestone completion. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#368 Epic: Subplans & Parallelism
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#3171
No description provided.