BUG-HUNT: concurrency — ValidationPipeline.run() iterates futures in insertion order, not completion order #7717

Open
opened 2026-04-12 03:20:02 +00:00 by HAL9000 · 4 comments
Owner

Bug Report: concurrency — ValidationPipeline.run() iterates futures in insertion order, not completion order

Severity Assessment

  • Impact: The ThreadPoolExecutor concurrency is completely defeated. The pipeline blocks on each future in the order they were submitted rather than processing them as they complete. This causes head-of-line blocking: a slow validation at submission position 0 blocks all results even if all other validations have already finished. Under high load this can cause cascading timeouts.
  • Likelihood: Always triggered when using max_workers > 1 (the default is 4).
  • Priority: High

Location

  • File: src/cleveragents/application/services/validation_pipeline.py
  • Function/Class: ValidationPipeline.run
  • Lines: 506–513

Description

The pipeline submits all validation commands concurrently to a ThreadPoolExecutor, then collects results with:

for future in futures_map:
    result = future.result()
    results.append(result)

Iterating a dict in Python yields its keys in insertion order — not completion order. Calling .result() on each future in insertion order means the loop blocks on each future sequentially even if later futures have already completed. The correct approach is as_completed(futures_map) which yields futures as they finish.

Evidence

with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
    futures_map = {
        pool.submit(self._execute_single, cmd): cmd
        for cmd in self._commands
    }
    for future in futures_map:          # <-- iterates dict keys in insertion order
        result = future.result()         # <-- blocks until THIS specific future finishes
        results.append(result)

Contrast with the correct pattern used elsewhere in the codebase (e.g., SubplanExecutionService._execute_parallel):

for future in as_completed(future_to_id):   # yields in completion order
    result_status, output = future.result()

Expected Behavior

Futures should be consumed using as_completed() so that finished validations are collected immediately, and the overall pipeline finishes in max(individual_durations) rather than sum(individual_durations) in the worst case.

Actual Behavior

Futures are consumed in submission order. If validation #1 takes 30 seconds and validations #2-#10 take 0.1 seconds each, the pipeline takes ~30 seconds total instead of ~0.1 seconds because .result() on future #1 blocks despite all other futures being done.

Suggested Fix

from concurrent.futures import as_completed

with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
    futures_map = {
        pool.submit(self._execute_single, cmd): cmd
        for cmd in self._commands
    }
    for future in as_completed(futures_map):   # fix: use as_completed
        result = future.result()
        results.append(result)

Category

concurrency

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

## Bug Report: concurrency — `ValidationPipeline.run()` iterates futures in insertion order, not completion order ### Severity Assessment - **Impact**: The ThreadPoolExecutor concurrency is completely defeated. The pipeline blocks on each future in the order they were submitted rather than processing them as they complete. This causes head-of-line blocking: a slow validation at submission position 0 blocks all results even if all other validations have already finished. Under high load this can cause cascading timeouts. - **Likelihood**: Always triggered when using `max_workers > 1` (the default is 4). - **Priority**: High ### Location - **File**: `src/cleveragents/application/services/validation_pipeline.py` - **Function/Class**: `ValidationPipeline.run` - **Lines**: 506–513 ### Description The pipeline submits all validation commands concurrently to a `ThreadPoolExecutor`, then collects results with: ```python for future in futures_map: result = future.result() results.append(result) ``` Iterating a dict in Python yields its **keys in insertion order** — not completion order. Calling `.result()` on each future in insertion order means the loop blocks on each future sequentially even if later futures have already completed. The correct approach is `as_completed(futures_map)` which yields futures as they finish. ### Evidence ```python with ThreadPoolExecutor(max_workers=self._max_workers) as pool: futures_map = { pool.submit(self._execute_single, cmd): cmd for cmd in self._commands } for future in futures_map: # <-- iterates dict keys in insertion order result = future.result() # <-- blocks until THIS specific future finishes results.append(result) ``` Contrast with the correct pattern used elsewhere in the codebase (e.g., `SubplanExecutionService._execute_parallel`): ```python for future in as_completed(future_to_id): # yields in completion order result_status, output = future.result() ``` ### Expected Behavior Futures should be consumed using `as_completed()` so that finished validations are collected immediately, and the overall pipeline finishes in `max(individual_durations)` rather than `sum(individual_durations)` in the worst case. ### Actual Behavior Futures are consumed in submission order. If validation #1 takes 30 seconds and validations #2-#10 take 0.1 seconds each, the pipeline takes ~30 seconds total instead of ~0.1 seconds because `.result()` on future #1 blocks despite all other futures being done. ### Suggested Fix ```python from concurrent.futures import as_completed with ThreadPoolExecutor(max_workers=self._max_workers) as pool: futures_map = { pool.submit(self._execute_single, cmd): cmd for cmd in self._commands } for future in as_completed(futures_map): # fix: use as_completed result = future.result() results.append(result) ``` ### Category concurrency ### TDD Note After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: bug-hunter
HAL9000 added this to the v3.2.0 milestone 2026-04-12 03:24:55 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: High — concurrency defect that defeats ThreadPoolExecutor parallelism; always triggered with max_workers > 1 (default=4)
  • Milestone: v3.2.0 — ValidationPipeline is part of the validation runner required for M3 (invariant enforcement during strategize)
  • Story Points: 2 — S — one-line fix: replace for future in futures_map with for future in as_completed(futures_map)
  • MoSCoW: Must Have — validation pipeline performance is critical for M3 acceptance criteria; head-of-line blocking causes cascading timeouts
  • Parent Epic: Validation pipeline epic (v3.2.0 scope)

The fix is trivial (import as_completed and change the iteration pattern) but the impact is significant — worst-case validation time goes from O(sum) to O(max).


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner

Issue triaged by project owner: - **State**: Verified - **Priority**: High — concurrency defect that defeats ThreadPoolExecutor parallelism; always triggered with max_workers > 1 (default=4) - **Milestone**: v3.2.0 — ValidationPipeline is part of the validation runner required for M3 (invariant enforcement during strategize) - **Story Points**: 2 — S — one-line fix: replace `for future in futures_map` with `for future in as_completed(futures_map)` - **MoSCoW**: Must Have — validation pipeline performance is critical for M3 acceptance criteria; head-of-line blocking causes cascading timeouts - **Parent Epic**: Validation pipeline epic (v3.2.0 scope) The fix is trivial (import `as_completed` and change the iteration pattern) but the impact is significant — worst-case validation time goes from O(sum) to O(max). --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner
Author
Owner

Verified — Concurrency bug: ValidationPipeline iterates futures in insertion order instead of completion order. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: ValidationPipeline iterates futures in insertion order instead of completion order. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Concurrency bug: ValidationPipeline iterates futures in insertion order instead of completion order. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: ValidationPipeline iterates futures in insertion order instead of completion order. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Concurrency bug: ValidationPipeline iterates futures in insertion order instead of completion order. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: ValidationPipeline iterates futures in insertion order instead of completion order. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#7717
No description provided.