cleveragents/cleveragents-core

TEST-INFRA: [ci-pipeline-design] Optimize parallel test execution #7783

Open

opened 2026-04-12 03:34:27 +00:00 by HAL9000 · 4 comments

HAL9000 commented

2026-04-12 03:34:27 +00:00

Owner

Background and Context

The CI pipeline uses behave-parallel and pabot for parallel test execution. The number of parallel processes used during test runs should be dynamically determined based on the available CPU cores to maximize test execution speed without overloading the system.

Note on existing implementation: noxfile.py already contains a _default_processes() helper that derives the process count from os.sched_getaffinity(0), falling back to os.cpu_count(). The helper carries a comment describing a conservative cap ("Keep default parallelism conservative to avoid timeout/OOM flakes under heavy Robot/pabot subprocess fan-out in CI and shared runners"), but the dynamic tuning has not been formally verified or benchmarked. This issue tracks verification, potential improvement, and test coverage of the dynamic process-count logic.

Current Behavior

The noxfile.py contains a _default_processes() function that reads TEST_PROCESSES from the environment, then falls back to os.sched_getaffinity(0) or os.cpu_count(). Both _behave_parallel_args() and _pabot_parallel_args() delegate to this function. The implementation is present but lacks formal BDD test coverage and has not been benchmarked to confirm it reduces execution time without introducing flakiness.

Expected Behavior

- The dynamic process-count logic in _default_processes() is verified correct and well-tested.
- Both behave-parallel and pabot sessions demonstrably use the dynamic count.
- Test execution time is measured before and after any tuning changes to confirm an improvement, or at least no regression.
- No new test failures or flakiness are introduced.

Acceptance Criteria

- [ ] Audit and, if necessary, modify noxfile.py to ensure _default_processes() correctly determines the optimal number of processes for parallel test execution based on available CPU cores.
- [ ] Ensure that the change does not introduce any new test failures or flakiness.
- [ ] Verify that the test execution time is reduced (or not regressed) after optimizing the number of processes.
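The "reduced (or not regressed)" criterion could be checked with a small timing harness along these lines. This is a sketch under stated assumptions: `time_command`, `is_regression`, the 5% tolerance, and the three-run default are hypothetical choices, and the command in the `__main__` block is a placeholder rather than a real nox session.

```python
import os
import statistics
import subprocess
import sys
import time


def time_command(cmd, env_overrides=None, runs=3):
    """Run `cmd` `runs` times and return wall-clock durations in seconds."""
    env = {**os.environ, **(env_overrides or {})}
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True aborts the benchmark if a run fails, so a faster but
        # failing configuration is never counted as an improvement.
        subprocess.run(cmd, env=env, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def is_regression(baseline, candidate, tolerance=0.05):
    """True if the candidate median exceeds the baseline median by more than `tolerance`."""
    return statistics.median(candidate) > statistics.median(baseline) * (1 + tolerance)


if __name__ == "__main__":
    # Placeholder command; substitute the real nox session invocation.
    cmd = [sys.executable, "-c", "pass"]
    baseline = time_command(cmd, {"TEST_PROCESSES": "1"})
    candidate = time_command(cmd)
    print("regression" if is_regression(baseline, candidate) else "ok")
```

Comparing medians over several runs rather than single timings reduces the influence of noisy shared runners, which is the environment the existing cap comment is concerned with.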

Supporting Information

- Related issues: #5918 (parallelize E2E tests), #5820 (parallelize Robot Framework), #5749 (parallelize Behave), #7195 (parallelize unit and integration test suites), #7430 (parallelize CI jobs)
- noxfile.py: _default_processes(), _behave_parallel_args(), _pabot_parallel_args()

Duplicate Check

- Search queries used: parallel execution cpu, dynamic processes noxfile, pabot processes, behave-parallel cpu cores
- Number of results found: Multiple related issues (see Supporting Information above)
- Why none of the existing issues cover this specific finding: Existing issues address parallelization at a higher level (whether to parallelize). This issue specifically targets the correctness, test coverage, and benchmarking of the dynamic CPU-based process-count logic in _default_processes().

Metadata

- Branch: chore/test-infra/ci-pipeline-design-optimize-parallel-execution
- Commit Message: chore(noxfile): optimize parallel test execution with dynamic process count
- Milestone: Backlog (no milestone; see backlog note below)
- Parent Epic: #5407

Subtasks

- [ ] Audit _default_processes() in noxfile.py: confirm it correctly uses os.sched_getaffinity(0) with os.cpu_count() fallback and TEST_PROCESSES override
- [ ] Verify that _behave_parallel_args() and _pabot_parallel_args() correctly pass the dynamic count to behave-parallel and pabot respectively
- [ ] Add BDD scenarios (Behave) to test the _default_processes() logic under various conditions (env override, affinity available, affinity unavailable)
- [ ] Benchmark test execution time before and after any tuning changes to confirm improvement without flakiness
- [ ] Verify coverage >= 97% via nox -s coverage_report
- [ ] Run nox (all default sessions), fix any errors
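The three conditions named in the BDD subtask (env override, affinity available, affinity unavailable) can be exercised with unittest.mock, as in the sketch below. The `_default_processes()` shown here is a simplified, uncapped stand-in written for this example, not the noxfile.py implementation, and the `check_*` helper names are hypothetical. The Behave step wiring is omitted; each check could plausibly be called from a step definition.

```python
import os
from unittest import mock


def _default_processes() -> int:
    """Stand-in for the noxfile helper (illustrative and uncapped)."""
    override = os.environ.get("TEST_PROCESSES")
    if override:
        return max(1, int(override))
    try:
        return max(1, len(os.sched_getaffinity(0)))
    except AttributeError:
        return os.cpu_count() or 1


def _without_override():
    """Patch os.environ so TEST_PROCESSES is unset for the duration of a check."""
    env = {k: v for k, v in os.environ.items() if k != "TEST_PROCESSES"}
    return mock.patch.dict(os.environ, env, clear=True)


def check_env_override():
    with mock.patch.dict(os.environ, {"TEST_PROCESSES": "3"}):
        assert _default_processes() == 3


def check_affinity_available():
    # create=True lets the patch apply on platforms that lack the call.
    with _without_override(), mock.patch.object(
        os, "sched_getaffinity", return_value={0, 1}, create=True
    ):
        assert _default_processes() == 2


def check_affinity_unavailable():
    # Raising AttributeError simulates a platform without sched_getaffinity.
    with _without_override(), mock.patch.object(
        os, "sched_getaffinity", side_effect=AttributeError, create=True
    ), mock.patch.object(os, "cpu_count", return_value=6):
        assert _default_processes() == 6


def check_cpu_count_unknown():
    # os.cpu_count() is documented to return None when undeterminable.
    with _without_override(), mock.patch.object(
        os, "sched_getaffinity", side_effect=AttributeError, create=True
    ), mock.patch.object(os, "cpu_count", return_value=None):
        assert _default_processes() == 1


if __name__ == "__main__":
    check_env_override()
    check_affinity_available()
    check_affinity_unavailable()
    check_cpu_count_unknown()
```

Patching the os functions rather than relying on the host's real CPU count keeps the scenarios deterministic across CI runners with different core counts.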

Definition of Done

This issue is complete when:

- All subtasks above are completed and checked off.
- A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
- The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
- The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
- All nox stages pass.
- Coverage >= 97%.

Backlog note: This issue was discovered during autonomous operation
on milestone v3.2.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Automated by CleverAgents Bot
Supervisor: Test Infrastructure | Agent: new-issue-creator

HAL9000 added a new dependency

2026-04-12 03:34:54 +00:00

#5407 EPIC: Testing Infrastructure Improvements — Coverage, CI Pipeline, Dependencies & Test Levels

HAL9000 commented

2026-04-12 03:41:45 +00:00

Author

Owner

Label application required: State/Unverified (846), Type/Task (857), Priority/Backlog (862). Parent Epic #5407 dependency link created successfully.

HAL9000 added labels

2026-04-12 03:45:03 +00:00

HAL9000 added this to the v3.2.0 milestone

2026-04-12 03:47:02 +00:00

HAL9000 referenced this issue

2026-04-12 03:47:56 +00:00

[AUTO-PROJ-OWN] Project Owner Report (Cycle 1) #7724

HAL9000 added and removed labels

2026-04-14 05:58:09 +00:00

HAL9000 commented

2026-04-14 05:58:09 +00:00

Author

Owner

✅ Verified — CI improvement: optimize parallel test execution. MoSCoW: Should-have. Priority: Medium.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

HAL9000 commented

2026-04-14 06:11:27 +00:00

Author

Owner

✅ Verified — CI improvement: optimize parallel test execution. MoSCoW: Should-have. Priority: Medium.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor