TEST-INFRA: [ci-pipeline-design] Optimize parallel test execution #7783

Open
opened 2026-04-12 03:34:27 +00:00 by HAL9000 · 4 comments
Owner

Background and Context

The CI pipeline uses behave-parallel and pabot for parallel test execution. The number of parallel processes used during test runs should be dynamically determined based on the available CPU cores to maximize test execution speed without overloading the system.

Note on existing implementation: A review of noxfile.py shows that a _default_processes() helper already exists and uses os.sched_getaffinity(0) (falling back to os.cpu_count()) to determine the process count. However, the implementation carries a comment justifying a conservative cap ("Keep default parallelism conservative to avoid timeout/OOM flakes under heavy Robot/pabot subprocess fan-out in CI and shared runners"), and the correctness of the dynamic tuning has not been formally verified or benchmarked. This issue tracks verification, potential improvement, and test coverage of the dynamic process-count logic.

Current Behavior

The noxfile.py contains a _default_processes() function that reads TEST_PROCESSES from the environment, then falls back to os.sched_getaffinity(0) or os.cpu_count(). Both _behave_parallel_args() and _pabot_parallel_args() delegate to this function. The implementation is present but lacks formal BDD test coverage and has not been benchmarked to confirm it reduces execution time without introducing flakiness.
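
For reference while auditing, the resolution order described above might look roughly like the following. This is a minimal sketch, not the code in noxfile.py: the function name default_processes, the env parameter, and the MAX_DEFAULT_PROCESSES value of 4 are assumptions made for illustration, since the issue does not quote the real helper or its cap.

```python
import os

# Hypothetical cap; the real value (if any) in noxfile.py is not shown in this issue.
MAX_DEFAULT_PROCESSES = 4


def default_processes(env=None):
    """Resolve a worker count: TEST_PROCESSES override, then CPU affinity, then cpu_count."""
    env = os.environ if env is None else env

    # 1. An explicit override wins, but only if it parses as a positive integer.
    raw = env.get("TEST_PROCESSES", "").strip()
    if raw:
        try:
            value = int(raw)
        except ValueError:
            value = 0
        if value > 0:
            return value

    # 2. Prefer affinity: it reflects the CPUs this process may actually use
    #    (for example under taskset or a cpuset-limited container).
    try:
        available = len(os.sched_getaffinity(0))
    except (AttributeError, OSError):
        # 3. sched_getaffinity does not exist on every platform (e.g. macOS, Windows).
        available = os.cpu_count() or 1

    # Clamp to at least one worker and at most the hypothetical cap.
    return max(1, min(available, MAX_DEFAULT_PROCESSES))
```

In this sketch the override bypasses the cap and invalid values fall through to detection. Whether the real helper behaves the same way is one of the things the audit should confirm.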

Expected Behavior

  • The dynamic process-count logic in _default_processes() is verified correct and well-tested.
  • Both behave-parallel and pabot sessions demonstrably use the dynamic count.
  • Test execution time is measured before and after any tuning changes to confirm improvement.
  • No new test failures or flakiness are introduced.
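
One possible way to collect the before/after timings is a small harness like the one below. It is a sketch under stated assumptions: time_command and summarize are hypothetical helper names, and the command shown is a placeholder rather than a real session from this repository.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times; return (durations in seconds, count of non-zero exits)."""
    durations = []
    failures = 0
    for _ in range(repeats):
        start = time.perf_counter()
        result = subprocess.run(cmd, check=False, capture_output=True)
        durations.append(time.perf_counter() - start)
        if result.returncode != 0:
            failures += 1
    return durations, failures


def summarize(label, durations, failures):
    """Format median and spread so two configurations can be compared side by side."""
    return (
        f"{label}: median={statistics.median(durations):.2f}s "
        f"min={min(durations):.2f}s max={max(durations):.2f}s failures={failures}"
    )


if __name__ == "__main__":
    # Placeholder command; substitute the nox session being benchmarked.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize("baseline", *time_command(cmd)))
```

Repeating each configuration several times and comparing medians, along with the failure count, gives a rough signal for both speed and flakiness. A single run per configuration is unlikely to be enough on shared runners.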

Acceptance Criteria

  • Audit and, if necessary, modify noxfile.py to ensure _default_processes() correctly determines the optimal number of processes for parallel test execution based on available CPU cores.
  • Ensure that the change does not introduce any new test failures or flakiness.
  • Verify that test execution time is reduced, or at least does not regress, after tuning the number of processes.

Supporting Information

  • Related issues: #5918 (parallelize E2E tests), #5820 (parallelize Robot Framework), #5749 (parallelize Behave), #7195 (parallelize unit and integration test suites), #7430 (parallelize CI jobs)
  • noxfile.py: _default_processes(), _behave_parallel_args(), _pabot_parallel_args()

Duplicate Check

  • Search queries used: parallel execution cpu, dynamic processes noxfile, pabot processes, behave-parallel cpu cores
  • Number of results found: Multiple related issues (see Supporting Information above)
  • Why none of the existing issues cover this specific finding: Existing issues address parallelization at a higher level (whether to parallelize). This issue specifically targets the correctness, test coverage, and benchmarking of the dynamic CPU-based process-count logic in _default_processes().

Metadata

  • Branch: chore/test-infra/ci-pipeline-design-optimize-parallel-execution
  • Commit Message: chore(noxfile): optimize parallel test execution with dynamic process count
  • Milestone: Backlog (no milestone — see backlog note below)
  • Parent Epic: #5407

Subtasks

  • Audit _default_processes() in noxfile.py — confirm it correctly uses os.sched_getaffinity(0) with os.cpu_count() fallback and TEST_PROCESSES override
  • Verify that _behave_parallel_args() and _pabot_parallel_args() correctly pass the dynamic count to behave-parallel and pabot respectively
  • Add BDD scenarios (Behave) to test the _default_processes() logic under various conditions (env override, affinity available, affinity unavailable)
  • Benchmark test execution time before and after any tuning changes to confirm improvement without flakiness
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
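
The three conditions named in the BDD subtask (env override, affinity available, affinity unavailable) can each be simulated with unittest.mock, and the same patches could back Behave step definitions. The sketch below uses plain functions so it runs without Behave installed. resolve_processes is a hypothetical stand-in with no cap applied; the real scenarios would import _default_processes() from noxfile.py instead.

```python
import os
from unittest import mock


def resolve_processes():
    """Hypothetical stand-in for _default_processes(); the real helper lives in noxfile.py."""
    raw = os.environ.get("TEST_PROCESSES", "")
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    try:
        return max(1, len(os.sched_getaffinity(0)))
    except (AttributeError, OSError):
        return os.cpu_count() or 1


def check_env_override():
    # Scenario: TEST_PROCESSES is set, so detection is skipped entirely.
    with mock.patch.dict(os.environ, {"TEST_PROCESSES": "3"}):
        assert resolve_processes() == 3


def check_affinity_available():
    # Scenario: no override; the affinity mask reports two usable CPUs.
    # create=True lets the patch work on platforms lacking sched_getaffinity.
    with mock.patch.dict(os.environ, {}, clear=True), \
         mock.patch("os.sched_getaffinity", return_value={0, 1}, create=True):
        assert resolve_processes() == 2


def check_affinity_unavailable():
    # Scenario: no override; affinity lookup fails, so cpu_count() is used.
    with mock.patch.dict(os.environ, {}, clear=True), \
         mock.patch("os.sched_getaffinity", side_effect=AttributeError, create=True), \
         mock.patch("os.cpu_count", return_value=6):
        assert resolve_processes() == 6
```

Patching os.environ with clear=True keeps a TEST_PROCESSES value set on the CI runner from leaking into the affinity scenarios.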

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage >= 97%.

Backlog note: This issue was discovered during autonomous operation
on milestone v3.2.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.


Automated by CleverAgents Bot
Supervisor: Test Infrastructure | Agent: new-issue-creator

Author
Owner

Label application required: State/Unverified (846), Type/Task (857), Priority/Backlog (862). Parent Epic #5407 dependency link created successfully.

HAL9000 added this to the v3.2.0 milestone 2026-04-12 03:47:02 +00:00
Author
Owner

Verified — CI improvement: optimize parallel test execution. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner

Verified — CI improvement: optimize parallel test execution. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner

Verified — CI improvement: optimize parallel test execution. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7783