perf(tests): optimize coverage instrumentation and reporting pipeline #482

Closed
opened 2026-03-01 01:27:14 +00:00 by freemo · 0 comments
Owner

Metadata

  • Commit Message: perf(tests): optimize coverage instrumentation and reporting pipeline
  • Branch: perf/optimize-coverage-pipeline

Background and Context

Part of #478.

The coverage report (nox -e coverage_report) takes 75 minutes 20 seconds, 3.1x slower than the plain unit tests (24m 21s). The per-feature execution times are nearly identical between unit tests and coverage (2,352s vs 2,358s in aggregate), so the overhead comes from:

  1. Coverage instrumentation startup — each of 32 parallel workers runs coverage run --parallel-mode, adding coverage.py tracer initialization per subprocess
  2. Coverage data file I/O — 32 workers simultaneously write .coverage.* files, creating disk contention
  3. Post-test processing — 5 sequential steps after tests complete:
    • coverage combine (merge 339+ .coverage.* files)
    • coverage html (generate full HTML report)
    • coverage xml (generate XML report)
    • coverage report --show-missing (terminal output)
    • coverage json (JSON export)

Current Coverage Session (noxfile.py lines 484-577)

The coverage session runs behave-parallel with BEHAVE_PARALLEL_COVERAGE=1, which causes each worker to wrap behave in coverage run --parallel-mode. After all features complete, it runs the 5 reporting steps sequentially.

Acceptance Criteria

  • nox -e coverage_report wall-clock time reduced by at least 50% (from 75m to under 37m)
  • Coverage percentage remains accurate and at or above 97%
  • Coverage HTML, XML, and JSON reports are still generated
  • CI can still parse coverage results
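The timing figures quoted in this issue can be checked with a few lines of arithmetic. The sketch below uses only the two wall-clock times from the Background section; the helper name to_seconds is illustrative and not part of the code base. An exact 50% cut lands at 37m 40s, so the "under 37m" wording above is slightly stricter than the percentage alone.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair into whole seconds."""
    return minutes * 60 + seconds


# Wall-clock figures taken from the Background section of this issue.
coverage_wall = to_seconds(75, 20)  # nox -e coverage_report
unit_wall = to_seconds(24, 21)      # plain unit tests

slowdown = coverage_wall / unit_wall        # ratio of the two wall-clock times
extra = coverage_wall - unit_wall           # wall-clock time added by coverage
target = coverage_wall // 2                 # an exact 50% reduction

print(f"slowdown: {slowdown:.1f}x")                         # slowdown: 3.1x
print(f"extra wall-clock: {extra // 60}m {extra % 60}s")    # extra wall-clock: 50m 59s
print(f"50% target: {target // 60}m {target % 60}s")        # 50% target: 37m 40s
```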

Subtasks

Research Phase

  • Measure time spent in each post-test step: coverage combine, coverage html, coverage xml, coverage report, coverage json
  • Evaluate slipcover as a faster coverage tracer alternative to coverage.py — slipcover uses bytecode instrumentation rather than sys.settrace, typically 2-10x faster
  • Evaluate coverage.py with [run] branch = false to see if disabling branch coverage speeds up tracing (if branch coverage is not required for the 97% gate)
  • Evaluate writing coverage data to a RAM-backed filesystem (/dev/shm or tmpfs) to eliminate disk I/O contention from 32 parallel workers
  • Measure whether reducing parallel workers for coverage (e.g., 8 instead of 32) reduces I/O contention enough to offset the reduced parallelism
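For the first research item, one possible way to time each post-test step is a small wrapper around subprocess. This is a sketch only: time_steps and COVERAGE_STEPS are hypothetical names, and the commands are the five reporting steps as listed in this issue, not a copy of what noxfile.py actually runs.

```python
from __future__ import annotations

import subprocess
import time


def time_steps(steps: dict[str, list[str]]) -> dict[str, float]:
    """Run each command in order and return its wall-clock duration in seconds."""
    timings: dict[str, float] = {}
    for name, cmd in steps.items():
        start = time.perf_counter()
        # check=True stops at the first failing step, so a broken step is not
        # mistaken for a fast one.
        subprocess.run(cmd, check=True)
        timings[name] = time.perf_counter() - start
    return timings


# The five post-test steps named in this issue (assumed invocations).
COVERAGE_STEPS = {
    "combine": ["coverage", "combine"],
    "html": ["coverage", "html"],
    "xml": ["coverage", "xml"],
    "report": ["coverage", "report", "--show-missing"],
    "json": ["coverage", "json"],
}
```

Calling time_steps(COVERAGE_STEPS) after a coverage run, in the directory that holds the .coverage.* files, would give a per-step breakdown that shows whether the post-processing or the instrumented test run dominates the 75 minutes.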

Implementation Phase

  • Move coverage data directory to /dev/shm or a tmpfs mount to eliminate disk I/O: set COVERAGE_FILE=/dev/shm/.coverage in the nox session
  • Run coverage html and coverage xml in parallel (they are independent operations on the combined data)
  • Add a --quick mode to coverage_report that skips HTML/XML generation for local development (only generates terminal report + JSON for threshold check)
  • If slipcover is viable: integrate slipcover as the default tracer with a fallback to coverage.py, update _build_base_args() in the behave-parallel CLI
  • If in-process parallelism is implemented (see #481): leverage coverage.process_startup() for multiprocessing coverage instead of the --parallel-mode subprocess approach
  • Optimize coverage combine by using the --keep flag and incremental combination, if supported
  • Consider generating only JSON report and deriving HTML/XML from it (single data pass instead of three)
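The first two implementation items could look roughly like the sketch below. It assumes the nox session can pass an environment to its subprocesses; coverage_data_file and run_parallel are hypothetical helpers, not existing functions in noxfile.py or the behave-parallel CLI. COVERAGE_FILE is the environment variable coverage.py reads for its data file location. Whether coverage html and coverage xml can safely read the combined data file at the same time should be confirmed during the research phase.

```python
from __future__ import annotations

import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def coverage_data_file(ram_dir: str = "/dev/shm", fallback_dir: str = ".") -> str:
    """Return a COVERAGE_FILE path on the RAM-backed directory when it is usable."""
    base = Path(ram_dir)
    if base.is_dir() and os.access(base, os.W_OK):
        return str(base / ".coverage")
    # Fall back to an ordinary directory on systems without /dev/shm.
    return str(Path(fallback_dir) / ".coverage")


def run_parallel(
    commands: list[list[str]], env: dict[str, str] | None = None
) -> list[int]:
    """Start all commands concurrently and return exit codes in input order."""

    def run(cmd: list[str]) -> int:
        return subprocess.run(cmd, env=env).returncode

    # One thread per command; each thread only waits on its subprocess.
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run, commands))


# Possible use inside the coverage session (not executed here):
#   env = {**os.environ, "COVERAGE_FILE": coverage_data_file()}
#   codes = run_parallel([["coverage", "html"], ["coverage", "xml"]], env=env)
```

Threads are enough here because the work happens in the child processes; the Python side only blocks on them. A non-zero entry in the returned list identifies which report failed.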

Verification Phase

  • Run nox -e coverage_report and confirm coverage >= 97%
  • Verify HTML, XML, and JSON reports are generated correctly
  • Measure wall-clock improvement and confirm >= 50% reduction
  • Verify CI can still parse the coverage output format
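The 97% gate can be verified from the JSON report alone, which is also what a --quick mode would rely on. The sketch below reads the totals.percent_covered field that coverage json writes; the function names and the default threshold argument are illustrative, and the report path depends on how the session is configured.

```python
import json
from pathlib import Path


def coverage_percent(json_path) -> float:
    """Read the overall percentage from a report produced by `coverage json`."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return float(data["totals"]["percent_covered"])


def meets_gate(json_path, threshold: float = 97.0) -> bool:
    """Return True when the reported coverage is at or above the threshold."""
    return coverage_percent(json_path) >= threshold
```

Comparing coverage_percent before and after the optimization on the same commit is one way to confirm that the reported percentage stayed accurate and not just above the gate.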

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.2.0 milestone 2026-03-02 01:45:03 +00:00
freemo added reference perf/bdd-test-optimization 2026-03-02 01:46:39 +00:00
Reference
cleveragents/cleveragents-core#482