perf(tests): profile and optimize the 20 medium-slow feature files (10-100s) #480

Closed
opened 2026-03-01 01:26:13 +00:00 by freemo · 0 comments
Owner

Metadata

  • Commit Message: perf(tests): optimize medium-slow BDD features (10-100s tier)
  • Branch: perf/optimize-medium-features

Background and Context

Part of #478.

The 20 features in the 10-100s tier account for 24% of total BDD test runtime (565s of 2,352s). While individually less egregious than the top 8, this tier collectively contributes significant runtime and contains features with the same class of bottlenecks (database setup, subprocess calls, heavy I/O).

Target Features

 #  Feature File                                    Runtime (s)
 1  plan_persistence.feature                        65.3
 2  repositories_error_handling_coverage.feature    57.4
 3  auto_debug_integration.feature                  54.2
 4  action_persistence.feature                      51.8
 5  repository_coverage_boost.feature               44.5
 6  legacy_plan_removal.feature                     42.1
 7  context_service_uncovered_lines.feature         36.7
 8  retry_patterns.feature                          30.7
 9  module_coverage.feature                         25.4
10  coverage_maximum.feature                        21.8
11  garbage_collection.feature                      21.0
12  legacy_migrator_coverage.feature                16.8
13  plan_service_uncovered_lines.feature            14.9
14  plan_service_coverage.feature                   13.8
15  repositories_uncovered_lines.feature            12.8
16  project_service_coverage.feature                12.2
17  main_coverage_complete.feature                  11.8
18  project_cli_commands.feature                    11.4
19  resource_registry_tables.feature                10.4
20  coverage_boost.feature                          10.1

Target

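Reduce each of these features to under 5 seconds. At that ceiling the tier drops from 565s to at most 100s (20 × 5s), a reduction of at least 82%; reaching the ~90% goal requires the features to average roughly 3 seconds each.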
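The relationship between the per-feature ceiling and the tier-level reduction can be checked directly from the table. A minimal sketch in Python, using only the figures stated in this issue (the variable names are illustrative):

```python
# Runtimes in seconds, copied from the Target Features table.
runtimes = [65.3, 57.4, 54.2, 51.8, 44.5, 42.1, 36.7, 30.7, 25.4, 21.8,
            21.0, 16.8, 14.9, 13.8, 12.8, 12.2, 11.8, 11.4, 10.4, 10.1]

TOTAL_BDD_RUNTIME = 2352.0   # total BDD runtime from Background and Context
PER_FEATURE_CEILING = 5.0    # target ceiling per feature

tier_total = sum(runtimes)                               # current runtime of the tier
tier_share = tier_total / TOTAL_BDD_RUNTIME              # fraction of all BDD runtime
worst_case = PER_FEATURE_CEILING * len(runtimes)         # every feature exactly at the ceiling
guaranteed_reduction = 1 - worst_case / tier_total       # reduction the ceiling alone guarantees
avg_for_90_percent = 0.10 * tier_total / len(runtimes)   # average needed for a 90% reduction

print(f"tier total:           {tier_total:.1f}s ({tier_share:.0%} of BDD runtime)")
print(f"worst case at target: {worst_case:.0f}s")
print(f"guaranteed reduction: {guaranteed_reduction:.0%}")
print(f"average for 90%:      {avg_for_90_percent:.2f}s per feature")
```

The 5-second ceiling is therefore the hard acceptance criterion, while the ~90% figure depends on most features landing well below it.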

Acceptance Criteria

  • Each of the 20 target features completes in under 5 seconds
  • No scenarios are removed — all existing behavior is preserved
  • All tests continue to pass via nox -e unit_tests
  • Coverage remains at or above 97% via nox -e coverage_report

Subtasks

Investigation Phase

  • Profile each of the 20 features to identify per-scenario timing and root cause of slowness
  • Categorize bottlenecks into: (a) database setup/teardown, (b) CLI subprocess calls, (c) filesystem I/O, (d) heavy imports, (e) sleep/wait calls, (f) retry loop overhead
  • For persistence-heavy features (plan_persistence, action_persistence, repositories_*): measure SQLite connection setup/teardown cost per scenario
  • For retry_patterns.feature (30.7s): check for real time.sleep() calls in retry logic steps
  • For garbage_collection.feature (21.0s): check for real GC triggering with large object graphs
  • For legacy_migrator_coverage.feature (16.8s): check for real database migration execution per scenario
  • For module_coverage.feature (25.4s): check for expensive module import/reload cycles
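The per-scenario timing and the sleep check from the investigation items above could be gathered with a pair of scenario hooks. This is only a sketch under an assumption: the issue does not name the BDD runner, so the code assumes behave, whose environment.py supports before_scenario and after_scenario hooks. The stores scenario_seconds and sleep_seconds and the context attribute names are hypothetical.

```python
import time
from collections import defaultdict
from types import SimpleNamespace
from unittest import mock

# Hypothetical module-level stores; the names are illustrative only.
scenario_seconds = {}                # scenario name -> wall-clock seconds
sleep_seconds = defaultdict(float)   # scenario name -> seconds requested via time.sleep()


def before_scenario(context, scenario):
    """Start the clock and wrap time.sleep so every call is recorded."""
    real_sleep = time.sleep

    def recording_sleep(seconds):
        sleep_seconds[scenario.name] += seconds
        real_sleep(seconds)  # still sleeps: this step measures, it does not optimize

    # Only calls made as time.sleep(...) are seen; code that did
    # `from time import sleep` keeps its own reference and is not recorded.
    context.sleep_patch = mock.patch("time.sleep", recording_sleep)
    context.sleep_patch.start()
    context.scenario_started = time.perf_counter()


def after_scenario(context, scenario):
    """Stop the clock and restore the real time.sleep."""
    scenario_seconds[scenario.name] = time.perf_counter() - context.scenario_started
    context.sleep_patch.stop()


if __name__ == "__main__":
    # Stand-alone demonstration with stand-in objects instead of behave's own.
    ctx = SimpleNamespace()
    demo = SimpleNamespace(name="demo scenario")
    before_scenario(ctx, demo)
    time.sleep(0.05)
    after_scenario(ctx, demo)
    print(scenario_seconds, dict(sleep_seconds))
```

A scenario whose recorded sleep time is close to its wall-clock time falls into category (e); one with high wall-clock time and no sleep needs a closer look at database, subprocess, or import cost.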

Optimization Phase

  • Replace real time.sleep() calls in retry_patterns.feature with mock/patched sleep that returns immediately
  • Replace per-scenario database creation with shared in-memory SQLite and transaction rollback
  • Replace CLI subprocess invocations in project_cli_commands.feature with CliRunner.invoke()
  • Optimize garbage_collection.feature to use smaller object graphs while preserving GC behavior verification
  • Replace real migration execution in legacy_migrator_coverage.feature with mock migration runner
  • Optimize module_coverage.feature to avoid redundant importlib.reload() calls
  • Replace filesystem I/O in resource_registry_tables.feature with in-memory registry
  • Consolidate overlapping repository scenarios in repositories_error_handling_coverage.feature, repositories_uncovered_lines.feature, and repository_coverage_boost.feature by sharing their setup, without removing any scenario (see Acceptance Criteria)
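Two of the optimizations above, the patched sleep and the shared in-memory SQLite database with transaction rollback, can be sketched with the standard library alone. Everything project-specific here is an assumption: retry() is a hypothetical stand-in for the project's retry helper, and the plans table is an invented schema used only for illustration.

```python
import sqlite3
import time
from unittest import mock


def retry(operation, attempts=3, delay=1.0):
    """Hypothetical stand-in for the project's retry helper."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RuntimeError:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # real back-off: 1s, then 2s, ...


def flaky_factory(failures):
    """Return a callable that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("transient failure")
        return "ok"

    return operation


def run_retry_without_waiting():
    """Exercise the retry path with time.sleep replaced by a mock."""
    with mock.patch("time.sleep") as fake_sleep:
        result = retry(flaky_factory(2), attempts=3, delay=1.0)
    # The requested delays stay observable even though nothing waited.
    return result, [call.args[0] for call in fake_sleep.call_args_list]


def make_shared_connection():
    """Create the schema once; every scenario reuses this connection."""
    # isolation_level=None disables sqlite3's implicit transactions,
    # so BEGIN and ROLLBACK are issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE plans (id INTEGER PRIMARY KEY, name TEXT)")  # invented schema
    return conn


def run_isolated(conn, scenario_body):
    """Run one scenario inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        return scenario_body(conn)
    finally:
        conn.execute("ROLLBACK")
```

With these helpers, run_retry_without_waiting() still verifies the back-off schedule, and each scenario passed to run_isolated() sees an empty database without paying for schema creation. One caveat: if the code under test commits on its own, the explicit ROLLBACK has no open transaction to undo, so that code would need a different isolation approach, such as recreating tables or injecting a non-committing session.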

Verification Phase

  • Run nox -e unit_tests and confirm all 339 features pass
  • Run nox -e coverage_report and confirm coverage >= 97%
  • Record new per-feature timing for each of the 20 features and verify each is under 5s
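The final timing check could be automated once per-feature timings are recorded. A minimal sketch, assuming the timings are available as a mapping from feature file name to seconds; how they are collected is not specified in this issue, and the values in the demonstration are placeholders, not measurements.

```python
THRESHOLD_SECONDS = 5.0  # "under 5 seconds" from the Acceptance Criteria


def find_slow_features(timings, threshold=THRESHOLD_SECONDS):
    """Return (feature, seconds) pairs at or above the threshold, slowest first."""
    slow = [(name, secs) for name, secs in timings.items() if secs >= threshold]
    return sorted(slow, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder values for illustration only; not recorded results.
    example_timings = {
        "plan_persistence.feature": 3.2,
        "retry_patterns.feature": 0.4,
        "garbage_collection.feature": 6.1,
    }
    offenders = find_slow_features(example_timings)
    for name, secs in offenders:
        print(f"STILL SLOW: {name} took {secs:.1f}s")
    raise SystemExit(1 if offenders else 0)
```

Exiting non-zero when any feature is at or above the threshold lets the check run as a gate after the timing pass.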

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.2.0 milestone 2026-03-02 01:45:02 +00:00
freemo added reference perf/bdd-test-optimization 2026-03-02 01:46:38 +00:00
Reference
cleveragents/cleveragents-core#480