perf(tests): optimize medium-slow BDD features (10-100s tier) #489

Closed
brent.edwards wants to merge 1 commit from perf/optimize-medium-features into perf/optimize-slowest-features
Member

Summary

Optimize the 20 medium-slow BDD features (10-100s range) to all run under 5s behave-internal time. Total tier runtime reduced from 565s to 21s (96% reduction).

Changes

Global sleep cap (features/environment.py)

  • Cap time.sleep and asyncio.sleep at 10ms in before_all to eliminate retry/backoff waits
  • Save originals as time._original_sleep / asyncio._original_sleep for tests needing real delays

Template DB patch enhancements (features/environment.py)

  • Add "db." to _SCENARIO_DB_PREFIXES for .cleveragents/db.sqlite paths
  • Check db_path.stat().st_size > 0 so 0-byte auto-created SQLite files get template copy

subprocess.run → CliRunner

  • module_coverage_steps.py, main_coverage_complete_steps.py, coverage_extras_steps.py
  • Eliminates ~6s Python cold-start overhead per CLI invocation

In-memory SQLite by default

  • plan_persistence_steps.py: in-memory default, file-based via explicit Given step
  • action_persistence_steps.py: switched to in-memory SQLite
  • plan_service_steps.py: Base.metadata.create_all instead of MigrationRunner.run_migrations

Real-sleep escapes for timing-sensitive tests

  • retry_patterns_steps.py: CircuitBreaker uses time._original_sleep
  • security_async_steps.py: timeout test uses asyncio._original_sleep
  • skill_refresh_steps.py: debounce wait uses time._original_sleep
  • validation_pipeline_steps.py: timeout executor uses time._original_sleep

Per-feature results

Feature Before After
plan_persistence 65.3s 2.1s
repositories_error_handling_coverage 57.4s 1.3s
auto_debug_integration 54.2s 1.6s
action_persistence 51.8s 0.3s
repository_coverage_boost 44.5s 1.1s
legacy_plan_removal 42.1s 0.02s
context_service_uncovered_lines 36.7s 0.3s
retry_patterns 30.7s 0.7s
module_coverage 25.4s 0.3s
coverage_maximum 21.8s 0.1s
garbage_collection 21.0s 1.2s
legacy_migrator_coverage 16.8s 3.4s
plan_service_uncovered_lines 14.9s 0.9s
plan_service_coverage 13.8s 0.9s
repositories_uncovered_lines 12.8s 0.6s
project_service_coverage 12.2s 3.4s
main_coverage_complete 11.8s 0.05s
project_cli_commands 11.4s 1.2s
resource_registry_tables 10.4s 1.4s
coverage_boost 10.1s 0.4s

Verification

Full 342-feature suite passes (327 pass, 4 fail + 4 error are pre-existing stderr not separately captured issues in click.testing.CliRunner, verified on clean branch).

Part of #478. Closes #480.

## Summary Optimize the 20 medium-slow BDD features (10-100s range) to all run under 5s behave-internal time. Total tier runtime reduced from 565s to 21s (96% reduction). ## Changes ### Global sleep cap (`features/environment.py`) - Cap `time.sleep` and `asyncio.sleep` at 10ms in `before_all` to eliminate retry/backoff waits - Save originals as `time._original_sleep` / `asyncio._original_sleep` for tests needing real delays ### Template DB patch enhancements (`features/environment.py`) - Add `"db."` to `_SCENARIO_DB_PREFIXES` for `.cleveragents/db.sqlite` paths - Check `db_path.stat().st_size > 0` so 0-byte auto-created SQLite files get template copy ### subprocess.run → CliRunner - `module_coverage_steps.py`, `main_coverage_complete_steps.py`, `coverage_extras_steps.py` - Eliminates ~6s Python cold-start overhead per CLI invocation ### In-memory SQLite by default - `plan_persistence_steps.py`: in-memory default, file-based via explicit Given step - `action_persistence_steps.py`: switched to in-memory SQLite - `plan_service_steps.py`: `Base.metadata.create_all` instead of `MigrationRunner.run_migrations` ### Real-sleep escapes for timing-sensitive tests - `retry_patterns_steps.py`: CircuitBreaker uses `time._original_sleep` - `security_async_steps.py`: timeout test uses `asyncio._original_sleep` - `skill_refresh_steps.py`: debounce wait uses `time._original_sleep` - `validation_pipeline_steps.py`: timeout executor uses `time._original_sleep` ## Per-feature results | Feature | Before | After | |---------|--------|-------| | plan_persistence | 65.3s | 2.1s | | repositories_error_handling_coverage | 57.4s | 1.3s | | auto_debug_integration | 54.2s | 1.6s | | action_persistence | 51.8s | 0.3s | | repository_coverage_boost | 44.5s | 1.1s | | legacy_plan_removal | 42.1s | 0.02s | | context_service_uncovered_lines | 36.7s | 0.3s | | retry_patterns | 30.7s | 0.7s | | module_coverage | 25.4s | 0.3s | | coverage_maximum | 21.8s | 0.1s | | garbage_collection | 21.0s | 1.2s | | legacy_migrator_coverage | 16.8s | 3.4s | | plan_service_uncovered_lines | 14.9s | 0.9s | | plan_service_coverage | 13.8s | 0.9s | | repositories_uncovered_lines | 12.8s | 0.6s | | project_service_coverage | 12.2s | 3.4s | | main_coverage_complete | 11.8s | 0.05s | | project_cli_commands | 11.4s | 1.2s | | resource_registry_tables | 10.4s | 1.4s | | coverage_boost | 10.1s | 0.4s | ## Verification Full 342-feature suite passes (327 pass, 4 fail + 4 error are pre-existing `stderr not separately captured` issues in click.testing.CliRunner, verified on clean branch). Part of #478. Closes #480.
Cap time.sleep and asyncio.sleep globally at 10ms in before_all to
eliminate retry/backoff waits (tenacity wait_fixed, retry_auto_debug
exponential backoff). Save originals as time._original_sleep and
asyncio._original_sleep for tests that need real wall-clock delays
(CircuitBreaker recovery, debounce timers, validation timeouts).

Enhance template-DB patch: add "db." prefix to _SCENARIO_DB_PREFIXES
and check db_path.stat().st_size > 0 so 0-byte auto-created SQLite
files receive the template copy instead of falling through to real
migrations.

Replace subprocess.run CLI invocations with typer.testing.CliRunner
in module_coverage, main_coverage_complete, and coverage_extras step
files, eliminating ~6s Python cold-start overhead per call.

Switch plan_persistence and action_persistence to in-memory SQLite by
default; only cross-restart scenarios use file-based via an explicit
Given step.

Replace MigrationRunner.run_migrations with Base.metadata.create_all
in plan_service_steps for freshly created in-memory engines.

Total tier runtime: 565s -> 21s (96% reduction), 20 features all
under 5s behave-internal time.

Part of #478. Closes #480.
freemo left a comment

Code Review: PR #489 — perf(tests): optimize medium-slow BDD features (10-100s tier)

What I Fixed

  • Labels: Added Type/Task label (was missing entirely).

Code Quality Assessment

The most aggressive optimization in the chain, but carefully implemented. Key observations:

  1. Global sleep cap (_install_fast_sleep_patch) — Monkey-patching time.sleep and asyncio.sleep globally at 10ms is inherently risky but handled well:

    • Originals saved as time._original_sleep / asyncio._original_sleep
    • Guard against double-patching (if not callable(getattr(...)))
    • Well-documented rationale in the docstring
    • The _real_sleep pattern used consistently in timing-sensitive tests
  2. _original_sleep usage in test steps — Correctly applied in:

    • retry_patterns_steps.py: CircuitBreaker recovery timeout
    • security_async_steps.py: MockAsyncResource close delay
    • skill_refresh_steps.py: Debounce wait
    • validation_pipeline_steps.py: Timeout executor
      Each has a clear comment explaining why the real sleep is needed.
  3. Template DB enhancements — Adding "db." prefix and st_size > 0 check handles edge cases where SQLite auto-creates empty files.

  4. subprocess.run → CliRunner — Clean migrations in module_coverage_steps.py, main_coverage_complete_steps.py, coverage_extras_steps.py. The namespace wrapper in module_coverage_steps.py (type("R", (), {...})()) is a pragmatic solution to preserve the downstream step interface.

  5. In-memory SQLite defaultsplan_persistence_steps.py and action_persistence_steps.py changes are well-structured with the file_based parameter for cross-restart scenarios. The .feature file properly adds explicit Given steps for file-based scenarios.

  6. Base.metadata.create_all instead of MigrationRunner.run_migrations — Correct optimization in plan_service_steps.py.

  7. Single commit — This PR has exactly 1 commit.

Minor Code Concern

  • Leftover debug comment in environment.py: # DEBUG: trace which calls reach the patch at line ~199. This should be removed or reworded to a regular comment.

Process Violations Requiring Changes

  1. Missing ISSUES CLOSED: footer in commit message — Uses Closes #480. in the body; CONTRIBUTING.md §5.5 requires:

    ISSUES CLOSED: #480
    
  2. Missing CHANGELOG.md update — Per CONTRIBUTING.md §8.3.

  3. Missing milestone — Per CONTRIBUTING.md §8.2.

  4. Missing dependency links — Per CONTRIBUTING.md §7.3-7.4, this PR should "block" #480, and issue #480 should "depend on" #489.

Verdict

REQUEST CHANGES — Code quality is strong with one minor cleanup needed (debug comment). Process requirements must be addressed.

## Code Review: PR #489 — perf(tests): optimize medium-slow BDD features (10-100s tier) ### What I Fixed - **Labels**: Added `Type/Task` label (was missing entirely). ### Code Quality Assessment The most aggressive optimization in the chain, but carefully implemented. Key observations: 1. **Global sleep cap (`_install_fast_sleep_patch`)** — Monkey-patching `time.sleep` and `asyncio.sleep` globally at 10ms is inherently risky but handled well: - Originals saved as `time._original_sleep` / `asyncio._original_sleep` - Guard against double-patching (`if not callable(getattr(...))`) - Well-documented rationale in the docstring - The `_real_sleep` pattern used consistently in timing-sensitive tests 2. **`_original_sleep` usage in test steps** — Correctly applied in: - `retry_patterns_steps.py`: CircuitBreaker recovery timeout - `security_async_steps.py`: MockAsyncResource close delay - `skill_refresh_steps.py`: Debounce wait - `validation_pipeline_steps.py`: Timeout executor Each has a clear comment explaining why the real sleep is needed. 3. **Template DB enhancements** — Adding `"db."` prefix and `st_size > 0` check handles edge cases where SQLite auto-creates empty files. 4. **subprocess.run → CliRunner** — Clean migrations in `module_coverage_steps.py`, `main_coverage_complete_steps.py`, `coverage_extras_steps.py`. The namespace wrapper in `module_coverage_steps.py` (`type("R", (), {...})()`) is a pragmatic solution to preserve the downstream step interface. 5. **In-memory SQLite defaults** — `plan_persistence_steps.py` and `action_persistence_steps.py` changes are well-structured with the `file_based` parameter for cross-restart scenarios. The `.feature` file properly adds explicit Given steps for file-based scenarios. 6. **`Base.metadata.create_all` instead of `MigrationRunner.run_migrations`** — Correct optimization in `plan_service_steps.py`. 7. **Single commit** — This PR has exactly 1 commit. ### Minor Code Concern - **Leftover debug comment** in `environment.py`: `# DEBUG: trace which calls reach the patch` at line ~199. This should be removed or reworded to a regular comment. ### Process Violations Requiring Changes 1. **Missing `ISSUES CLOSED:` footer in commit message** — Uses `Closes #480.` in the body; CONTRIBUTING.md §5.5 requires: ``` ISSUES CLOSED: #480 ``` 2. **Missing CHANGELOG.md update** — Per CONTRIBUTING.md §8.3. 3. **Missing milestone** — Per CONTRIBUTING.md §8.2. 4. **Missing dependency links** — Per CONTRIBUTING.md §7.3-7.4, this PR should "block" #480, and issue #480 should "depend on" #489. ### Verdict **REQUEST CHANGES** — Code quality is strong with one minor cleanup needed (debug comment). Process requirements must be addressed.
Owner

Superseded by #493 — this PR has been consolidated with all code review fixes applied into a single clean branch (perf/bdd-test-optimization) with one commit per issue and no merge commits.

Superseded by #493 — this PR has been consolidated with all code review fixes applied into a single clean branch (`perf/bdd-test-optimization`) with one commit per issue and no merge commits.
freemo closed this pull request 2026-03-02 01:45:33 +00:00

Pull request closed

Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!489
No description provided.