perf(tests): optimize the 8 slowest BDD feature files #488

2026-03-01T05:34:18Z

brent.edwards commented

2026-03-01 05:34:18 +00:00

Summary

features/environment.py: Added _ensure_template_db() for direct behave invocations, expanded _SCENARIO_DB_PREFIXES to include "test_" for step-file-created DBs, added @mock_only tag support to skip unnecessary DB setup
features/plan_commands_coverage.feature: Added @mock_only tag (fully mocked, no DB needed)
features/plan_service.feature: 14 actor-resolution scenarios now use lightweight in-memory plan service instead of heavyweight file-based DB + project init
features/steps/plan_service_steps.py: New step_create_lightweight_plan_service for actor-resolution tests
features/steps/services_coverage_steps.py: Extracted 3 helper functions to consolidate 8 near-identical Given steps (~200 lines of duplicated boilerplate removed)

Impact

Behave-internal execution times for the 8 slowest features:

Feature	Original (s)	Now (s)	Reduction
services_coverage	245	2.8	99%
context_service	215	2.1	99%
project_service	140	0.7	99%
plan_service	215	4.6	98%
core_cli_commands	113	4.2	96%
cli_streaming	213	6.4	97%
plan_commands_coverage	116	20.0	83%
cli_plan_context_commands	248	31.6	87%

Note: Wall-clock times in subprocess mode include ~25s Python startup overhead per feature, which issue #481 (in-process parallel execution) will eliminate.

Testing

All 8 target features pass (209 total scenarios, 0 failures)
10 additional features verified passing (271 scenarios, 0 failures)
No scenarios removed — all existing behavior preserved

Part of #478. Closes #479.

## Summary - **`features/environment.py`**: Added `_ensure_template_db()` for direct behave invocations, expanded `_SCENARIO_DB_PREFIXES` to include `"test_"` for step-file-created DBs, added `@mock_only` tag support to skip unnecessary DB setup - **`features/plan_commands_coverage.feature`**: Added `@mock_only` tag (fully mocked, no DB needed) - **`features/plan_service.feature`**: 14 actor-resolution scenarios now use lightweight in-memory plan service instead of heavyweight file-based DB + project init - **`features/steps/plan_service_steps.py`**: New `step_create_lightweight_plan_service` for actor-resolution tests - **`features/steps/services_coverage_steps.py`**: Extracted 3 helper functions to consolidate 8 near-identical Given steps (~200 lines of duplicated boilerplate removed) ## Impact Behave-internal execution times for the 8 slowest features: | Feature | Original (s) | Now (s) | Reduction | |---|---|---|---| | services_coverage | 245 | 2.8 | 99% | | context_service | 215 | 2.1 | 99% | | project_service | 140 | 0.7 | 99% | | plan_service | 215 | 4.6 | 98% | | core_cli_commands | 113 | 4.2 | 96% | | cli_streaming | 213 | 6.4 | 97% | | plan_commands_coverage | 116 | 20.0 | 83% | | cli_plan_context_commands | 248 | 31.6 | 87% | Note: Wall-clock times in subprocess mode include ~25s Python startup overhead per feature, which issue #481 (in-process parallel execution) will eliminate. ## Testing - All 8 target features pass (209 total scenarios, 0 failures) - 10 additional features verified passing (271 scenarios, 0 failures) - No scenarios removed — all existing behavior preserved Part of #478. Closes #479.

brent.edwards added 1 commit 2026-03-01 05:34:18 +00:00

perf(tests): optimize the 8 slowest BDD feature files 8dda3df3e0

Reduce behave-internal execution time for the 8 slowest features
(originally 113-248s each, now 0.7-31.6s behave-internal time):

features/environment.py:
- Add _ensure_template_db() for direct behave invocations (outside nox)
  so the template DB fast-path is always active.
- Expand _SCENARIO_DB_PREFIXES to include "test_" so step-file-created
  databases (services_coverage, etc.) also get the template copy.
- Add @mock_only tag support in before_scenario to skip unnecessary
  temp DB file creation for fully-mocked features.

features/plan_commands_coverage.feature:
- Add @mock_only tag — this feature uses fully mocked services and
  never touches the database, so DB setup is pure waste.

features/plan_service.feature:
- Change 14 actor-resolution scenarios from "Given I have a plan
  service" (heavyweight: file-based DB + project init) to "Given I have
  a lightweight plan service for actor testing" (in-memory DB, no
  project init).

features/steps/plan_service_steps.py:
- Add step_create_lightweight_plan_service for actor-resolution tests
  that only need PlanService with in-memory UoW.

features/steps/services_coverage_steps.py:
- Extract _setup_context_service(), _setup_plan_service(), and
  _setup_project_service() helpers to consolidate 8 near-identical
  Given steps that each had ~30 lines of duplicated DB setup
  boilerplate. Reduces code by ~200 lines with no behavior change.

Part of #478. Closes #479.

freemo added the

Type

Task

label 2026-03-01 16:06:01 +00:00

freemo requested changes 2026-03-01 16:08:51 +00:00

freemo left a comment

Code Review: PR #488 — perf(tests): optimize the 8 slowest BDD feature files

What I Fixed

Labels: Added Type/Task label (was missing entirely).

Code Quality Assessment

Excellent refactoring with significant performance wins. Key observations:

_ensure_template_db() in environment.py — Smart auto-creation of the template DB for direct behave invocations (outside nox). The fallback to normal Alembic migrations on failure is correct. The sys.path.insert to import create_template_db is a reasonable approach for a test helper.
@mock_only tag support — Clean implementation. Skipping temp DB creation for fully-mocked features is a well-targeted optimization. The before_scenario change properly checks scenario.effective_tags.
_SCENARIO_DB_PREFIXES expansion — Adding "test_" to capture step-file-created databases is correct.
services_coverage_steps.py refactoring — Outstanding cleanup. The _setup_context_service(), _setup_plan_service(), and _setup_project_service() helpers eliminate ~200 lines of duplicated boilerplate across 8 Given steps. The with_plan parameter on _setup_context_service is a nice touch for flexibility.
Lightweight plan service — step_create_lightweight_plan_service in plan_service_steps.py is a good addition for actor-resolution tests that don't need full DB/project setup.
plan_service.feature — The 14 actor-resolution scenarios correctly switched to lightweight plan service. Trailing whitespace cleanup is fine.
plan_commands_coverage.feature — Adding @mock_only is appropriate since the feature uses fully mocked services.
Single commit — This PR has exactly 1 commit.

No bugs or design issues found. The refactoring preserves all existing behavior while dramatically reducing setup overhead.

Process Violations Requiring Changes

Missing ISSUES CLOSED: footer in commit message — The commit uses Closes #479. in the body, but CONTRIBUTING.md §5.5 requires:
```
ISSUES CLOSED: #479
```
Missing CHANGELOG.md update — Per CONTRIBUTING.md §8.3, every PR must include a CHANGELOG entry.
Missing milestone — Per CONTRIBUTING.md §8.2, every PR must have a milestone.
Missing dependency links — Per CONTRIBUTING.md §7.3-7.4, this PR should "block" #479, and issue #479 should "depend on" #488.

Verdict

REQUEST CHANGES — Code is excellent; process requirements must be addressed.

## Code Review: PR #488 — perf(tests): optimize the 8 slowest BDD feature files ### What I Fixed - **Labels**: Added `Type/Task` label (was missing entirely). ### Code Quality Assessment Excellent refactoring with significant performance wins. Key observations: 1. **`_ensure_template_db()` in `environment.py`** — Smart auto-creation of the template DB for direct `behave` invocations (outside nox). The fallback to normal Alembic migrations on failure is correct. The `sys.path.insert` to import `create_template_db` is a reasonable approach for a test helper. 2. **`@mock_only` tag support** — Clean implementation. Skipping temp DB creation for fully-mocked features is a well-targeted optimization. The `before_scenario` change properly checks `scenario.effective_tags`. 3. **`_SCENARIO_DB_PREFIXES` expansion** — Adding `"test_"` to capture step-file-created databases is correct. 4. **`services_coverage_steps.py` refactoring** — Outstanding cleanup. The `_setup_context_service()`, `_setup_plan_service()`, and `_setup_project_service()` helpers eliminate ~200 lines of duplicated boilerplate across 8 Given steps. The `with_plan` parameter on `_setup_context_service` is a nice touch for flexibility. 5. **Lightweight plan service** — `step_create_lightweight_plan_service` in `plan_service_steps.py` is a good addition for actor-resolution tests that don't need full DB/project setup. 6. **`plan_service.feature`** — The 14 actor-resolution scenarios correctly switched to lightweight plan service. Trailing whitespace cleanup is fine. 7. **`plan_commands_coverage.feature`** — Adding `@mock_only` is appropriate since the feature uses fully mocked services. 8. **Single commit** — This PR has exactly 1 commit. No bugs or design issues found. The refactoring preserves all existing behavior while dramatically reducing setup overhead. ### Process Violations Requiring Changes 1. **Missing `ISSUES CLOSED:` footer in commit message** — The commit uses `Closes #479.` in the body, but CONTRIBUTING.md §5.5 requires: ``` ISSUES CLOSED: #479 ``` 2. **Missing CHANGELOG.md update** — Per CONTRIBUTING.md §8.3, every PR must include a CHANGELOG entry. 3. **Missing milestone** — Per CONTRIBUTING.md §8.2, every PR must have a milestone. 4. **Missing dependency links** — Per CONTRIBUTING.md §7.3-7.4, this PR should "block" #479, and issue #479 should "depend on" #488. ### Verdict **REQUEST CHANGES** — Code is excellent; process requirements must be addressed.

~~freemo referenced this pull request 2026-03-02 01:43:09 +00:00~~

perf(tests): reduce BDD test suite and coverage report runtime by 90%+ #493

freemo commented

2026-03-02 01:45:24 +00:00

Superseded by #493 — this PR has been consolidated with all code review fixes applied into a single clean branch (perf/bdd-test-optimization) with one commit per issue and no merge commits.

Superseded by #493 — this PR has been consolidated with all code review fixes applied into a single clean branch (`perf/bdd-test-optimization`) with one commit per issue and no merge commits.

freemo closed this pull request

2026-03-02 01:45:32 +00:00

freemo referenced this pull request

2026-03-02 01:46:06 +00:00

perf(tests): reduce BDD test suite and coverage report runtime by 90%+ #493

freemo added the

State

Wont Do

label 2026-03-04 00:41:24 +00:00