perf(tests): reduce per-feature startup cost via shared fixtures and lazy imports #483

Closed
opened 2026-03-01 01:27:32 +00:00 by freemo · 0 comments
Owner

Metadata

  • Commit Message: perf(tests): reduce per-feature startup cost with shared fixtures and lazy imports
  • Branch: perf/shared-fixtures-lazy-imports

Background and Context

Part of #478.

Every feature file currently sets up its own database connections and DI container, and re-imports its modules from scratch. Because the 339 features run as separate subprocesses, the suite pays for 339 independent Python startups, each importing the full cleveragents package. Even after the subprocess model is replaced (see #481), per-feature setup inside the runner will still add overhead.

Key Overhead Sources

  1. Import chain: cleveragents imports SQLAlchemy, Pydantic, Click, LangChain, and other heavy dependencies. Each run of this import chain takes roughly 500 ms to 2 s.
  2. Database setup: Many features create a fresh in-memory SQLite database, with full schema initialization, per scenario or per feature.
  3. DI container: The application container (get_container()) is instantiated once per feature, wiring up all service dependencies each time.
  4. Environment setup: environment.py hooks run for every feature and may repeat expensive setup.

Acceptance Criteria

  • Measured import time for cleveragents package is documented
  • Shared database session pool is implemented for read-only test scenarios
  • DI container is initialized once and reused across features (with per-feature override capability)
  • Per-feature startup overhead is reduced by at least 50%
  • All 339 features pass via nox -e unit_tests
  • Coverage remains at or above 97%
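
The "initialized once, with per-feature override capability" criterion could take roughly the following shape. This is only a sketch: it assumes the container can be treated as a mapping of service name to instance, which may not match what get_container() actually returns. build_demo_container, the service names, and the method names are all hypothetical.

```python
class TestContainerManager:
    """Hypothetical sketch: wire the container once, layer overrides per feature."""

    def __init__(self, build_container):
        # build_container stands in for the project's get_container(); it is
        # assumed here to return a mapping of service name -> instance.
        self._services = build_container()
        self._overrides = {}

    def override(self, name, replacement):
        """Swap one service for the current feature only."""
        self._overrides[name] = replacement

    def resolve(self, name):
        """Return the override if one is set, otherwise the shared service."""
        if name in self._overrides:
            return self._overrides[name]
        return self._services[name]

    def reset(self):
        """Drop all overrides, e.g. from an after_feature hook."""
        self._overrides.clear()


def build_demo_container():
    # Placeholder for the expensive wiring step (hypothetical services).
    return {"clock": "real-clock", "mailer": "real-mailer"}


manager = TestContainerManager(build_demo_container)
manager.override("mailer", "fake-mailer")
during_feature = manager.resolve("mailer")
manager.reset()
after_feature = manager.resolve("mailer")
```

Keeping overrides in a separate dictionary means a reset never has to rebuild the shared services, which is where the saving would come from.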

Subtasks

Investigation Phase

  • Measure import time: run python -X importtime -c "import cleveragents" and identify the 10 slowest imported modules
  • Profile features/environment.py hooks (before_all, before_feature, before_scenario) to measure per-feature and per-scenario overhead
  • Count how many features create their own database session vs. reusing a shared one
  • Identify step definition files that import heavy modules at module level (e.g., LangChain, SQLAlchemy engine creation)
  • Measure time to initialize the DI container (get_container())
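
For the import-time measurement, a small helper along these lines could rank the output of python -X importtime, which CPython writes to stderr. The function names are illustrative, not existing project code, and the ranking uses self time; switch the sort key to index 2 to rank by cumulative time instead.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Return (module, self_us, cumulative_us) tuples from -X importtime output."""
    rows = []
    prefix = "import time:"
    for line in stderr_text.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        if len(fields) != 3:
            continue
        self_us, cumulative_us, name = (field.strip() for field in fields)
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # skips the column-header row
        rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest_imports(module, top=10):
    """Import `module` in a fresh interpreter and return its slowest imports."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(result.stderr)
    # Sort by self time (index 1), largest first.
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top]
```

Calling slowest_imports("cleveragents") in the project's virtualenv would then give the top-10 list this subtask asks for.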

Implementation Phase

  • Implement lazy imports for heavy optional dependencies in step definitions: wrap import langchain, import langgraph, etc. in functions that cache the import result
  • Create a shared TestDatabaseManager in features/environment.py that initializes one in-memory SQLite database in before_all and provides per-scenario savepoints with rollback in after_scenario
  • Create a shared TestContainerManager that initializes the DI container once in before_all and provides per-feature/per-scenario reset of mutable state
  • Move expensive one-time setup (schema creation, fixture data loading) from before_feature/before_scenario to before_all
  • Audit all 339 features' Background sections and step definitions for redundant setup that can be hoisted to shared fixtures
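  • Add __all__ exports to step definition files so that star-imports (from module import *) expose only the intended names; note that __all__ only affects star-imports and does not by itself shorten module import time, so any speedup from this step should be confirmed by measurement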
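
The lazy-import item above could be implemented with a small cached helper like the one below. This is a minimal sketch: lazy_module and step_serialize_payload are hypothetical names, and json stands in for a heavy dependency such as langchain so that the example runs anywhere.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def lazy_module(name):
    """Import a module on first use and hand back the cached object afterwards."""
    return importlib.import_module(name)


def step_serialize_payload(payload):
    # Stand-in for a step body; "json" plays the role of a heavy dependency,
    # which is imported only when this step actually runs.
    json = lazy_module("json")
    return json.dumps(payload, sort_keys=True)
```

Python already caches imported modules in sys.modules, so the gain comes from deferring the first import until a step needs it, not from the extra cache.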

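The savepoint-and-rollback approach described for the shared TestDatabaseManager can be sketched as follows. To stay self-contained, this uses the standard-library sqlite3 module instead of the project's SQLAlchemy sessions. The agents table, the method names, and the hook bodies are illustrative assumptions.

```python
import sqlite3


class TestDatabaseManager:
    """Hypothetical sketch: one in-memory database, one savepoint per scenario."""

    def __init__(self, schema_sql):
        # isolation_level=None puts sqlite3 in autocommit mode, so the
        # SAVEPOINT statements below are the only transaction control.
        self.connection = sqlite3.connect(":memory:", isolation_level=None)
        self.connection.executescript(schema_sql)

    def begin_scenario(self):
        # Opens a transaction that will hold every write the scenario makes.
        self.connection.execute("SAVEPOINT scenario")

    def end_scenario(self):
        # Undo the scenario's writes, then release the savepoint to close
        # the transaction.
        self.connection.execute("ROLLBACK TO SAVEPOINT scenario")
        self.connection.execute("RELEASE SAVEPOINT scenario")

    def close(self):
        self.connection.close()


# behave hooks as they might appear in features/environment.py
def before_all(context):
    context.db = TestDatabaseManager("CREATE TABLE agents (name TEXT);")


def before_scenario(context, scenario):
    context.db.begin_scenario()


def after_scenario(context, scenario):
    context.db.end_scenario()


def after_all(context):
    context.db.close()
```

The schema is created once in before_all, and each scenario sees an empty table again after rollback. Steps that commit or open their own connections would bypass the savepoint and need separate handling.
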
Verification Phase

  • Run nox -e unit_tests and confirm all 339 features pass
  • Run nox -e coverage_report and confirm coverage >= 97%
  • Measure per-feature startup time before and after, document improvement
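
One way to check the before/after comparison against the 50% target is to compare median per-feature startup times. The timings below are placeholder values for illustration only, not measurements from this project.

```python
from statistics import median


def overhead_reduction(before_seconds, after_seconds):
    """Fractional reduction in median startup time (0.5 means a 50% cut)."""
    baseline = median(before_seconds)
    improved = median(after_seconds)
    if baseline <= 0:
        raise ValueError("baseline startup time must be positive")
    return (baseline - improved) / baseline


# Placeholder per-feature timings in seconds (not real measurements).
before = [2.0, 2.0, 4.0]
after = [1.0, 0.5, 1.0]

reduction = overhead_reduction(before, after)
target_met = reduction >= 0.5
```

The median is used here so that a few unusually slow features do not dominate the comparison; the mean would work equally well if the documented figure is meant to reflect total runtime.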

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.2.0 milestone 2026-03-02 01:45:04 +00:00
freemo added reference perf/bdd-test-optimization 2026-03-02 01:46:39 +00:00
Reference
cleveragents/cleveragents-core#483