[AUTO-INF-3] Extract database initialization helpers from features/environment.py into a dedicated features/testing/db_setup.py module #10244

Open
opened 2026-04-17 10:25:18 +00:00 by HAL9000 · 0 comments
Owner

Metadata

  • Commit message: refactor(tests): extract DB init helpers from environment.py into features/testing/db_setup.py
  • Branch name: refactor/auto-inf-3-extract-db-setup

Background and Context

features/environment.py is a 768-line file that serves as the Behave test lifecycle entry point. In addition to the lifecycle hooks (before_all, before_scenario, after_scenario), it contains substantial database initialization logic that is unrelated to the lifecycle hook interface:

  • _INITIALIZED_DBS: set[str] — a process-global mutable set tracking which DB paths have been initialized
  • _fast_init_or_upgrade() — a function that either copies the pre-migrated template DB or runs Alembic migrations
  • Template DB path resolution and caching logic
  • Per-process unique DB path generation

This database initialization logic is a self-contained concern that is currently embedded in the lifecycle hook file, making environment.py harder to read, test, and maintain.
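As an illustration of the "per-process unique DB path" item above, here is a minimal sketch. The helper name `per_process_db_path`, the file-name pattern, and the default directory are assumptions made for this example; they are not taken from `features/environment.py`.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def per_process_db_path(base_dir: Path | None = None, stem: str = "behave_test") -> Path:
    """Return a DB file path that embeds the current process ID.

    Hypothetical helper: the name, the file-name pattern, and the default
    directory are assumptions made for illustration only.
    """
    # Fall back to the system temp directory when no base directory is given.
    root = base_dir if base_dir is not None else Path(tempfile.gettempdir())
    # Embedding the PID keeps parallel test processes from sharing one file.
    return root / f"{stem}_{os.getpid()}.db"
```

Embedding the process ID is one simple way to keep parallel workers from writing to the same file; the actual code may use a different scheme.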

Current State

features/environment.py (768 lines) handles six distinct concerns:

  1. TDD tag validation: validate_tdd_tags(), should_invert_result(), apply_tdd_inversion() (partially addressed by #9991)
  2. Database initialization: _INITIALIZED_DBS, _fast_init_or_upgrade(), template DB logic
  3. Mock AI setup: mock AI provider installation and re-application
  4. Fast sleep patch: _install_fast_sleep_patch() (addressed by #9993 for type: ignore)
  5. Template DB patch: template database path patching
  6. Lifecycle hooks: before_all(), before_scenario(), after_scenario()

The database initialization concern (item 2) is the most self-contained and testable of these. The _INITIALIZED_DBS set and the _fast_init_or_upgrade() function together form a complete subsystem that:

  • Has its own internal state (_INITIALIZED_DBS)
  • Has its own optimization logic (short-circuit on already-initialized paths)
  • Has its own error handling
  • Is referenced in the comment "See issue #735 — eliminates ~65,000 unnecessary function executions per full test run"

Problem

  • environment.py is 768 lines — too large for a single-concern file
  • The DB initialization subsystem (_INITIALIZED_DBS + _fast_init_or_upgrade) cannot be unit-tested without importing the entire environment.py (which triggers Behave-specific imports)
  • New contributors must read 768 lines to understand the DB initialization logic
  • The _INITIALIZED_DBS process-global set is a hidden side effect of importing environment.py

Proposed Solution

Extract the database initialization subsystem into features/testing/db_setup.py:

# features/testing/db_setup.py
"""Database initialization helpers for Behave test scenarios.

Provides fast template-copy-based DB initialization to avoid running
Alembic migrations for every scenario (see issue #735).
"""
from __future__ import annotations

from pathlib import Path

# Process-global set of already-initialized DB paths.
# Cleared in before_scenario to prevent cross-scenario state leaks.
_INITIALIZED_DBS: set[str] = set()


def fast_init_or_upgrade(db_url: str, template_db_path: Path) -> None:
    """Initialize a test database using the pre-migrated template.
    
    Short-circuits if the DB URL has already been initialized in this
    process (see _INITIALIZED_DBS). Copies the template DB if the target
    is empty, otherwise verifies it is up-to-date.
    """
    ...


def clear_initialized_dbs() -> None:
    """Clear the initialized DB set. Call from before_scenario."""
    _INITIALIZED_DBS.clear()

features/environment.py then imports and delegates:

from features.testing.db_setup import fast_init_or_upgrade, clear_initialized_dbs

Expected Behavior

features/testing/db_setup.py is a focused, independently-importable module containing all DB initialization logic. features/environment.py delegates to it via imports. The extracted module can be unit-tested without triggering Behave-specific imports. All existing Behave scenarios continue to pass, and test coverage remains >= 97%.

Acceptance Criteria

  • features/testing/db_setup.py exists and contains the DB initialization subsystem
  • _INITIALIZED_DBS, _fast_init_or_upgrade() (renamed to fast_init_or_upgrade()), and related helpers are moved to features/testing/db_setup.py
  • features/environment.py imports from features/testing/db_setup instead of defining these inline
  • features/environment.py is reduced by at least 80 lines
  • The extracted module has its own unit tests in features/ (Behave scenarios or direct Python tests)
  • All existing Behave scenarios pass without modification
  • Test coverage remains >= 97%

Subtasks

  • Identify all DB initialization code in features/environment.py (lines containing _INITIALIZED_DBS, _fast_init_or_upgrade, template DB logic)
  • Create features/testing/db_setup.py with the extracted code
  • Update features/environment.py to import from features/testing/db_setup
  • Add unit tests for fast_init_or_upgrade() and clear_initialized_dbs()
  • Run full test suite and verify no regressions
  • Verify coverage >= 97%
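The unit-test subtask could start from something like the sketch below. To keep the example runnable on its own, it defines a stand-in for the extracted module; real tests would import `clear_initialized_dbs` and `_INITIALIZED_DBS` from `features.testing.db_setup`. The test class name and the choice of `unittest` are assumptions.

```python
from __future__ import annotations

import unittest

# Stand-in for the extracted module so this sketch runs on its own.
# Real tests would import these names from features.testing.db_setup instead.
_INITIALIZED_DBS: set[str] = set()


def clear_initialized_dbs() -> None:
    _INITIALIZED_DBS.clear()


class ClearInitializedDbsTest(unittest.TestCase):
    def setUp(self) -> None:
        # The set is process-global, so start each test from a known state.
        _INITIALIZED_DBS.clear()

    def test_clear_empties_the_set(self) -> None:
        _INITIALIZED_DBS.update({"sqlite:///a.db", "sqlite:///b.db"})
        clear_initialized_dbs()
        self.assertEqual(_INITIALIZED_DBS, set())

    def test_clear_keeps_the_same_set_object(self) -> None:
        # Clearing in place matters: other modules may hold a reference.
        before = _INITIALIZED_DBS
        clear_initialized_dbs()
        self.assertIs(before, _INITIALIZED_DBS)
```

Tests of this shape need no Behave imports, which is the property the extraction is meant to provide.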

Definition of Done

This issue is closed when:

  1. features/testing/db_setup.py exists with the DB initialization subsystem
  2. features/environment.py delegates to features/testing/db_setup
  3. Unit tests exist for the extracted module
  4. All Behave scenarios pass
  5. Coverage >= 97%
  6. A PR has been reviewed and merged to main

Duplicate Check

Searched open issues (pages 1–7) and closed issues (pages 1–7) for keywords:

  • "environment.py": found #9993 (type: ignore comments in _install_fast_sleep_patch()) and #9991 (duplicate TDD tag validation logic); neither addresses DB initialization extraction
  • "environment.py split": no results
  • "environment.py refactor": no results
  • "db_setup": no results
  • "fast_init_or_upgrade": no results
  • "_INITIALIZED_DBS": no results
  • "single responsibility": no results
  • "before_all": no results
  • "template DB": no relevant results (only #9389, #9375, #9372, and #9377, which are bug reports about a read-only DB, not about extraction)

Existing issues #9991 and #9993 address specific bugs in environment.py (DRY violation for TDD tags, type: ignore comments). This issue addresses a different concern: extracting the DB initialization subsystem into a testable, focused module. There is no overlap.

No duplicate found.


Automated by CleverAgents Bot
Agent: new-issue-creator

Reference
cleveragents/cleveragents-core#10244