[AUTO-INF-6] Duplicate test data factory functions across BDD step files create maintenance burden and inconsistency risk #10304

Open
opened 2026-04-18 08:27:49 +00:00 by HAL9000 · 0 comments
Owner

Metadata

Field Value
Branch test/auto-inf-6-consolidate-step-factories
Commit Message test(infra): consolidate duplicate test data factory functions from step files into shared features/mocks/ module
Milestone v3.5.0
Parent Epic

Background and Context

Several test data factory functions are duplicated verbatim across multiple BDD step files. When the domain model changes (e.g., a new required field is added to Plan or Action), every duplicate factory must be updated independently. Missing one update causes a silent inconsistency where some scenarios test with stale data while others test with current data.

Evidence — three duplicated factory groups:

1. _make_plan() — duplicated in 2 files

features/steps/plan_persistence_steps.py (lines 60–90):

def _make_plan(
    plan_id: str,
    action_name: str = "local/persist-action",
    phase: PlanPhase = PlanPhase.ACTION,
    # ... 10+ parameters ...
) -> Plan:
    """Build a Plan domain object for testing."""
    now = datetime(2026, 3, 1, tzinfo=UTC)  # HARDCODED DATE
    return Plan(
        identity=PlanIdentity(
            plan_id=plan_id,
            parent_plan_id=parent_plan_id,
            root_plan_id=root_plan_id,
            attempt=1,
        ),
        # ... 20+ fields ...
    )

features/steps/repositories_coverage_steps.py (lines 60–90):

def _make_plan(
    plan_id: str,
    action_name: str = "local/test-action",  # DIFFERENT DEFAULT
    phase: PlanPhase = PlanPhase.ACTION,
    # ... 10+ parameters ...
) -> Plan:
    """Build a Plan domain object for testing."""
    now = datetime(2026, 3, 1, tzinfo=UTC)  # SAME HARDCODED DATE
    return Plan(...)

The two implementations have different default values for action_name ("local/persist-action" vs "local/test-action"), creating subtle inconsistency. Both use the same hardcoded date datetime(2026, 3, 1, tzinfo=UTC), which means tests cannot exercise time-based logic.

2. _make_action() — duplicated in 2 files

features/steps/uow_lifecycle_steps.py (lines 60–80):

def _make_action(
    name: str = "local/test-action",
    state: str = "available",
) -> Action:
    """Create a minimal valid Action domain object."""
    # ... 20 lines of boilerplate ...

features/steps/repositories_coverage_steps.py (lines 90–110):

def _make_action(
    name: str = "local/test-action",
    state: str = "available",
) -> Action:
    """Create a minimal valid Action domain object."""
    # ... 20 lines of boilerplate (identical) ...

These are byte-for-byte identical. Any change to the Action domain model requires updating both files.

3. _suppress_structlog_stdout() — duplicated in 2 files

features/steps/tdd_sqlite_url_cwd_steps.py:

@contextmanager
def _suppress_structlog_stdout() -> Generator[None, None, None]:
    """Suppress structlog output to stdout during test."""
    # ... implementation ...

features/steps/tdd_session_list_missing_db_steps.py:

@contextmanager
def _suppress_structlog_stdout() -> Generator[None, None, None]:
    """Suppress structlog output to stdout during test."""
    # ... identical implementation ...

Impact

  1. Maintenance burden: When Plan, Action, or other domain models gain new required fields, every duplicate factory must be updated. Missing one update causes TypeError at test runtime.
  2. Inconsistency risk: The two _make_plan() implementations already have different defaults (action_name). This means scenarios using plan_persistence_steps.py test with "local/persist-action" while scenarios using repositories_coverage_steps.py test with "local/test-action", creating invisible inconsistency.
  3. Hardcoded dates: Both _make_plan() implementations use datetime(2026, 3, 1, tzinfo=UTC) as a hardcoded date. Tests cannot exercise time-based logic (e.g., sorting by created_at, expiry checks).
  4. Coverage gaps: Duplicate utilities in step files are not themselves tested, so coverage of the factory logic depends on which scenarios happen to exercise which factory.

Expected Behavior

Shared test data factories should live in features/mocks/ (which already contains test_uow_factory.py, fake_provider.py, etc.) and be imported by step files that need them.

Proposed structure:

features/mocks/
    test_data_factories.py   # NEW: shared Plan, Action, etc. factories
    test_uow_factory.py      # existing
    fake_provider.py         # existing
    ...

Correct pattern:

# features/mocks/test_data_factories.py
from datetime import datetime, UTC

def make_plan(
    plan_id: str,
    action_name: str = "local/test-action",
    phase: PlanPhase = PlanPhase.ACTION,
    created_at: datetime | None = None,
    # ...
) -> Plan:
    """Create a Plan domain object for testing.
    
    Args:
        created_at: If None, uses datetime.now(UTC) for realistic timestamps.
    """
    now = created_at or datetime.now(UTC)
    return Plan(...)

Step files then import:

from features.mocks.test_data_factories import make_plan, make_action

Acceptance Criteria

  • features/mocks/test_data_factories.py created with consolidated make_plan(), make_action(), and suppress_structlog_stdout() implementations
  • _make_plan() removed from plan_persistence_steps.py and repositories_coverage_steps.py; both import from test_data_factories
  • _make_action() removed from uow_lifecycle_steps.py and repositories_coverage_steps.py; both import from test_data_factories
  • _suppress_structlog_stdout() removed from tdd_sqlite_url_cwd_steps.py and tdd_session_list_missing_db_steps.py; both import from test_data_factories
  • Consolidated make_plan() uses datetime.now(UTC) as default (not hardcoded date)
  • All nox sessions pass without regressions
  • Coverage >= 97%

Subtasks

  • Create features/mocks/test_data_factories.py with consolidated factory functions
  • Migrate _make_plan() from plan_persistence_steps.py to test_data_factories.py
  • Migrate _make_plan() from repositories_coverage_steps.py to test_data_factories.py (reconcile different defaults)
  • Migrate _make_action() from uow_lifecycle_steps.py to test_data_factories.py
  • Migrate _make_action() from repositories_coverage_steps.py to test_data_factories.py
  • Migrate _suppress_structlog_stdout() from tdd_sqlite_url_cwd_steps.py to test_data_factories.py
  • Migrate _suppress_structlog_stdout() from tdd_session_list_missing_db_steps.py to test_data_factories.py
  • Update all step files to import from features.mocks.test_data_factories
  • Fix hardcoded datetime(2026, 3, 1, tzinfo=UTC) to use datetime.now(UTC) as default
  • Run nox -s unit_tests to confirm no regressions
  • Run nox -s coverage_report to confirm coverage >= 97%

Definition of Done

This issue should be closed when:

  • No duplicate factory functions remain across step files
  • All shared factories live in features/mocks/test_data_factories.py
  • Hardcoded dates replaced with dynamic defaults
  • All nox stages pass
  • Coverage >= 97%

Duplicate Check

Search 1 — open issues, keywords "duplicate factory", "factory pattern", "_make_plan", "_make_action", "shared factory", "step factory":
Searched open issues pages 1–12 for "duplicate factory", "factory pattern", "_make_plan", "_make_action", "_suppress_structlog", "test data factory", "shared factory", "step factory". No matches found.

Search 2 — cross-area analysis:

  • #10245 (module-level ULID generation): covers module-level ULID generation at import time, not duplicate factory functions. Different issue.
  • #10244 (environment.py refactoring): covers extracting DB init helpers from environment.py, not step file factory consolidation. Different issue.
  • #5844 (hardcoded /tmp paths in fixtures): covers fixture JSON files, not factory functions in step files. Different issue.
  • #9789 (test data realism): covers ULID validity in fixture JSON, not factory function duplication. Different issue.

Search 3 — closed issues, keywords "duplicate factory", "_make_plan", "_make_action", "test data factory":
Searched closed issues pages 1–11 for same keywords. No matches found.

Search 4 — keywords "consolidate factories", "shared test helpers", "test_data_factories":
No existing open or closed issue covers the specific pattern of duplicate factory functions across step files.

Search 5 — uncertainty check:
This finding is clearly distinct from all existing issues. The specific problem (duplicate _make_plan(), _make_action(), and _suppress_structlog_stdout() functions across step files) is not addressed by any existing issue. Confident this is a new gap.


Automated by CleverAgents Bot
Supervisor: Test Infrastructure Pool | Agent: test-infra-pool-supervisor

## Metadata | Field | Value | |---|---| | **Branch** | `test/auto-inf-6-consolidate-step-factories` | | **Commit Message** | `test(infra): consolidate duplicate test data factory functions from step files into shared features/mocks/ module` | | **Milestone** | v3.5.0 | | **Parent Epic** | — | --- ## Background and Context Several test data factory functions are duplicated verbatim across multiple BDD step files. When the domain model changes (e.g., a new required field is added to `Plan` or `Action`), every duplicate factory must be updated independently. Missing one update causes a silent inconsistency where some scenarios test with stale data while others test with current data. **Evidence — three duplicated factory groups:** ### 1. `_make_plan()` — duplicated in 2 files **`features/steps/plan_persistence_steps.py`** (lines 60–90): ```python def _make_plan( plan_id: str, action_name: str = "local/persist-action", phase: PlanPhase = PlanPhase.ACTION, # ... 10+ parameters ... ) -> Plan: """Build a Plan domain object for testing.""" now = datetime(2026, 3, 1, tzinfo=UTC) # HARDCODED DATE return Plan( identity=PlanIdentity( plan_id=plan_id, parent_plan_id=parent_plan_id, root_plan_id=root_plan_id, attempt=1, ), # ... 20+ fields ... ) ``` **`features/steps/repositories_coverage_steps.py`** (lines 60–90): ```python def _make_plan( plan_id: str, action_name: str = "local/test-action", # DIFFERENT DEFAULT phase: PlanPhase = PlanPhase.ACTION, # ... 10+ parameters ... ) -> Plan: """Build a Plan domain object for testing.""" now = datetime(2026, 3, 1, tzinfo=UTC) # SAME HARDCODED DATE return Plan(...) ``` The two implementations have different default values for `action_name` (`"local/persist-action"` vs `"local/test-action"`), creating subtle inconsistency. Both use the same hardcoded date `datetime(2026, 3, 1, tzinfo=UTC)`, which means tests cannot exercise time-based logic. ### 2. `_make_action()` — duplicated in 2 files **`features/steps/uow_lifecycle_steps.py`** (lines 60–80): ```python def _make_action( name: str = "local/test-action", state: str = "available", ) -> Action: """Create a minimal valid Action domain object.""" # ... 20 lines of boilerplate ... ``` **`features/steps/repositories_coverage_steps.py`** (lines 90–110): ```python def _make_action( name: str = "local/test-action", state: str = "available", ) -> Action: """Create a minimal valid Action domain object.""" # ... 20 lines of boilerplate (identical) ... ``` These are byte-for-byte identical. Any change to the `Action` domain model requires updating both files. ### 3. `_suppress_structlog_stdout()` — duplicated in 2 files **`features/steps/tdd_sqlite_url_cwd_steps.py`**: ```python @contextmanager def _suppress_structlog_stdout() -> Generator[None, None, None]: """Suppress structlog output to stdout during test.""" # ... implementation ... ``` **`features/steps/tdd_session_list_missing_db_steps.py`**: ```python @contextmanager def _suppress_structlog_stdout() -> Generator[None, None, None]: """Suppress structlog output to stdout during test.""" # ... identical implementation ... ``` --- ## Impact 1. **Maintenance burden**: When `Plan`, `Action`, or other domain models gain new required fields, every duplicate factory must be updated. Missing one update causes `TypeError` at test runtime. 2. **Inconsistency risk**: The two `_make_plan()` implementations already have different defaults (`action_name`). This means scenarios using `plan_persistence_steps.py` test with `"local/persist-action"` while scenarios using `repositories_coverage_steps.py` test with `"local/test-action"`, creating invisible inconsistency. 3. **Hardcoded dates**: Both `_make_plan()` implementations use `datetime(2026, 3, 1, tzinfo=UTC)` as a hardcoded date. Tests cannot exercise time-based logic (e.g., sorting by `created_at`, expiry checks). 4. **Coverage gaps**: Duplicate utilities in step files are not themselves tested, so coverage of the factory logic depends on which scenarios happen to exercise which factory. --- ## Expected Behavior Shared test data factories should live in `features/mocks/` (which already contains `test_uow_factory.py`, `fake_provider.py`, etc.) and be imported by step files that need them. **Proposed structure:** ``` features/mocks/ test_data_factories.py # NEW: shared Plan, Action, etc. factories test_uow_factory.py # existing fake_provider.py # existing ... ``` **Correct pattern:** ```python # features/mocks/test_data_factories.py from datetime import datetime, UTC def make_plan( plan_id: str, action_name: str = "local/test-action", phase: PlanPhase = PlanPhase.ACTION, created_at: datetime | None = None, # ... ) -> Plan: """Create a Plan domain object for testing. Args: created_at: If None, uses datetime.now(UTC) for realistic timestamps. """ now = created_at or datetime.now(UTC) return Plan(...) ``` Step files then import: ```python from features.mocks.test_data_factories import make_plan, make_action ``` --- ## Acceptance Criteria - [ ] `features/mocks/test_data_factories.py` created with consolidated `make_plan()`, `make_action()`, and `suppress_structlog_stdout()` implementations - [ ] `_make_plan()` removed from `plan_persistence_steps.py` and `repositories_coverage_steps.py`; both import from `test_data_factories` - [ ] `_make_action()` removed from `uow_lifecycle_steps.py` and `repositories_coverage_steps.py`; both import from `test_data_factories` - [ ] `_suppress_structlog_stdout()` removed from `tdd_sqlite_url_cwd_steps.py` and `tdd_session_list_missing_db_steps.py`; both import from `test_data_factories` - [ ] Consolidated `make_plan()` uses `datetime.now(UTC)` as default (not hardcoded date) - [ ] All nox sessions pass without regressions - [ ] Coverage >= 97% --- ## Subtasks - [ ] Create `features/mocks/test_data_factories.py` with consolidated factory functions - [ ] Migrate `_make_plan()` from `plan_persistence_steps.py` to `test_data_factories.py` - [ ] Migrate `_make_plan()` from `repositories_coverage_steps.py` to `test_data_factories.py` (reconcile different defaults) - [ ] Migrate `_make_action()` from `uow_lifecycle_steps.py` to `test_data_factories.py` - [ ] Migrate `_make_action()` from `repositories_coverage_steps.py` to `test_data_factories.py` - [ ] Migrate `_suppress_structlog_stdout()` from `tdd_sqlite_url_cwd_steps.py` to `test_data_factories.py` - [ ] Migrate `_suppress_structlog_stdout()` from `tdd_session_list_missing_db_steps.py` to `test_data_factories.py` - [ ] Update all step files to import from `features.mocks.test_data_factories` - [ ] Fix hardcoded `datetime(2026, 3, 1, tzinfo=UTC)` to use `datetime.now(UTC)` as default - [ ] Run `nox -s unit_tests` to confirm no regressions - [ ] Run `nox -s coverage_report` to confirm coverage >= 97% --- ## Definition of Done This issue should be closed when: - [ ] No duplicate factory functions remain across step files - [ ] All shared factories live in `features/mocks/test_data_factories.py` - [ ] Hardcoded dates replaced with dynamic defaults - [ ] All nox stages pass - [ ] Coverage >= 97% --- ### Duplicate Check **Search 1 — open issues, keywords "duplicate factory", "factory pattern", "_make_plan", "_make_action", "shared factory", "step factory":** Searched open issues pages 1–12 for "duplicate factory", "factory pattern", "_make_plan", "_make_action", "_suppress_structlog", "test data factory", "shared factory", "step factory". No matches found. **Search 2 — cross-area analysis:** - #10245 (module-level ULID generation): covers module-level ULID generation at import time, not duplicate factory functions. Different issue. - #10244 (environment.py refactoring): covers extracting DB init helpers from environment.py, not step file factory consolidation. Different issue. - #5844 (hardcoded /tmp paths in fixtures): covers fixture JSON files, not factory functions in step files. Different issue. - #9789 (test data realism): covers ULID validity in fixture JSON, not factory function duplication. Different issue. **Search 3 — closed issues, keywords "duplicate factory", "_make_plan", "_make_action", "test data factory":** Searched closed issues pages 1–11 for same keywords. No matches found. **Search 4 — keywords "consolidate factories", "shared test helpers", "test_data_factories":** No existing open or closed issue covers the specific pattern of duplicate factory functions across step files. **Search 5 — uncertainty check:** This finding is clearly distinct from all existing issues. The specific problem (duplicate `_make_plan()`, `_make_action()`, and `_suppress_structlog_stdout()` functions across step files) is not addressed by any existing issue. Confident this is a new gap. --- **Automated by CleverAgents Bot** Supervisor: Test Infrastructure Pool | Agent: test-infra-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#10304
No description provided.