BUG-HUNT: [data-flow] UnitOfWork.init_database() disposes engine after migrations, losing all data in in-memory SQLite databases #7354

Open
opened 2026-04-10 18:01:41 +00:00 by HAL9000 · 3 comments
Owner

Bug Report: [data-flow] UnitOfWork.init_database() disposes the engine after migration for file-based SQLite, but the same guard accidentally prevents engine disposal for edge cases

Severity Assessment

  • Impact: When init_database() is called on a file-based SQLite database, it calls self._engine.dispose() and sets self._engine = None. If the engine property was already accessed (e.g., during _ensure_database_initialized()), and then init_database() disposes it, any existing Session references obtained from the session factory before init_database() was called will now be using a disposed engine. Executing queries through those sessions will raise sqlalchemy.exc.ResourceClosedError.
  • Likelihood: Low-Medium — depends on whether callers access the engine before calling init_database()
  • Priority: Medium

Location

  • File: src/cleveragents/infrastructure/database/unit_of_work.py
  • Function/Class: UnitOfWork.init_database()
  • Lines: ~95-115

Description

UnitOfWork.init_database() has a subtle sequencing issue: after running migrations with _ensure_database_initialized(force=True), it conditionally disposes the engine:

def init_database(self) -> None:
    self._ensure_database_initialized(force=True)
    
    # For in-memory databases, we must keep the engine alive...
    # Don't dispose in-memory databases as each new engine is a new database
    if self._engine is not None and self.database_url != "sqlite:///:memory:":
        self._engine.dispose()
        self._engine = None
        self._session_factory = None

The issue is that _ensure_database_initialized() is called with force=True, which runs MigrationRunner.init_or_upgrade(). This function may have opened a connection to the database during migration. Then init_database() disposes the engine.

But more critically: _ensure_database_initialized() accesses self.engine (via the engine property), which creates the engine if it doesn't exist. This engine is then disposed by init_database(). Any session factory created before init_database() was called would now reference the disposed engine, causing ResourceClosedError on the next query.

Additionally, the _ensure_database_initialized() inside the engine property accesses self._session_factory, which calls _ensure_database_initialized() inside engine, which could trigger recursive initialization in certain execution paths.

Evidence

@property
def engine(self) -> Engine:
    """Get or create database engine."""
    if self._engine is None:
        self._ensure_database_initialized()  # Accesses engine during migration
        # ... creates engine ...
    return self._engine  # type: ignore

def init_database(self) -> None:
    # Uses _ensure_database_initialized(force=True)
    self._ensure_database_initialized(force=True)
    
    # Disposes the engine that was just created during migration!
    if self._engine is not None and self.database_url != "sqlite:///:memory:":
        self._engine.dispose()
        self._engine = None  # Now any held references are broken
        self._session_factory = None  # Same

A caller that does:

uow = UnitOfWork("sqlite:///mydb.sqlite")
session = uow.session_factory()  # Creates engine, creates session_factory
uow.init_database()  # Now disposes engine, session_factory references disposed engine!
with uow.transaction() as ctx:
    ctx.projects.get_all()  # ResourceClosedError!

Expected Behavior

init_database() should either:

  1. Not dispose the engine (let connection pooling handle cleanup), or
  2. Clearly document that init_database() invalidates any previously created sessions and must be called before any session usage

Actual Behavior

If a session or session factory is obtained before init_database() is called, then init_database() is called explicitly, the engine is disposed and all existing sessions become invalid.

Category

data-flow

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor

## Bug Report: [data-flow] UnitOfWork.init_database() disposes the engine after migration for file-based SQLite, but the same guard accidentally prevents engine disposal for edge cases ### Severity Assessment - **Impact**: When `init_database()` is called on a file-based SQLite database, it calls `self._engine.dispose()` and sets `self._engine = None`. If the engine property was already accessed (e.g., during `_ensure_database_initialized()`), and then `init_database()` disposes it, any **existing Session references** obtained from the session factory before `init_database()` was called will now be using a disposed engine. Executing queries through those sessions will raise `sqlalchemy.exc.ResourceClosedError`. - **Likelihood**: Low-Medium — depends on whether callers access the engine before calling `init_database()` - **Priority**: Medium ### Location - **File**: `src/cleveragents/infrastructure/database/unit_of_work.py` - **Function/Class**: `UnitOfWork.init_database()` - **Lines**: ~95-115 ### Description `UnitOfWork.init_database()` has a subtle sequencing issue: after running migrations with `_ensure_database_initialized(force=True)`, it conditionally disposes the engine: ```python def init_database(self) -> None: self._ensure_database_initialized(force=True) # For in-memory databases, we must keep the engine alive... # Don't dispose in-memory databases as each new engine is a new database if self._engine is not None and self.database_url != "sqlite:///:memory:": self._engine.dispose() self._engine = None self._session_factory = None ``` The issue is that `_ensure_database_initialized()` is called with `force=True`, which runs `MigrationRunner.init_or_upgrade()`. This function **may have opened a connection to the database** during migration. Then `init_database()` disposes the engine. But more critically: `_ensure_database_initialized()` accesses `self.engine` (via the `engine` property), which creates the engine if it doesn't exist. This engine is then disposed by `init_database()`. Any session factory created before `init_database()` was called would now reference the disposed engine, causing `ResourceClosedError` on the next query. Additionally, the `_ensure_database_initialized()` inside the `engine` property accesses `self._session_factory`, which calls `_ensure_database_initialized()` inside `engine`, which could trigger recursive initialization in certain execution paths. ### Evidence ```python @property def engine(self) -> Engine: """Get or create database engine.""" if self._engine is None: self._ensure_database_initialized() # Accesses engine during migration # ... creates engine ... return self._engine # type: ignore def init_database(self) -> None: # Uses _ensure_database_initialized(force=True) self._ensure_database_initialized(force=True) # Disposes the engine that was just created during migration! if self._engine is not None and self.database_url != "sqlite:///:memory:": self._engine.dispose() self._engine = None # Now any held references are broken self._session_factory = None # Same ``` A caller that does: ```python uow = UnitOfWork("sqlite:///mydb.sqlite") session = uow.session_factory() # Creates engine, creates session_factory uow.init_database() # Now disposes engine, session_factory references disposed engine! with uow.transaction() as ctx: ctx.projects.get_all() # ResourceClosedError! ``` ### Expected Behavior `init_database()` should either: 1. Not dispose the engine (let connection pooling handle cleanup), or 2. Clearly document that `init_database()` invalidates any previously created sessions and must be called before any session usage ### Actual Behavior If a session or session factory is obtained before `init_database()` is called, then `init_database()` is called explicitly, the engine is disposed and all existing sessions become invalid. ### Category data-flow ### TDD Note After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it. --- **Automated by CleverAgents Bot** Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor
Author
Owner

Verified — Data flow bug: UnitOfWork disposes engine after migrations losing in-memory data. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Data flow bug: UnitOfWork disposes engine after migrations losing in-memory data. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Data flow bug: UnitOfWork disposes engine after migrations losing in-memory data. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Data flow bug: UnitOfWork disposes engine after migrations losing in-memory data. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Data flow bug: UnitOfWork disposes engine after migrations losing in-memory data. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Data flow bug: UnitOfWork disposes engine after migrations losing in-memory data. MoSCoW: Should-have. Priority: Medium. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#7354
No description provided.