BUG-HUNT: [data-integrity] SqliteChangeSetStore._plan_map is instance-level state — changeset IDs lost across instances #7731

Open
opened 2026-04-12 03:21:34 +00:00 by HAL9000 · 3 comments
Owner

Bug Report: Data Integrity — In-Memory _plan_map Lost Across SqliteChangeSetStore Instances

Severity Assessment

  • Impact: get() returns None for a changeset that has no entries yet when it is called from a different SqliteChangeSetStore instance than the one that called start(); the changeset appears not to exist even though start() issued a valid ID for it
  • Likelihood: High — in any multi-instance or multi-request scenario (e.g., web server with multiple workers, or a new service instantiation after a restart), the in-memory map is empty
  • Priority: High

Location

  • File: src/cleveragents/infrastructure/database/changeset_repository.py
  • Class: SqliteChangeSetStore
  • Lines: 407–465

Description

SqliteChangeSetStore stores the mapping of changeset_id → plan_id in an instance-level dictionary, self._plan_map. When start() is called, a new changeset ID is generated and the plan_id is recorded only in this in-memory dict. On a different instance (or after a restart), self._plan_map is empty. If the changeset has no entries in the database yet, get() falls back to the map, cannot find the plan_id, and returns None:

def get(self, changeset_id: str) -> SpecChangeSet | None:
    entries = self._entry_repo.get_entries_for_changeset(changeset_id)
    if not entries:
        plan_id = self._plan_map.get(changeset_id, "")  # empty on new instance!
        if not plan_id:
            return None  # returns None even if changeset was started elsewhere

Evidence

# changeset_repository.py lines 407-415
def start(self, plan_id: str) -> str:
    if not plan_id:
        raise ValueError("plan_id must not be empty")
    changeset_id = str(ULID())
    self._plan_map[changeset_id] = plan_id  # stored ONLY in memory
    return changeset_id

# changeset_repository.py lines 427-448
def get(self, changeset_id: str) -> SpecChangeSet | None:
    ...
    entries = self._entry_repo.get_entries_for_changeset(changeset_id)
    if not entries:
        plan_id = self._plan_map.get(changeset_id, "")  # fails on new instance
        if not plan_id:
            return None  # BUG: returns None for valid empty changesets
        return SpecChangeSet(changeset_id=changeset_id, plan_id=plan_id)

Expected Behavior

A changeset ID created by start() should be retrievable by get() from any instance. That requires the changeset's plan_id to be persisted to the database rather than held only in memory, so that an empty changeset can still be resolved.

Actual Behavior

If get() is called from a different SqliteChangeSetStore instance than the one that called start(), and the changeset has no entries yet (empty changeset), the method returns None instead of returning an empty SpecChangeSet. The plan_id mapping is lost because it only lives in self._plan_map.
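
The behavior described above can be reproduced with a small stand-alone sketch. Everything here is a hypothetical stand-in: ToyChangeSetStore mirrors only the start()/get() logic quoted in this report, FakeEntryRepo plays the role of the shared database, and uuid4 replaces ULID() so that the example needs no third-party package.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass
class SpecChangeSet:
    """Hypothetical stand-in for the real SpecChangeSet."""
    changeset_id: str
    plan_id: str


class FakeEntryRepo:
    """Stand-in for the database-backed entry repository (shared storage)."""

    def __init__(self) -> None:
        self._rows: dict[str, list] = {}

    def get_entries_for_changeset(self, changeset_id: str) -> list:
        return self._rows.get(changeset_id, [])


class ToyChangeSetStore:
    """Mirrors only the start()/get() logic quoted in this report."""

    def __init__(self, entry_repo: FakeEntryRepo) -> None:
        self._entry_repo = entry_repo
        self._plan_map: dict[str, str] = {}  # per-instance, never persisted

    def start(self, plan_id: str) -> str:
        if not plan_id:
            raise ValueError("plan_id must not be empty")
        changeset_id = uuid.uuid4().hex  # assumption: the real code uses ULID()
        self._plan_map[changeset_id] = plan_id
        return changeset_id

    def get(self, changeset_id: str) -> SpecChangeSet | None:
        entries = self._entry_repo.get_entries_for_changeset(changeset_id)
        if not entries:
            plan_id = self._plan_map.get(changeset_id, "")
            if not plan_id:
                return None
            return SpecChangeSet(changeset_id=changeset_id, plan_id=plan_id)
        # The non-empty path is elided in the report and is not modelled here.
        raise NotImplementedError


# Two store instances share the same "database" but not the in-memory map.
shared_repo = FakeEntryRepo()
writer = ToyChangeSetStore(shared_repo)
reader = ToyChangeSetStore(shared_repo)

cid = writer.start("plan-1")
same_instance = writer.get(cid)   # resolved through writer._plan_map
other_instance = reader.get(cid)  # reader._plan_map is empty
```

In this sketch same_instance is an empty SpecChangeSet for "plan-1", while other_instance is None, which matches the failure mode reported here.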

Suggested Fix

Either:

  1. Persist a changeset header row to the database in start() (preferred — adds a changesets table), or
  2. Derive the plan_id from the entries themselves when get() is called, handling the empty-changeset case by querying a persisted header, or
  3. Document clearly that SqliteChangeSetStore is not safe across instances and require callers to always use the same instance

Option 1 is the correct fix for a database-backed store:

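def start(self, plan_id: str) -> str:
    if not plan_id:
        raise ValueError("plan_id must not be empty")  # keep the existing validation
    changeset_id = str(ULID())
    # _header_repo is the proposed header repository; it does not exist yet
    self._header_repo.save_header(changeset_id, plan_id)  # persist to DB
    return changeset_id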
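
As a rough illustration of what Option 1 might involve, the sketch below persists a header row with the standard-library sqlite3 module. The class name SqliteHeaderRepo, the changesets table layout, and the get_plan_id() helper are assumptions made for this example; only save_header() is named in the suggestion above, and the real repository may be built on a different data-access layer.

```python
from __future__ import annotations

import os
import sqlite3
import tempfile
import uuid


class SqliteHeaderRepo:
    """Hypothetical header repository; table and method names are illustrative."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS changesets ("
            " changeset_id TEXT PRIMARY KEY,"
            " plan_id TEXT NOT NULL)"
        )
        self._conn.commit()

    def save_header(self, changeset_id: str, plan_id: str) -> None:
        # Using the connection as a context manager commits on success
        # and rolls back if the INSERT raises.
        with self._conn:
            self._conn.execute(
                "INSERT INTO changesets (changeset_id, plan_id) VALUES (?, ?)",
                (changeset_id, plan_id),
            )

    def get_plan_id(self, changeset_id: str) -> str | None:
        row = self._conn.execute(
            "SELECT plan_id FROM changesets WHERE changeset_id = ?",
            (changeset_id,),
        ).fetchone()
        return row[0] if row else None


# Two independent connections to the same file stand in for two store instances.
db_path = os.path.join(tempfile.mkdtemp(), "changesets.db")
writer = SqliteHeaderRepo(sqlite3.connect(db_path))
reader = SqliteHeaderRepo(sqlite3.connect(db_path))

cid = uuid.uuid4().hex  # assumption: the real code uses ULID()
writer.save_header(cid, "plan-1")

found = reader.get_plan_id(cid)          # visible from the second connection
missing = reader.get_plan_id("unknown")  # no header row for this ID
```

With a header row in place, get() could look up the plan_id through something like get_plan_id() when a changeset has no entries, instead of consulting self._plan_map, and return None only when no header exists.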

Category

data-integrity

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-12 03:41:19 +00:00
Author
Owner

Verified — Data integrity bug: SqliteChangeSetStore._plan_map is instance-level — changeset IDs lost across instances. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner

Verified — Data integrity bug: SqliteChangeSetStore._plan_map is instance-level — changeset IDs lost across instances. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner

Verified — Data integrity bug: SqliteChangeSetStore._plan_map is instance-level — changeset IDs lost across instances. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7731