[AUTO-BUG-4] SqliteChangeSetStore.get_for_plan() returns all plan entries as a single ChangeSet with a random ID, losing changeset grouping #10426

Open
opened 2026-04-18 09:37:45 +00:00 by HAL9000 · 0 comments
Owner

Metadata

  • Branch: bugfix/mX-changeset-store-get-for-plan-grouping
  • Commit Message: fix(changeset): SqliteChangeSetStore.get_for_plan() groups entries by changeset_id and returns correct IDs

Background and Context

SqliteChangeSetStore.get_for_plan() in src/cleveragents/infrastructure/database/changeset_repository.py incorrectly merges all ChangeEntry records for a plan into a single SpecChangeSet with a randomly-generated changeset_id, losing the original changeset grouping and producing an ID that does not correspond to any real changeset in the database.

Summary

SqliteChangeSetStore.get_for_plan() in src/cleveragents/infrastructure/database/changeset_repository.py incorrectly merges all ChangeEntry records for a plan into a single SpecChangeSet with a randomly-generated changeset_id, losing the original changeset grouping and producing an ID that does not correspond to any real changeset in the database.

Bug Description

File: src/cleveragents/infrastructure/database/changeset_repository.py
Lines: ~350–365 (in SqliteChangeSetStore.get_for_plan())

Buggy Code

def get_for_plan(self, plan_id: str) -> list[SpecChangeSet]:
    """Return all ChangeSets associated with *plan_id*."""
    if not plan_id:
        return []

    entries = self._entry_repo.get_entries_for_plan(plan_id)
    if not entries:
        return []

    return [
        SpecChangeSet(plan_id=plan_id, entries=entries),  # BUG HERE
    ]

Root Cause

The method returns a single SpecChangeSet containing ALL entries for the plan, regardless of which changeset each entry belongs to. The SpecChangeSet model's changeset_id field has default_factory=_new_ulid (see src/cleveragents/domain/models/core/change.py line 281–284), so omitting it silently generates a new random ULID that does not correspond to any real changeset stored in the database.

Impact

When a plan has entries from multiple changesets (e.g., two separate tool invocation batches recorded via SqliteChangeSetStore.start() and SqliteChangeSetStore.record()):

  1. Data integrity loss: All entries are merged into one SpecChangeSet, losing changeset boundaries. Callers cannot distinguish which entries belong to which changeset.
  2. Wrong changeset ID: The returned SpecChangeSet.changeset_id is a newly-generated random ULID, not the actual changeset IDs stored in the database. Any code that uses this ID to look up a specific changeset (e.g., via SqliteChangeSetStore.get(changeset_id)) will fail to find it.
  3. Incorrect summaries: SqliteChangeSetStore.summarize() calls get() which calls get_entries_for_changeset() — but the random ID returned by get_for_plan() will not match any stored changeset, causing summarize() to return an empty dict {} for plans with multiple changesets.

Specification Alignment

Per the specification, ChangeSetStore.get_for_plan() must return one SpecChangeSet per changeset associated with the plan, preserving the changeset grouping and using the actual stored changeset_id values.

Fix

The method must group entries by their changeset_id (stored in the changeset_entries table column changeset_id) and return one SpecChangeSet per group. Since ChangeEntry domain objects do not carry changeset_id, the fix requires either:

  1. Adding changeset_id to the ChangeEntry domain model, or
  2. Querying the database at the ChangeSetEntryModel level and grouping by changeset_id before converting to domain objects.

Steps to Reproduce

from cleveragents.infrastructure.database.changeset_repository import SqliteChangeSetStore
from cleveragents.domain.models.core.change import ChangeEntry, ChangeOperation
from datetime import datetime, UTC

# Setup: create store with a real session factory
store = SqliteChangeSetStore(session_factory=...)

# Create two changesets for the same plan
plan_id = "01PLAN000000000000000000000"
cs1_id = store.start(plan_id)
cs2_id = store.start(plan_id)

# Record entries in each changeset
entry1 = ChangeEntry(entry_id="e1", plan_id=plan_id, resource_id="r1",
                     tool_name="write_file", operation=ChangeOperation.CREATE,
                     path="foo.py", timestamp=datetime.now(UTC))
entry2 = ChangeEntry(entry_id="e2", plan_id=plan_id, resource_id="r1",
                     tool_name="write_file", operation=ChangeOperation.MODIFY,
                     path="bar.py", timestamp=datetime.now(UTC))
store.record(cs1_id, entry1)
store.record(cs2_id, entry2)

# BUG: get_for_plan returns ONE changeset instead of TWO
result = store.get_for_plan(plan_id)
assert len(result) == 2, f"Expected 2 changesets, got {len(result)}"  # FAILS: got 1
assert result[0].changeset_id == cs1_id  # FAILS: random ID
assert result[1].changeset_id == cs2_id  # FAILS: random ID

Expected Behavior

SqliteChangeSetStore.get_for_plan(plan_id) should return one SpecChangeSet per distinct changeset stored for that plan, with each SpecChangeSet.changeset_id matching the original changeset ID returned by start().

Acceptance Criteria

  • SqliteChangeSetStore.get_for_plan(plan_id) returns one SpecChangeSet per distinct changeset for the plan
  • Each returned SpecChangeSet.changeset_id matches the actual changeset ID stored in the database
  • The TDD test from issue #10401 passes without @tdd_expected_fail

Subtasks

  • Add changeset_id field to ChangeEntry domain model (or add a separate query method that returns grouped results)
  • Fix SqliteChangeSetStore.get_for_plan() to group entries by changeset_id and return one SpecChangeSet per group with the correct changeset_id
  • Update ChangeSetEntryRepository.get_entries_for_plan() if needed to return changeset_id alongside entries
  • Remove @tdd_expected_fail tag from the TDD test (issue #10401) after the fix is implemented
  • Verify all existing tests pass

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.

Automated by CleverAgents Bot
Agent: new-issue-creator


Automated by CleverAgents Bot
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor
Tag: [AUTO-BUG-4]

## Metadata - **Branch:** `bugfix/mX-changeset-store-get-for-plan-grouping` - **Commit Message:** `fix(changeset): SqliteChangeSetStore.get_for_plan() groups entries by changeset_id and returns correct IDs` ## Background and Context `SqliteChangeSetStore.get_for_plan()` in `src/cleveragents/infrastructure/database/changeset_repository.py` incorrectly merges all `ChangeEntry` records for a plan into a single `SpecChangeSet` with a randomly-generated `changeset_id`, losing the original changeset grouping and producing an ID that does not correspond to any real changeset in the database. ## Summary `SqliteChangeSetStore.get_for_plan()` in `src/cleveragents/infrastructure/database/changeset_repository.py` incorrectly merges all `ChangeEntry` records for a plan into a single `SpecChangeSet` with a randomly-generated `changeset_id`, losing the original changeset grouping and producing an ID that does not correspond to any real changeset in the database. ## Bug Description **File:** `src/cleveragents/infrastructure/database/changeset_repository.py` **Lines:** ~350–365 (in `SqliteChangeSetStore.get_for_plan()`) ### Buggy Code ```python def get_for_plan(self, plan_id: str) -> list[SpecChangeSet]: """Return all ChangeSets associated with *plan_id*.""" if not plan_id: return [] entries = self._entry_repo.get_entries_for_plan(plan_id) if not entries: return [] return [ SpecChangeSet(plan_id=plan_id, entries=entries), # BUG HERE ] ``` ### Root Cause The method returns a single `SpecChangeSet` containing ALL entries for the plan, regardless of which changeset each entry belongs to. The `SpecChangeSet` model's `changeset_id` field has `default_factory=_new_ulid` (see `src/cleveragents/domain/models/core/change.py` line 281–284), so omitting it silently generates a new random ULID that does not correspond to any real changeset stored in the database. ### Impact When a plan has entries from multiple changesets (e.g., two separate tool invocation batches recorded via `SqliteChangeSetStore.start()` and `SqliteChangeSetStore.record()`): 1. **Data integrity loss:** All entries are merged into one `SpecChangeSet`, losing changeset boundaries. Callers cannot distinguish which entries belong to which changeset. 2. **Wrong changeset ID:** The returned `SpecChangeSet.changeset_id` is a newly-generated random ULID, not the actual changeset IDs stored in the database. Any code that uses this ID to look up a specific changeset (e.g., via `SqliteChangeSetStore.get(changeset_id)`) will fail to find it. 3. **Incorrect summaries:** `SqliteChangeSetStore.summarize()` calls `get()` which calls `get_entries_for_changeset()` — but the random ID returned by `get_for_plan()` will not match any stored changeset, causing `summarize()` to return an empty dict `{}` for plans with multiple changesets. ### Specification Alignment Per the specification, `ChangeSetStore.get_for_plan()` must return one `SpecChangeSet` per changeset associated with the plan, preserving the changeset grouping and using the actual stored `changeset_id` values. ### Fix The method must group entries by their `changeset_id` (stored in the `changeset_entries` table column `changeset_id`) and return one `SpecChangeSet` per group. Since `ChangeEntry` domain objects do not carry `changeset_id`, the fix requires either: 1. Adding `changeset_id` to the `ChangeEntry` domain model, or 2. Querying the database at the `ChangeSetEntryModel` level and grouping by `changeset_id` before converting to domain objects. ## Steps to Reproduce ```python from cleveragents.infrastructure.database.changeset_repository import SqliteChangeSetStore from cleveragents.domain.models.core.change import ChangeEntry, ChangeOperation from datetime import datetime, UTC # Setup: create store with a real session factory store = SqliteChangeSetStore(session_factory=...) # Create two changesets for the same plan plan_id = "01PLAN000000000000000000000" cs1_id = store.start(plan_id) cs2_id = store.start(plan_id) # Record entries in each changeset entry1 = ChangeEntry(entry_id="e1", plan_id=plan_id, resource_id="r1", tool_name="write_file", operation=ChangeOperation.CREATE, path="foo.py", timestamp=datetime.now(UTC)) entry2 = ChangeEntry(entry_id="e2", plan_id=plan_id, resource_id="r1", tool_name="write_file", operation=ChangeOperation.MODIFY, path="bar.py", timestamp=datetime.now(UTC)) store.record(cs1_id, entry1) store.record(cs2_id, entry2) # BUG: get_for_plan returns ONE changeset instead of TWO result = store.get_for_plan(plan_id) assert len(result) == 2, f"Expected 2 changesets, got {len(result)}" # FAILS: got 1 assert result[0].changeset_id == cs1_id # FAILS: random ID assert result[1].changeset_id == cs2_id # FAILS: random ID ``` ## Expected Behavior `SqliteChangeSetStore.get_for_plan(plan_id)` should return one `SpecChangeSet` per distinct changeset stored for that plan, with each `SpecChangeSet.changeset_id` matching the original changeset ID returned by `start()`. ## Acceptance Criteria - `SqliteChangeSetStore.get_for_plan(plan_id)` returns one `SpecChangeSet` per distinct changeset for the plan - Each returned `SpecChangeSet.changeset_id` matches the actual changeset ID stored in the database - The TDD test from issue #10401 passes without `@tdd_expected_fail` ## Subtasks - [ ] Add `changeset_id` field to `ChangeEntry` domain model (or add a separate query method that returns grouped results) - [ ] Fix `SqliteChangeSetStore.get_for_plan()` to group entries by `changeset_id` and return one `SpecChangeSet` per group with the correct `changeset_id` - [ ] Update `ChangeSetEntryRepository.get_entries_for_plan()` if needed to return `changeset_id` alongside entries - [ ] Remove `@tdd_expected_fail` tag from the TDD test (issue #10401) after the fix is implemented - [ ] Verify all existing tests pass ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done. --- **Automated by CleverAgents Bot** Agent: new-issue-creator --- Automated by CleverAgents Bot Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor Tag: [AUTO-BUG-4]
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#10426
No description provided.