UAT: create_workspace_snapshot() diff metadata not persisted to database — lost on retrieval #4019

Open
opened 2026-04-06 08:40:38 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/checkpoint-diff-metadata-persistence
  • Commit Message: fix(checkpoint): persist diff metadata in create_workspace_snapshot() before returning
  • Milestone: None (Backlog)
  • Parent Epic: #3374

Background and Context

CheckpointService.create_workspace_snapshot() is designed to create a diff-based checkpoint that stores which files changed since the last checkpoint. This diff metadata (diff_paths, diff_based, diff_hash) is stored in checkpoint.metadata.extra.

Current Behavior (Bug)

In src/cleveragents/application/services/checkpoint_service.py, the create_workspace_snapshot() method:

  1. Calls self.create_checkpoint(...), which writes the checkpoint to the repository (DB) with an empty metadata.extra ({})
  2. Then modifies the returned in-memory object: checkpoint.metadata.extra["diff_paths"] = diff_paths
  3. Returns the modified in-memory object

CheckpointRepository has no update() method, and nothing saves the checkpoint a second time, so the diff metadata never reaches the database. A later get_checkpoint() call returns a checkpoint whose metadata.extra is empty: the diff metadata is lost.
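The failure mode can be reproduced in isolation. The sketch below is not the project's code: `Checkpoint`, `Metadata`, `JsonBackedRepository`, and `buggy_snapshot` are hypothetical stand-ins, and JSON serialization plays the role of the database write.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Metadata:
    extra: dict = field(default_factory=dict)


@dataclass
class Checkpoint:
    checkpoint_id: str
    metadata: Metadata = field(default_factory=Metadata)


class JsonBackedRepository:
    """Hypothetical stand-in for a DB-backed repository: save() serializes."""

    def __init__(self) -> None:
        self._rows: dict[str, str] = {}

    def save(self, checkpoint: Checkpoint) -> None:
        # Only the state at call time is written, as with a real INSERT.
        self._rows[checkpoint.checkpoint_id] = json.dumps(checkpoint.metadata.extra)

    def get(self, checkpoint_id: str) -> Checkpoint:
        extra = json.loads(self._rows[checkpoint_id])
        return Checkpoint(checkpoint_id, Metadata(extra))


def buggy_snapshot(repo: JsonBackedRepository, diff_paths: list[str]) -> Checkpoint:
    checkpoint = Checkpoint("cp-1")
    repo.save(checkpoint)                                 # persisted with extra == {}
    checkpoint.metadata.extra["diff_paths"] = diff_paths  # in-memory only
    checkpoint.metadata.extra["diff_based"] = True        # in-memory only
    return checkpoint


repo = JsonBackedRepository()
returned = buggy_snapshot(repo, ["a.py"])
retrieved = repo.get(returned.checkpoint_id)

print(returned.metadata.extra)   # {'diff_paths': ['a.py'], 'diff_based': True}
print(retrieved.metadata.extra)  # {}
```

The caller sees a fully populated object, which is why the loss only shows up after a round trip through storage.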

Code Location

# src/cleveragents/application/services/checkpoint_service.py, lines 438-454
checkpoint = self.create_checkpoint(  # ← stores to DB with empty extra
    plan_id=plan_id,
    sandbox_ref=sandbox_ref,
    ...
)

# Store diff manifest in metadata extra
checkpoint.metadata.extra["diff_paths"] = diff_paths  # ← modifies in-memory only!
checkpoint.metadata.extra["diff_based"] = True         # ← NOT persisted to DB
checkpoint.metadata.extra["diff_hash"] = hashlib.sha256(...).hexdigest()  # ← NOT persisted

Expected Behavior

The diff metadata should be included in the checkpoint when it is first stored to the repository, so that subsequent retrievals return the complete checkpoint with diff metadata intact.
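One way to get this behavior is to assemble the manifest first and hand it to the single call that persists the checkpoint. The sketch below illustrates that ordering only. `build_diff_extra` and `create_snapshot` are hypothetical names, the shape of the `metadata` argument is an assumption (the real create_checkpoint() signature still has to be checked), and the hash input is illustrative because the source elides what diff_hash actually covers.

```python
import hashlib
from typing import Any, Callable


def build_diff_extra(diff_paths: list[str]) -> dict[str, Any]:
    """Assemble the diff manifest before anything is written.

    ASSUMPTION: hashing the sorted, newline-joined paths is illustrative;
    the real diff_hash input is elided in the issue.
    """
    joined = "\n".join(sorted(diff_paths))
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return {"diff_paths": list(diff_paths), "diff_based": True, "diff_hash": digest}


def create_snapshot(
    create_checkpoint: Callable[..., Any], diff_paths: list[str], **kwargs: Any
) -> Any:
    """Hypothetical wrapper: the manifest travels with the only persisting call.

    ASSUMPTION: create_checkpoint accepts a `metadata` keyword shaped like
    {"extra": {...}}; adapt this to the real signature.
    """
    extra = build_diff_extra(diff_paths)
    return create_checkpoint(metadata={"extra": extra}, **kwargs)


# Usage with a recording fake in place of the real service method.
saved: list[dict[str, Any]] = []


def fake_create_checkpoint(**kwargs: Any) -> dict[str, Any]:
    saved.append(kwargs)
    return kwargs


result = create_snapshot(fake_create_checkpoint, ["b.py", "a.py"], plan_id="plan-1")
```

Because nothing is mutated after the write, the stored row and the returned object cannot diverge.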

Impact

  • CorrectionService and any code that retrieves workspace snapshots from the DB will see empty metadata.extra — the diff manifest is lost
  • The diff-based storage feature is effectively non-functional when using the database-backed CheckpointRepository
  • The bug is hidden in unit tests because the in-memory store returns the same object reference (not a copy), so the modification appears to persist
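The last point can be shown with two toy stores. This is a sketch with hypothetical class names, not the project's test doubles: one store keeps the caller's reference, the other copies on write and on read the way a database round trip effectively does.

```python
import copy
from typing import Any


class SharingInMemoryStore:
    """Keeps the caller's object, so later mutations appear to be 'persisted'."""

    def __init__(self) -> None:
        self._items: dict[str, Any] = {}

    def save(self, key: str, obj: Any) -> None:
        self._items[key] = obj

    def get(self, key: str) -> Any:
        return self._items[key]


class CopyingInMemoryStore(SharingInMemoryStore):
    """Copies on save and on get, mimicking serialization to a database."""

    def save(self, key: str, obj: Any) -> None:
        self._items[key] = copy.deepcopy(obj)

    def get(self, key: str) -> Any:
        return copy.deepcopy(self._items[key])


def mutate_after_save(store: SharingInMemoryStore) -> dict:
    record = {"extra": {}}
    store.save("cp-1", record)
    record["extra"]["diff_based"] = True  # late mutation, as in the bug
    return store.get("cp-1")["extra"]


print(mutate_after_save(SharingInMemoryStore()))  # {'diff_based': True}
print(mutate_after_save(CopyingInMemoryStore()))  # {}
```

A copying double of this kind would let unit tests catch the bug without a real database.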

Steps to Reproduce

  1. Use CheckpointService with a real CheckpointRepository (database-backed)
  2. Call create_workspace_snapshot() with a plan that has a prior checkpoint
  3. Retrieve the checkpoint via get_checkpoint(snapshot.checkpoint_id)
  4. Observe that retrieved.metadata.extra is empty ({}) while the returned snapshot had diff_based=True

Subtasks

  • Investigate create_checkpoint() signature and CheckpointRepository interface
  • Either compute the diff metadata before calling create_checkpoint() and pass it via the metadata parameter at creation time, or add update_metadata_extra() to CheckpointRepository
  • Update create_workspace_snapshot() to ensure diff metadata is persisted atomically on first write
  • Write/update BDD feature scenarios covering the persistence of diff metadata through a DB-backed repository
  • Verify fix with integration test using a real CheckpointRepository (no mocks)
  • Confirm retrieved.metadata.extra contains diff_paths, diff_based, and diff_hash after retrieval
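The final confirmation step could be expressed as a small helper that a test calls on the retrieved checkpoint's metadata.extra. This is a sketch; `REQUIRED_DIFF_KEYS`, `missing_diff_keys`, and `assert_diff_metadata_persisted` are hypothetical names, and only the three key names come from this issue.

```python
from typing import Any, Mapping

# Key names taken from the issue text.
REQUIRED_DIFF_KEYS = ("diff_paths", "diff_based", "diff_hash")


def missing_diff_keys(extra: Mapping[str, Any]) -> list[str]:
    """Return the required diff keys that are absent from `extra`."""
    return [key for key in REQUIRED_DIFF_KEYS if key not in extra]


def assert_diff_metadata_persisted(extra: Mapping[str, Any]) -> None:
    """Raise AssertionError if the retrieved metadata lacks the diff manifest."""
    missing = missing_diff_keys(extra)
    if missing:
        raise AssertionError(f"diff metadata not persisted; missing keys: {missing}")
    if extra["diff_based"] is not True:
        raise AssertionError("diff_based should be True for a workspace snapshot")


print(missing_diff_keys({}))  # ['diff_paths', 'diff_based', 'diff_hash']
```

In an integration test the argument would be the extra dict of the checkpoint returned by get_checkpoint(), never the object returned by create_workspace_snapshot() itself.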

Definition of Done

  • create_workspace_snapshot() persists diff metadata to the database at creation time
  • A checkpoint retrieved via get_checkpoint() after create_workspace_snapshot() returns metadata.extra with diff_paths, diff_based, and diff_hash populated
  • BDD unit test scenarios added/updated in features/ covering the fix
  • Integration test in robot/ verifies end-to-end persistence with a real repository
  • All nox stages pass
  • Coverage >= 97%

Backlog note: This issue was discovered during autonomous operation
on milestone v3.6.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-09 03:11:59 +00:00
Reference
cleveragents/cleveragents-core#4019