BUG-HUNT: [data-integrity] CheckpointService.create_workspace_snapshot() mutates frozen Checkpoint metadata after database persistence #7522

Open
opened 2026-04-10 21:35:48 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: [data-integrity] — Checkpoint Metadata Mutated After Database Persistence

Severity Assessment

  • Impact: create_workspace_snapshot() first calls create_checkpoint(), which persists the checkpoint to the database (when a repository is wired), and then mutates checkpoint.metadata.extra to add diff_paths, diff_based, and diff_hash. These fields are never written back to the database, so the diff snapshot metadata is silently lost when the process restarts or the checkpoint is reloaded from persistence.
  • Likelihood: High — affects all callers of create_workspace_snapshot() when the database repository is configured (production mode).
  • Priority: Critical

Location

  • File: src/cleveragents/application/services/checkpoint_service.py
  • Function/Class: CheckpointService.create_workspace_snapshot
  • Lines: 403–467

Description

create_workspace_snapshot() creates a checkpoint via create_checkpoint(), which in production mode persists the checkpoint to the database via self._repository.create(checkpoint). The returned checkpoint object from _repository.create() reflects what was written to DB.

After creation and DB persistence, the method then mutates checkpoint.metadata.extra directly:

checkpoint = self.create_checkpoint(
    plan_id=plan_id,
    sandbox_ref=sandbox_ref,
    ...
)

# Store diff manifest in metadata extra — AFTER DB write!
checkpoint.metadata.extra["diff_paths"] = diff_paths
checkpoint.metadata.extra["diff_based"] = True
checkpoint.metadata.extra["diff_hash"] = hashlib.sha256(...).hexdigest()

These mutations are in-memory only. There is no subsequent self._repository.update(checkpoint) call. When the checkpoint is later retrieved from the database via get_checkpoint(checkpoint_id), the diff_paths, diff_based, and diff_hash fields will be absent.

This means rollback decisions based on diff metadata will operate on an empty or incorrect diff manifest after any service restart.
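The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `CheckpointMetadata`, `Checkpoint`, and `FakeRepository` are hypothetical stand-ins, and the repository is assumed to store an independent copy of the object, as a database row would. It also assumes the domain model is a frozen dataclass, which would explain why the nested dict can still be mutated.

```python
import copy
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CheckpointMetadata:
    # Hypothetical shape: frozen blocks attribute assignment, not dict mutation.
    extra: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Checkpoint:
    checkpoint_id: str
    metadata: CheckpointMetadata = field(default_factory=CheckpointMetadata)


class FakeRepository:
    """Hypothetical repository that stores a deep copy, like a persisted row."""

    def __init__(self):
        self._rows = {}

    def create(self, checkpoint):
        self._rows[checkpoint.checkpoint_id] = copy.deepcopy(checkpoint)
        return checkpoint

    def get(self, checkpoint_id):
        return copy.deepcopy(self._rows[checkpoint_id])


repo = FakeRepository()
checkpoint = repo.create(Checkpoint("cp-1"))  # the write happens here

# Same ordering as the reported bug: mutate after the write.
diff_paths = ["src/a.py", "src/b.py"]
checkpoint.metadata.extra["diff_paths"] = diff_paths
checkpoint.metadata.extra["diff_based"] = True
checkpoint.metadata.extra["diff_hash"] = hashlib.sha256(
    "|".join(sorted(diff_paths)).encode()
).hexdigest()

reloaded = repo.get("cp-1")
in_memory_keys = sorted(checkpoint.metadata.extra)
reloaded_keys = sorted(reloaded.metadata.extra)
```

In this sketch the in-memory object carries all three keys while the reloaded copy carries none, which matches the behavior reported for get_checkpoint(checkpoint_id).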

Evidence

# checkpoint_service.py lines 435-455
diff_paths = self._compute_diff_snapshot(plan_id, sandbox_ref)
diff_size = sum(len(p.encode()) for p in diff_paths)

checkpoint = self.create_checkpoint(  # <-- persists to DB here
    plan_id=plan_id,
    sandbox_ref=sandbox_ref,
    ...
)

# Store diff manifest in metadata extra  <-- mutates in-memory only, NOT saved to DB
checkpoint.metadata.extra["diff_paths"] = diff_paths
checkpoint.metadata.extra["diff_based"] = True
checkpoint.metadata.extra["diff_hash"] = hashlib.sha256(
    "|".join(sorted(diff_paths)).encode()
).hexdigest()
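
For reference, the diff_hash expression in the evidence can be isolated as a small helper. This is an illustrative sketch only; `diff_manifest_hash` is a hypothetical name and does not appear in the service. Because the paths are sorted before joining, the hash does not depend on the order in which the paths were discovered.

```python
import hashlib


def diff_manifest_hash(diff_paths):
    """Hash the sorted, pipe-joined path list, as in the quoted expression."""
    return hashlib.sha256("|".join(sorted(diff_paths)).encode()).hexdigest()


# Path order does not affect the result because of the sort.
order_independent = (
    diff_manifest_hash(["src/b.py", "src/a.py"])
    == diff_manifest_hash(["src/a.py", "src/b.py"])
)

# An empty manifest hashes the empty byte string.
empty_hash = diff_manifest_hash([])
```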

Expected Behavior

The diff metadata should be included in the checkpoint creation payload so it is persisted to the database in the initial create_checkpoint() call.

Actual Behavior

Diff metadata is only set in-memory after the DB write. Any subsequent retrieval of the checkpoint from the database will lack diff_paths, diff_based, and diff_hash fields.

Suggested Fix

Either pass the diff metadata into the initial create_checkpoint() call so that it is already in metadata.extra when the row is written, or persist the mutation afterwards through the repository. The first option needs only one database write and is the cleanest fix:

# Compute diff before creating the checkpoint
diff_paths = self._compute_diff_snapshot(plan_id, sandbox_ref)
diff_hash = hashlib.sha256("|".join(sorted(diff_paths)).encode()).hexdigest()
diff_size = sum(len(p.encode()) for p in diff_paths)

# Include diff metadata in the initial creation
checkpoint = self.create_checkpoint(
    plan_id=plan_id,
    sandbox_ref=sandbox_ref,
    ...
    # Pass diff_paths, diff_based, and diff_hash here, through a new parameter or by pre-populating metadata.extra
)

Alternatively, add an update() method to CheckpointRepository and call it after the mutation.
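
The first option might look like the sketch below. It is written under stated assumptions rather than against the real code: the `extra` keyword on create_checkpoint() is a hypothetical new parameter, `SketchCheckpointService` and `InMemoryRepository` are stand-ins, and the real method takes more arguments (plan_id, sandbox_ref, and others) that are omitted here.

```python
import copy
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CheckpointMetadata:
    extra: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Checkpoint:
    checkpoint_id: str
    metadata: CheckpointMetadata = field(default_factory=CheckpointMetadata)


class InMemoryRepository:
    """Hypothetical repository that stores independent copies."""

    def __init__(self):
        self._rows = {}

    def create(self, checkpoint):
        self._rows[checkpoint.checkpoint_id] = copy.deepcopy(checkpoint)
        return checkpoint

    def get(self, checkpoint_id):
        return copy.deepcopy(self._rows[checkpoint_id])


class SketchCheckpointService:
    def __init__(self, repository):
        self._repository = repository

    def create_checkpoint(self, checkpoint_id, extra=None):
        # `extra` is the assumed new parameter; it is copied so callers
        # cannot mutate the persisted metadata through their own dict.
        metadata = CheckpointMetadata(extra=dict(extra or {}))
        return self._repository.create(Checkpoint(checkpoint_id, metadata))

    def create_workspace_snapshot(self, checkpoint_id, diff_paths):
        # Compute all diff metadata before the single database write.
        diff_hash = hashlib.sha256("|".join(sorted(diff_paths)).encode()).hexdigest()
        extra = {"diff_paths": list(diff_paths), "diff_based": True, "diff_hash": diff_hash}
        return self.create_checkpoint(checkpoint_id, extra=extra)


repo = InMemoryRepository()
service = SketchCheckpointService(repo)
created = service.create_workspace_snapshot("cp-1", ["src/b.py", "src/a.py"])
reloaded = repo.get("cp-1")
```

With this ordering the reloaded checkpoint carries the same metadata as the returned object, and nothing is mutated after persistence. The update() alternative would also work, but it costs a second write and leaves a window in which the stored row lacks the diff fields.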

Category

data-integrity

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD with @tdd_expected_fail tags.


Automated by CleverAgents Bot
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor

HAL9000 added this to the v3.3.0 milestone 2026-04-10 23:07:13 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Critical — CheckpointService mutates frozen domain model. Checkpoint rollback is a core M4 feature.
  • Milestone: v3.3.0 (M4: Corrections + Subplans) — Checkpoint creation/rollback is core to M4 acceptance criteria
  • Story Points: 3 (M) — Bug fix with clear reproduction
  • MoSCoW: Must Have — Checkpoint functionality must work correctly for M4 delivery

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7522