feat(sandbox): add checkpoint and rollback hooks #462

Merged
freemo merged 1 commit from feature/m4-checkpoints into master 2026-02-28 00:49:46 +00:00

Summary

Adds sandbox checkpoint and rollback hooks, integrated into the plan execute/apply flows, with metadata capture. Checkpoints preserve sandbox state at key points in a plan's lifecycle, and rollback restores the sandbox to a specific checkpoint while preserving that checkpoint's metadata.

Changes

SandboxCheckpoint Model & CheckpointManager

  • SandboxCheckpoint frozen Pydantic model with ULID-based checkpoint_id, sandbox_id, plan_id, phase, created_at, metadata, snapshot_path
  • CheckpointManager class managing checkpoint lifecycle: create_checkpoint, rollback_to, list_checkpoints, delete_checkpoint
  • Uses filesystem-based snapshots for state preservation
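
To make the shape of the model and manager concrete, here is a minimal sketch rather than the code merged in this PR. It assumes a frozen stdlib dataclass as a stand-in for the frozen Pydantic model, `uuid4` as a stand-in for ULID generation, and `shutil.copytree` as the snapshot mechanism. The method signatures, the `snapshot_root` argument, and the in-memory index are hypothetical; only the class, field, and method names come from the description above.

```python
from __future__ import annotations

import shutil
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class SandboxCheckpoint:
    """Immutable checkpoint record (stdlib stand-in for the frozen Pydantic model)."""

    checkpoint_id: str
    sandbox_id: str
    plan_id: str
    phase: str
    created_at: datetime
    snapshot_path: Path
    metadata: dict[str, Any] = field(default_factory=dict)


class CheckpointManager:
    """Hypothetical manager that snapshots a sandbox directory by copying it."""

    def __init__(self, snapshot_root: Path) -> None:
        self._root = snapshot_root
        self._checkpoints: dict[str, SandboxCheckpoint] = {}

    def create_checkpoint(
        self,
        sandbox_dir: Path,
        sandbox_id: str,
        plan_id: str,
        phase: str,
        metadata: dict[str, Any] | None = None,
    ) -> SandboxCheckpoint:
        # Assumption: the PR uses ULIDs; uuid4 keeps this sketch stdlib-only.
        checkpoint_id = uuid.uuid4().hex
        snapshot_path = self._root / checkpoint_id
        # copytree creates the destination (and missing parents) itself.
        shutil.copytree(sandbox_dir, snapshot_path)
        checkpoint = SandboxCheckpoint(
            checkpoint_id=checkpoint_id,
            sandbox_id=sandbox_id,
            plan_id=plan_id,
            phase=phase,
            created_at=datetime.now(timezone.utc),
            snapshot_path=snapshot_path,
            metadata=dict(metadata or {}),
        )
        self._checkpoints[checkpoint_id] = checkpoint
        return checkpoint

    def rollback_to(self, checkpoint_id: str, sandbox_dir: Path) -> SandboxCheckpoint:
        # Replace the live sandbox with the snapshot; the record is left untouched.
        checkpoint = self._checkpoints[checkpoint_id]
        shutil.rmtree(sandbox_dir)
        shutil.copytree(checkpoint.snapshot_path, sandbox_dir)
        return checkpoint

    def list_checkpoints(self, sandbox_id: str) -> list[SandboxCheckpoint]:
        return [c for c in self._checkpoints.values() if c.sandbox_id == sandbox_id]

    def delete_checkpoint(self, checkpoint_id: str) -> None:
        checkpoint = self._checkpoints.pop(checkpoint_id)
        shutil.rmtree(checkpoint.snapshot_path, ignore_errors=True)
```

Because the record is frozen, rolling back cannot mutate a checkpoint's metadata, which is one plausible way to get the "metadata preserved" behavior described in the summary.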

Plan Execute/Apply Integration

  • Pre-execute checkpoint creation in plan_executor.py
  • Post-execute checkpoint on success
  • Pre-apply checkpoint in plan_apply_service.py
  • Automatic rollback attempt on execute/apply failure
  • Optional injection: hooks are silently skipped if no CheckpointManager is provided
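
The hook ordering listed above can be sketched as a small wrapper. This is an illustration under assumptions, not the code in `plan_executor.py` or `plan_apply_service.py`: the `CheckpointHooks` protocol, the `run_with_checkpoints` function, its parameters, and the phase labels are all hypothetical names introduced here.

```python
from __future__ import annotations

import logging
from typing import Callable, Optional, Protocol, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


class CheckpointHooks(Protocol):
    """Hypothetical minimal interface the wrapper needs from a manager."""

    def create_checkpoint(self, phase: str) -> str: ...

    def rollback_to(self, checkpoint_id: str) -> None: ...


def run_with_checkpoints(
    action: Callable[[], T],
    manager: Optional[CheckpointHooks] = None,
    phase: str = "execute",
    checkpoint_on_success: bool = True,
) -> T:
    # Optional injection: with no manager, the hooks are skipped entirely.
    if manager is None:
        return action()

    pre_id = manager.create_checkpoint(f"pre_{phase}")
    try:
        result = action()
    except Exception:
        # Rollback is only an attempt: a failing rollback is logged and
        # must not mask the original error, which is re-raised below.
        try:
            manager.rollback_to(pre_id)
        except Exception:
            logger.exception("rollback to checkpoint %s failed", pre_id)
        raise

    if checkpoint_on_success:
        manager.create_checkpoint(f"post_{phase}")
    return result
```

In this sketch an execute flow would use the defaults, while an apply flow would pass `phase="apply"` and `checkpoint_on_success=False`, since the list above mentions only a pre-apply checkpoint.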

Documentation

  • New docs/reference/sandbox.md with checkpoint lifecycle diagram and rollback behavior

Tests

  • Behave: 12 scenarios in features/sandbox_checkpoints.feature (all pass)
  • Robot: 5 test cases in robot/sandbox_checkpoint_smoke.robot (all pass)
  • ASV: 3 benchmarks in benchmarks/sandbox_checkpoint_bench.py

Quality Gates

  • lint: PASS
  • typecheck: PASS (0 errors, 0 warnings)
  • unit_tests: PASS
  • integration_tests: PASS

Closes #183

freemo force-pushed feature/m4-checkpoints from 0f4eb80678
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 13s
CI / quality (pull_request) Successful in 19s
CI / build (pull_request) Successful in 20s
CI / typecheck (pull_request) Successful in 31s
CI / security (pull_request) Successful in 43s
CI / integration_tests (pull_request) Successful in 2m47s
CI / unit_tests (pull_request) Successful in 19m8s
CI / docker (pull_request) Successful in 39s
CI / benchmark-regression (pull_request) Successful in 19m44s
CI / coverage (pull_request) Failing after 46m17s
to 63e6d23946
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 27s
CI / typecheck (pull_request) Successful in 33s
CI / security (pull_request) Successful in 50s
CI / integration_tests (pull_request) Successful in 2m45s
CI / unit_tests (pull_request) Failing after 12m46s
CI / docker (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Successful in 25m47s
CI / coverage (pull_request) Successful in 43m12s
2026-02-27 19:35:46 +00:00
freemo scheduled this pull request to auto merge when all checks succeed 2026-02-27 19:35:59 +00:00
freemo force-pushed feature/m4-checkpoints from 63e6d23946
to 8fc08d1069
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 30s
CI / security (pull_request) Successful in 50s
CI / typecheck (pull_request) Successful in 59s
CI / integration_tests (pull_request) Successful in 2m48s
CI / unit_tests (pull_request) Successful in 16m8s
CI / docker (pull_request) Successful in 39s
CI / benchmark-regression (pull_request) Successful in 20m59s
CI / coverage (pull_request) Has been cancelled
2026-02-27 21:09:00 +00:00
freemo force-pushed feature/m4-checkpoints from 8fc08d1069
to 8096744b13
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / quality (pull_request) Successful in 17s
CI / build (pull_request) Successful in 17s
CI / typecheck (pull_request) Successful in 36s
CI / security (pull_request) Successful in 49s
CI / integration_tests (pull_request) Successful in 4m30s
CI / unit_tests (pull_request) Successful in 22m6s
CI / docker (pull_request) Successful in 1m3s
CI / benchmark-regression (pull_request) Successful in 26m58s
CI / coverage (pull_request) Successful in 1h33m32s
2026-02-27 21:34:07 +00:00
freemo force-pushed feature/m4-checkpoints from 8096744b13
to 0ca1303927
Some checks failed
CI / lint (pull_request) Successful in 23s
CI / quality (pull_request) Successful in 29s
CI / security (pull_request) Successful in 54s
CI / typecheck (pull_request) Successful in 59s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 25s
CI / integration_tests (pull_request) Successful in 4m33s
CI / benchmark-regression (pull_request) Successful in 22m32s
CI / unit_tests (pull_request) Successful in 30m28s
CI / docker (pull_request) Successful in 39s
CI / coverage (pull_request) Successful in 1h39m42s
CI / lint (push) Successful in 12s
CI / quality (push) Successful in 20s
CI / build (push) Successful in 24s
CI / security (push) Successful in 28s
CI / typecheck (push) Successful in 1m0s
CI / benchmark-regression (push) Has been skipped
CI / integration_tests (push) Successful in 4m35s
CI / benchmark-publish (push) Successful in 13m36s
CI / coverage (push) Has been cancelled
CI / unit_tests (push) Has been cancelled
CI / docker (push) Has been cancelled
2026-02-27 23:08:57 +00:00
freemo merged commit 0ca1303927 into master 2026-02-28 00:49:46 +00:00
freemo deleted branch feature/m4-checkpoints 2026-02-28 00:49:46 +00:00
Reference
cleveragents/cleveragents-core!462