TDD: checkpoint rollback is simulated — does not execute real git reset (bug #822) #839

Closed
opened 2026-03-13 21:15:19 +00:00 by freemo · 4 comments
Owner

Background and Context

Bug #822 (bug(plan): checkpoint rollback is simulated — does not execute real git reset) identifies a defect where CheckpointService.rollback() returns a successful RollbackResult but does not actually execute git reset --hard (or equivalent) against the repository. The method constructs a result object with checkpoint metadata but skips the actual filesystem/git operation. Per the mandatory TDD bug-fix workflow (CONTRIBUTING.md §Bug Fix Workflow), a failing test must be written before the fix is implemented.

This issue tracks the TDD test-writing phase for bug #822.

Current Behavior

No test exists that verifies the file system state actually changes after a checkpoint rollback operation.

Expected Behavior

A Behave BDD scenario and/or Robot Framework test tagged with @tdd_expected_fail, @tdd_bug, and @tdd_bug_822 should exist that:

  1. Creates a checkpoint of the current state
  2. Modifies files after the checkpoint
  3. Invokes CheckpointService.rollback() targeting the checkpoint
  4. Asserts the file system state matches the checkpoint state (files reverted)
  5. Currently fails (proving the bug exists), but passes CI via the @tdd_expected_fail inversion

Acceptance Criteria

  • At least one Behave scenario in features/tdd_checkpoint_real_rollback.feature
  • Scenario tagged with @tdd_expected_fail @tdd_bug @tdd_bug_822
  • At least one Robot Framework test in robot/tdd_checkpoint_real_rollback.robot
  • Test tagged with tdd_expected_fail, tdd_bug, tdd_bug_822
  • Full test suite passes (nox -s unit_tests integration_tests)
  • ruff check and ruff format pass

Metadata

  • Commit message: test(plan): TDD failing tests for checkpoint real rollback (bug #822)
  • Branch name: tdd/m6-checkpoint-real-rollback
  • Type: Testing
  • Priority: Critical
  • MoSCoW: Must have
  • Points: 2
  • Milestone: v3.5.0

Subtasks

  • Write Behave scenario exercising checkpoint → modify → rollback → verify file state
  • Write Robot Framework test exercising same flow with observable file changes
  • Apply @tdd_expected_fail tags
  • Verify tests fail (bug present) but CI passes (inversion working)

Definition of Done

  • TDD tests exist and are tagged correctly
  • Tests demonstrate the bug (fail without @tdd_expected_fail)
  • Full CI suite passes with the inversion
  • PR merged to master
## Background and Context Bug #822 (`bug(plan): checkpoint rollback is simulated — does not execute real git reset`) identifies a defect where `CheckpointService.rollback()` returns a successful `RollbackResult` but does not actually execute `git reset --hard` (or equivalent) against the repository. The method constructs a result object with checkpoint metadata but skips the actual filesystem/git operation. Per the mandatory TDD bug-fix workflow (CONTRIBUTING.md §Bug Fix Workflow), a failing test must be written **before** the fix is implemented. This issue tracks the TDD test-writing phase for bug #822. ## Current Behavior No test exists that verifies the file system state actually changes after a checkpoint rollback operation. ## Expected Behavior A Behave BDD scenario and/or Robot Framework test tagged with `@tdd_expected_fail`, `@tdd_bug`, and `@tdd_bug_822` should exist that: 1. Creates a checkpoint of the current state 2. Modifies files after the checkpoint 3. Invokes `CheckpointService.rollback()` targeting the checkpoint 4. Asserts the file system state matches the checkpoint state (files reverted) 5. Currently **fails** (proving the bug exists), but passes CI via the `@tdd_expected_fail` inversion ## Acceptance Criteria - [x] At least one Behave scenario in `features/tdd_checkpoint_real_rollback.feature` - [x] Scenario tagged with `@tdd_expected_fail @tdd_bug @tdd_bug_822` - [x] At least one Robot Framework test in `robot/tdd_checkpoint_real_rollback.robot` - [x] Test tagged with `tdd_expected_fail`, `tdd_bug`, `tdd_bug_822` - [x] Full test suite passes (`nox -s unit_tests integration_tests`) - [x] `ruff check` and `ruff format` pass ## Metadata - **Commit message**: `test(plan): TDD failing tests for checkpoint real rollback (bug #822)` - **Branch name**: `tdd/m6-checkpoint-real-rollback` - **Type**: Testing - **Priority**: Critical - **MoSCoW**: Must have - **Points**: 2 - **Milestone**: v3.5.0 ## Subtasks - [x] Write Behave scenario exercising checkpoint → modify → rollback → verify file state - [x] Write Robot Framework test exercising same flow with observable file changes - [x] Apply `@tdd_expected_fail` tags - [x] Verify tests fail (bug present) but CI passes (inversion working) ## Definition of Done - TDD tests exist and are tagged correctly - Tests demonstrate the bug (fail without `@tdd_expected_fail`) - Full CI suite passes with the inversion - PR merged to `master`
freemo added this to the v3.5.0 milestone 2026-03-13 21:15:56 +00:00
Member

Implementation Notes

Design Decisions

Behave scenario structure: Two scenarios that exercise the two main aspects of the bug:

  1. Content rollback — Creates a checkpoint, modifies file content, rolls back, asserts content reverted. Fails because CheckpointService.rollback_to_checkpoint() at checkpoint_service.CheckpointService.rollback_to_checkpoint (commit f880c805) only constructs a RollbackResult with changed_paths=["restored:<sha>"] without running git reset --hard.
  2. Added-file rollback — Creates a checkpoint, adds a new file, rolls back, asserts the new file is removed. Same root cause.

Robot test structure: Mirrors the Behave scenarios using the established Run Process + sentinel pattern. Python helper (helper_tdd_checkpoint_real_rollback.py) imports domain services directly to set up the git repo and checkpoint infrastructure.

Tag compliance: All scenarios carry @tdd_expected_fail @tdd_bug @tdd_bug_822 @mock_only. Robot tests carry tdd_expected_fail, tdd_bug, tdd_bug_822. Inversion works correctly — tests fail (proving bug exists) but CI reports them as passed.

Verification Results

Nox Session Result
lint PASS
typecheck PASS (0 errors)
unit_tests PASS (380 features, 10808 scenarios)
coverage_report 98% (>= 97% threshold)

Files Created

File Lines Purpose
features/tdd_checkpoint_real_rollback.feature 26 2 Behave scenarios
features/steps/tdd_checkpoint_real_rollback_steps.py 183 Step definitions with full type annotations
robot/tdd_checkpoint_real_rollback.robot 34 2 Robot test cases
robot/helper_tdd_checkpoint_real_rollback.py 207 Python helper with 2 subcommands

Commit

f880c8050cd1e2909e9a061cb412e146b3dd37bf on branch tdd/m6-checkpoint-real-rollback

## Implementation Notes ### Design Decisions **Behave scenario structure:** Two scenarios that exercise the two main aspects of the bug: 1. **Content rollback** — Creates a checkpoint, modifies file content, rolls back, asserts content reverted. Fails because `CheckpointService.rollback_to_checkpoint()` at `checkpoint_service.CheckpointService.rollback_to_checkpoint` (commit `f880c805`) only constructs a `RollbackResult` with `changed_paths=["restored:<sha>"]` without running `git reset --hard`. 2. **Added-file rollback** — Creates a checkpoint, adds a new file, rolls back, asserts the new file is removed. Same root cause. **Robot test structure:** Mirrors the Behave scenarios using the established `Run Process` + sentinel pattern. Python helper (`helper_tdd_checkpoint_real_rollback.py`) imports domain services directly to set up the git repo and checkpoint infrastructure. **Tag compliance:** All scenarios carry `@tdd_expected_fail @tdd_bug @tdd_bug_822 @mock_only`. Robot tests carry `tdd_expected_fail`, `tdd_bug`, `tdd_bug_822`. Inversion works correctly — tests fail (proving bug exists) but CI reports them as passed. ### Verification Results | Nox Session | Result | |---|---| | `lint` | PASS | | `typecheck` | PASS (0 errors) | | `unit_tests` | PASS (380 features, 10808 scenarios) | | `coverage_report` | 98% (>= 97% threshold) | ### Files Created | File | Lines | Purpose | |---|---|---| | `features/tdd_checkpoint_real_rollback.feature` | 26 | 2 Behave scenarios | | `features/steps/tdd_checkpoint_real_rollback_steps.py` | 183 | Step definitions with full type annotations | | `robot/tdd_checkpoint_real_rollback.robot` | 34 | 2 Robot test cases | | `robot/helper_tdd_checkpoint_real_rollback.py` | 207 | Python helper with 2 subcommands | ### Commit `f880c8050cd1e2909e9a061cb412e146b3dd37bf` on branch `tdd/m6-checkpoint-real-rollback`
Member

Post-Crash Recovery Notes

Context

The system crashed while reviewing/updating PR #929. The branch had been pushed with a merge commit (f29e3a5e Merge branch 'master' into tdd/m6-checkpoint-real-rollback) which violates the project's no-merge-commit rule. Additionally, the PR had an empty description and was missing dependency links.

Recovery Actions

1. Branch rebase (merge commit removal)

  • Cherry-picked the original feature commit f880c805 onto a fresh checkout of origin/master
  • Eliminated the merge commit entirely — branch now has exactly one clean commit

2. Code improvements incorporated
Before the crash, work-in-progress improvements were staged but not committed:

  • Extracted shared features/mocks/git_helpers.py with run_git() and get_head_sha() to avoid duplication
  • Refactored features/steps/tdd_checkpoint_real_rollback_steps.py to use shared helpers
  • Refactored robot/helper_tdd_checkpoint_real_rollback.py to use shared helpers
  • Added encoding="utf-8" to all Path.write_text() / Path.read_text() calls
  • Moved cleanup registration earlier in step_create_workspace() to prevent tmpdir leaks on partial failure
  • Added @tdd tag to feature file
  • Added try/except cleanup in _create_workspace() in Robot helper
  • Improved error handling in Robot helper dispatcher (exit code 2 for usage/crash vs 1 for test failure)

These improvements were incorporated into the single commit via --amend.

3. Discarded changes
Unstaged changes to robot/cli_core.robot and robot/scientific_paper_e2e_test.robot were discarded — these were reverting upstream changes by other developers and are being handled by separate agents.

4. Lint fix
Ruff I001 import-sorting violation in robot/helper_tdd_checkpoint_real_rollback.py was fixed (ruff required features.mocks import to sort before cleveragents).

5. PR metadata updated

  • Added comprehensive PR description to PR #929
  • Added dependency link: issue #839 depends on PR #929
  • Type/Testing label and v3.5.0 milestone were already correctly set

Final Commit

c9dd3ba386fea13bab7d51157928927a865436d3 on branch tdd/m6-checkpoint-real-rollback

CI Results (all passing)

Check Result
lint PASS (14s)
typecheck PASS (47s)
unit_tests PASS (3m0s)
integration_tests PASS (4m39s)
coverage PASS (5m23s)
security PASS (31s)
quality PASS (1m25s)
benchmark-regression PASS (35m25s)
build PASS (16s)
docker PASS (36s)
e2e_tests PASS (24s)
## Post-Crash Recovery Notes ### Context The system crashed while reviewing/updating PR #929. The branch had been pushed with a merge commit (`f29e3a5e Merge branch 'master' into tdd/m6-checkpoint-real-rollback`) which violates the project's no-merge-commit rule. Additionally, the PR had an empty description and was missing dependency links. ### Recovery Actions **1. Branch rebase (merge commit removal)** - Cherry-picked the original feature commit `f880c805` onto a fresh checkout of `origin/master` - Eliminated the merge commit entirely — branch now has exactly one clean commit **2. Code improvements incorporated** Before the crash, work-in-progress improvements were staged but not committed: - Extracted shared `features/mocks/git_helpers.py` with `run_git()` and `get_head_sha()` to avoid duplication - Refactored `features/steps/tdd_checkpoint_real_rollback_steps.py` to use shared helpers - Refactored `robot/helper_tdd_checkpoint_real_rollback.py` to use shared helpers - Added `encoding="utf-8"` to all `Path.write_text()` / `Path.read_text()` calls - Moved cleanup registration earlier in `step_create_workspace()` to prevent tmpdir leaks on partial failure - Added `@tdd` tag to feature file - Added try/except cleanup in `_create_workspace()` in Robot helper - Improved error handling in Robot helper dispatcher (exit code 2 for usage/crash vs 1 for test failure) These improvements were incorporated into the single commit via `--amend`. **3. Discarded changes** Unstaged changes to `robot/cli_core.robot` and `robot/scientific_paper_e2e_test.robot` were discarded — these were reverting upstream changes by other developers and are being handled by separate agents. **4. Lint fix** Ruff I001 import-sorting violation in `robot/helper_tdd_checkpoint_real_rollback.py` was fixed (ruff required `features.mocks` import to sort before `cleveragents`). **5. PR metadata updated** - Added comprehensive PR description to PR #929 - Added dependency link: issue #839 depends on PR #929 - `Type/Testing` label and `v3.5.0` milestone were already correctly set ### Final Commit `c9dd3ba386fea13bab7d51157928927a865436d3` on branch `tdd/m6-checkpoint-real-rollback` ### CI Results (all passing) | Check | Result | |---|---| | lint | PASS (14s) | | typecheck | PASS (47s) | | unit_tests | PASS (3m0s) | | integration_tests | PASS (4m39s) | | coverage | PASS (5m23s) | | security | PASS (31s) | | quality | PASS (1m25s) | | benchmark-regression | PASS (35m25s) | | build | PASS (16s) | | docker | PASS (36s) | | e2e_tests | PASS (24s) |
Author
Owner

Dependency (TDD workflow): This TDD test issue blocks bug fix #822. This issue must be completed and merged first so the @tdd_expected_fail test exists before the fix is implemented.

Blocks: #822

**Dependency (TDD workflow):** This TDD test issue blocks bug fix #822. This issue must be completed and merged first so the `@tdd_expected_fail` test exists before the fix is implemented. **Blocks:** #822
Author
Owner

PM Acknowledgment — Day 34

Solid work, Brent. The 2-scenario structure (content rollback + added-file rollback) correctly captures both dimensions of the bug. The recovery was thorough — good call extracting the shared git_helpers.py and adding encoding declarations during the rebased commit. All CI sessions passing.

PR #929 is now in review. @hamza.khyari has been assigned as peer reviewer. This is on the critical path — once #929 merges, the bug fix for #822 can proceed.

**PM Acknowledgment — Day 34** Solid work, Brent. The 2-scenario structure (content rollback + added-file rollback) correctly captures both dimensions of the bug. The recovery was thorough — good call extracting the shared `git_helpers.py` and adding encoding declarations during the rebased commit. All CI sessions passing. PR #929 is now in review. @hamza.khyari has been assigned as peer reviewer. This is on the critical path — once #929 merges, the bug fix for #822 can proceed.
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core#839
No description provided.