TDD: UAT: agents plan rollback does not discard decisions or invalidate child plans after checkpoint — only filesystem is reverted #3327

Open
opened 2026-04-05 09:52:26 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: tdd/m3.3-checkpoint-rollback-completeness
  • Commit Message: test(checkpoint): add tdd test capturing incomplete rollback of decisions and child plans
  • Milestone: v3.3.0
  • Parent Epic: #368

Background and Context

This is the TDD issue-capture counterpart to the bug issue for incomplete agents plan rollback behavior. Per the Bug Fix Workflow in CONTRIBUTING.md, every Type/Bug issue requires a corresponding Type/Testing issue whose sole deliverable is a failing test (tagged @tdd_expected_fail) that proves the bug exists before the fix is implemented.

The bug: CheckpointService.rollback_to_checkpoint() only reverts the sandbox filesystem via git reset --hard. It does not discard decisions, invalidate child plans, mark tool calls as undone, or transition the plan's processing state — all of which are required by docs/specification.md lines 15952–15953.

Deliverable

A single test (Behave scenario or Robot Framework test case) that:

  1. Creates a plan with a sandbox and a checkpoint
  2. Records decisions, spawns child plans, and records tool calls after the checkpoint
  3. Calls agents plan rollback to roll back to the checkpoint
  4. Asserts that the decisions, child plans, and tool call records created after the checkpoint are discarded, invalidated, and marked undone, respectively
  5. Fails on the current codebase (proving the bug exists), tagged with @tdd_expected_fail so CI passes
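
The steps above can be modelled in a few lines without touching the real services. This is only a sketch under stated assumptions: `FakePlan`, `create_checkpoint`, `rollback_filesystem_only`, and `rollback_violations` are hypothetical names invented for illustration, and the in-memory dictionaries stand in for whatever the project actually persists. A real Behave step would drive `agents plan rollback` and inspect the plan through the project's own APIs.

```python
from dataclasses import dataclass, field


@dataclass
class FakePlan:
    """Hypothetical in-memory stand-in for a plan; not the project's real model."""
    files: dict = field(default_factory=dict)
    decisions: list = field(default_factory=list)
    child_plans: dict = field(default_factory=dict)  # child id -> "active" / "invalidated"
    tool_calls: dict = field(default_factory=dict)   # call id -> "done" / "undone"
    checkpoint: dict = field(default_factory=dict)


def create_checkpoint(plan: FakePlan) -> None:
    """Snapshot the files and remember which records already existed."""
    plan.checkpoint = {
        "files": dict(plan.files),
        "decision_count": len(plan.decisions),
        "children": set(plan.child_plans),
        "tool_calls": set(plan.tool_calls),
    }


def rollback_filesystem_only(plan: FakePlan) -> None:
    """Mimic the reported bug: restore file contents and nothing else."""
    plan.files = dict(plan.checkpoint["files"])


def rollback_violations(plan: FakePlan) -> list[str]:
    """List every kind of post-checkpoint record that survived the rollback."""
    cp = plan.checkpoint
    problems = []
    if len(plan.decisions) > cp["decision_count"]:
        problems.append("decisions not discarded")
    if any(state != "invalidated"
           for cid, state in plan.child_plans.items() if cid not in cp["children"]):
        problems.append("child plans not invalidated")
    if any(state != "undone"
           for tid, state in plan.tool_calls.items() if tid not in cp["tool_calls"]):
        problems.append("tool calls not marked undone")
    return problems


# Steps 1-3: checkpoint, post-checkpoint activity, rollback.
plan = FakePlan(files={"a.txt": "v1"})
create_checkpoint(plan)
plan.files["a.txt"] = "v2"
plan.decisions.append("use strategy B")
plan.child_plans["child-1"] = "active"
plan.tool_calls["call-1"] = "done"
rollback_filesystem_only(plan)

# Step 4: the filesystem is reverted, but the other records are untouched.
violations = rollback_violations(plan)
```

With the buggy rollback modelled here, `violations` is non-empty, which is the condition the real test's assertion would fail on until the fix lands.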

Subtasks

  • Write Behave scenario (or Robot test) that reproduces the incomplete rollback behavior
  • Tag the test with @tdd_issue, @tdd_issue_<BUG_ISSUE_NUMBER>, and @tdd_expected_fail
  • Verify the test fails on master (without the fix) and CI passes due to @tdd_expected_fail inversion
  • Open PR from tdd/m3.3-checkpoint-rollback-completeness to master
  • PR reviewed and merged

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass (the @tdd_expected_fail tag ensures CI passes despite the failing assertion).
  • Coverage >= 97%.
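
The commit message rule above (exact first line, blank line, then details) is mechanical enough to check locally before pushing. The helper below is a minimal sketch; `commit_message_ok` is a hypothetical name and is not part of the project's tooling.

```python
EXPECTED_SUBJECT = (
    "test(checkpoint): add tdd test capturing incomplete rollback "
    "of decisions and child plans"
)


def commit_message_ok(message: str, expected_subject: str = EXPECTED_SUBJECT) -> bool:
    """Check: exact subject line, one blank separator line, then a non-empty body."""
    lines = message.splitlines()
    if len(lines) < 3:
        return False
    subject, separator, body = lines[0], lines[1], lines[2:]
    return (
        subject == expected_subject
        and separator == ""
        and any(line.strip() for line in body)
    )


example = EXPECTED_SUBJECT + "\n\nAdd a scenario that reproduces the incomplete rollback."
```

Here `commit_message_ok(example)` is true, while a message consisting of the subject line alone is rejected because it lacks the blank line and body.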

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

freemo added this to the v3.3.0 milestone 2026-04-05 10:02:11 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: High — This is the TDD counterpart to #3326. Per CONTRIBUTING.md Bug Fix Workflow, the failing test must be written before the fix. Downgrading from Critical to High since this is a test, not the fix itself.
  • Milestone: v3.3.0 — Same milestone as the bug (#3326).
  • MoSCoW: Must Have — TDD tests are mandatory per the Bug Fix Workflow. This test must be merged before #3326 can be worked on.
  • Parent Epic: #368 (Subplans & Parallelism)

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

freemo removed this from the v3.3.0 milestone 2026-04-06 23:59:42 +00:00
Reference
cleveragents/cleveragents-core#3327