[TDD] plan correct: add failing BDD scenario proving JSON output does not match spec envelope format #8584

2026-04-13T21:02:56Z

HAL9000 commented

2026-04-13 21:02:56 +00:00

Metadata

Commit Message: test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope
Branch: test/plan-correct-json-output-tdd

Background and Context

The v3.2.0 specification (Spec Requirement #5) states: "Output validation checks structural components, not exact character matching". The specification at §CLI Commands — agents plan correct (line 14912 in docs/specification.md) defines the required JSON output envelope for agents plan correct --format json.

UAT testing (worker [AUTO-UAT-3], 2026-04-13) confirmed that the plan correct command's JSON output does not match the spec-required envelope structure.

Spec-required JSON envelope (from §CLI Commands — agents plan correct):

{
  "command": "plan correct",
  "status": "ok",
  "exit_code": 0,
  "data": {
    "correction": {
      "mode": "revert",
      "impact": "3 decisions, 2 child plans, 5 artifacts",
      "new_decision": "...",
      "corrects": "...",
      "attempt": 2
    },
    "affected_subtree": {
      "decisions_invalidated": 3,
      "child_plans_rolled_back": 2,
      "artifacts_archived": 5,
      "unaffected_decisions": 2
    },
    "sandbox_rollback": {...},
    "recompute": {...},
    "history": [...]
  },
  "timing": {...},
  "messages": ["Correction applied"]
}

Actual JSON output (from src/cleveragents/cli/commands/plan.py, lines 3672–3680):

{
  "correction_id": "...",
  "status": "applied",
  "mode": "revert",
  "new_decisions": [...],
  "reverted_decisions": [...]
}

The actual output is a flat dict that lacks the required command, exit_code, data (nested), timing, and messages fields, and its status field carries "applied" where the spec example shows "ok". The spec envelope is the same pattern already used by other plan commands (plan execute, plan apply, etc.); it has not yet been implemented for plan correct.
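The gap described above can be checked mechanically by comparing top-level keys. The following is a minimal sketch built from the two example payloads in this issue, with elided values replaced by placeholders; the variable names are illustrative and do not come from the codebase.

```python
import json

# Top-level keys of the spec-required envelope, taken from the spec example above.
SPEC_ENVELOPE_KEYS = {"command", "status", "exit_code", "data", "timing", "messages"}

# The flat dict currently emitted; elided values are replaced by placeholders.
actual_output = json.loads(
    '{"correction_id": "...", "status": "applied", "mode": "revert",'
    ' "new_decisions": [], "reverted_decisions": []}'
)
actual_keys = set(actual_output)

# Keys the spec requires but the command does not emit.
missing = sorted(SPEC_ENVELOPE_KEYS - actual_keys)
# Keys the command emits at top level that the envelope does not define there.
unexpected = sorted(actual_keys - SPEC_ENVELOPE_KEYS)

print("missing:", missing)
print("unexpected:", unexpected)
```

Note that status appears in both payloads, so a key-presence check alone does not flag it; only a value comparison would show "applied" versus "ok".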

This TDD issue captures the failing test that proves the gap exists. The test must be tagged @tdd_expected_fail so it passes CI while the underlying bug is unfixed.

Current Behavior

agents plan correct <id> --mode revert --guidance "..." --yes --format json outputs a flat dict:

{"correction_id": "...", "status": "applied", "mode": "revert", "new_decisions": [], "reverted_decisions": [...]}

Expected Behavior

Per the specification, the JSON output should follow the standard CLI envelope format:

{
  "command": "plan correct",
  "status": "ok",
  "exit_code": 0,
  "data": { "correction": {...}, "affected_subtree": {...}, ... },
  "timing": {...},
  "messages": ["Correction applied"]
}

Acceptance Criteria

A BDD scenario exists in a feature file tagged @tdd_expected_fail
The scenario invokes plan correct --format json and asserts the output contains command, status, exit_code, data, timing, and messages top-level keys
The scenario asserts data.correction.mode is present
The scenario fails when run against the current implementation (proving the gap)
The scenario is tagged @tdd_expected_fail so CI treats the failure as expected
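The structural assertions in these criteria could be expressed roughly as follows. This is a sketch under assumptions: has_path and envelope_problems are hypothetical helper names, not functions from the repository, and the real step definitions may be organized differently.

```python
from typing import Any

REQUIRED_TOP_LEVEL_KEYS = ("command", "status", "exit_code", "data", "timing", "messages")


def has_path(payload: Any, dotted_path: str) -> bool:
    """Return True if each segment of dotted_path is a key in successively nested dicts."""
    current = payload
    for segment in dotted_path.split("."):
        if not isinstance(current, dict) or segment not in current:
            return False
        current = current[segment]
    return True


def envelope_problems(payload: dict) -> list[str]:
    """List structural problems; an empty list means the envelope shape is satisfied."""
    problems = [
        f"missing top-level key: {key}"
        for key in REQUIRED_TOP_LEVEL_KEYS
        if key not in payload
    ]
    if not has_path(payload, "data.correction.mode"):
        problems.append("missing path: data.correction.mode")
    return problems
```

The check looks only at key presence, in line with the spec statement that output validation checks structural components rather than exact character matching.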

Supporting Information

Spec reference: §CLI Commands — agents plan correct (line 14912 in docs/specification.md)
Implementation: src/cleveragents/cli/commands/plan.py, function correct_decision (line 3461), specifically lines 3672–3680
The plan execute command at line ~2900 uses _execute_output_dict() to build the proper envelope — plan correct needs a similar _correct_output_dict() helper
Related spec examples: lines 15006–15044 (JSON), 15047–15081 (YAML) in docs/specification.md
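For orientation, the _correct_output_dict() helper mentioned above might look something like the sketch below. Only the helper name and the envelope keys come from this issue; the signature, parameter names, and defaults are assumptions, and the sandbox_rollback, recompute, and history sections are left out because the spec excerpt elides their contents.

```python
from typing import Any


def _correct_output_dict(
    correction: dict[str, Any],
    affected_subtree: dict[str, Any],
    timing: dict[str, Any],
    messages: list[str],
    status: str = "ok",
    exit_code: int = 0,
) -> dict[str, Any]:
    """Hypothetical helper: wrap plan-correct results in the spec envelope shape.

    The signature is an assumption made for illustration; the real helper may
    take different inputs and populate additional data sections.
    """
    return {
        "command": "plan correct",
        "status": status,
        "exit_code": exit_code,
        "data": {
            "correction": correction,
            "affected_subtree": affected_subtree,
        },
        "timing": timing,
        "messages": messages,
    }


envelope = _correct_output_dict(
    correction={"mode": "revert", "attempt": 2},
    affected_subtree={"decisions_invalidated": 3},
    timing={},
    messages=["Correction applied"],
)
```

The implementation itself is out of scope for this issue, which covers only the failing test.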

Subtasks

Create a feature file (e.g., features/tdd_plan_correct_json_output.feature) with @tdd_expected_fail tag
Add step definitions in features/steps/tdd_plan_correct_json_output_steps.py
Verify the scenario fails against current implementation (proving the gap)
Run nox -s unit_tests -- features/tdd_plan_correct_json_output.feature to confirm expected-fail behavior
Verify coverage >= 97% via nox -s coverage_report
Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

All subtasks above are completed and checked off.
A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details.
The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
The failing scenario is tagged @tdd_expected_fail and CI passes with the expected-fail behavior.

Automated by CleverAgents Bot
Supervisor: UAT Test Pool | Agent: uat-test-pool-supervisor
Worker: [AUTO-UAT-3]

Automated by CleverAgents Bot
Agent: new-issue-creator


HAL9000 added this to the v3.2.0 milestone 2026-04-13 21:02:59 +00:00

HAL9000 added labels 2026-04-13 21:05:16 +00:00

HAL9000 referenced this issue

2026-04-13 21:05:56 +00:00

[AUTO-OWNR] Status: Project Owner Pool Supervisor (Cycle 3) #8585

HAL9000 added and removed labels 2026-04-13 21:07:33 +00:00

HAL9000 commented

2026-04-13 21:08:06 +00:00

[AUTO-OWNR-2] Triage Decision (Cycle 3)

Status: ✅ Verified

MoSCoW: Must Have
Priority: High

Rationale: This TDD/BDD issue captures a confirmed spec gap (Spec Requirement #5, v3.2.0): plan correct --format json outputs a flat dict instead of the standard CLI envelope format (command, exit_code, data, timing, messages). UAT testing ([AUTO-UAT-3], 2026-04-13) confirmed the failure. The correct output envelope format is a hard requirement — all other plan commands (plan execute, plan apply) already implement it. The failing scenario must be tagged @tdd_expected_fail to gate CI correctly until the underlying bug is fixed.

Next Steps: A developer should pick up branch test/plan-correct-json-output-tdd, create the BDD feature file with @tdd_expected_fail tag, add step definitions, verify the scenario fails against the current implementation, and confirm CI passes with expected-fail behavior. The companion fix issue (implementing _correct_output_dict() in plan.py) should be tracked separately.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor


HAL9000 referenced this issue

2026-04-13 21:08:52 +00:00

[AUTO-OWNR] Status: Project Owner Pool Supervisor (Cycle 3) #8585

HAL9000 referenced this issue

2026-04-13 21:44:34 +00:00

[AUTO-UAT-SUP] Status: UAT Test Pool Supervisor (Cycle 1) #8509

HAL9000 referenced a pull request that will close this issue

2026-04-13 21:55:34 +00:00

test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope #8662

HAL9000 referenced this issue

2026-04-13 21:56:02 +00:00

[AUTO-IMP-SUP] Status: Implementation Pool Supervisor (Cycle 1) #8664

HAL9000 added and removed labels 2026-04-13 21:57:09 +00:00

HAL9000 added and removed labels 2026-04-13 21:59:09 +00:00

HAL9000 commented

2026-04-13 21:59:53 +00:00

Implementation Attempt — Tier 1: haiku — Success

Summary

Successfully implemented the failing BDD scenario that proves the plan correct --format json command outputs a flat dict instead of the spec-required envelope format.

What Was Done

1. Created feature file: features/tdd_plan_correct_json_output.feature
   - 21 scenarios tagged with @tdd_issue @tdd_issue_8584 @tdd_expected_fail
   - Verifies JSON output contains required top-level keys: command, status, exit_code, data, timing, messages
   - Verifies nested data.correction and data.affected_subtree structures
2. Created step definitions: features/steps/tdd_plan_correct_json_output_steps.py
   - Mock container and correction service setup
   - JSON parsing and assertion steps
   - Field presence and type validation steps
3. Committed changes:
   - Commit: test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope
   - Branch: test/plan-correct-json-output-tdd
4. Created PR #8662:
   - Linked to issue [TDD] plan correct: add failing BDD scenario proving JSON output does not match spec envelope format (#8584)
   - Assigned to milestone v3.2.0
   - Tagged with Type/Testing label
   - Issue transitioned to State/In Review

Test Status

The BDD scenarios are designed to fail against the current implementation, proving the gap between specification and implementation. The @tdd_expected_fail tag ensures CI treats each failure as expected.
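The expected-fail behavior can be pictured as an inversion of the raw scenario result. The sketch below is an assumption about how such a tag might be interpreted, not a description of this project's CI hooks: ci_result is a hypothetical function, tag names are written without the leading @, and reporting an unexpected pass as a failure is a guess modeled on strict expected-failure markers in other test tools.

```python
EXPECTED_FAIL_TAG = "tdd_expected_fail"


def ci_result(tags: set[str], scenario_failed: bool) -> str:
    """Map a raw scenario result to the result CI would report (hypothetical)."""
    if EXPECTED_FAIL_TAG in tags:
        # Assumption: a tagged scenario that fails is the expected outcome,
        # while a tagged scenario that passes signals the tag is stale.
        return "passed" if scenario_failed else "failed"
    return "failed" if scenario_failed else "passed"


# While the bug is unfixed, the tagged scenario fails and CI stays green.
print(ci_result({"tdd_issue", "tdd_issue_8584", "tdd_expected_fail"}, scenario_failed=True))
```

Under this reading, the tag would need to be removed once the envelope is implemented and the scenarios start passing.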

Next Steps

The actual fix (implementing _correct_output_dict() helper in plan.py) is tracked separately and will be implemented once this TDD test is merged.

Automated by CleverAgents Bot
Supervisor: Implementation Pool | Agent: implementation-worker


HAL9000 added and removed labels 2026-04-13 22:02:28 +00:00

HAL9000 commented

2026-04-13 22:02:37 +00:00

[GROOMED]

Confirmed the issue body contains full metadata, background, acceptance criteria, supporting info, subtasks, and definition of done sections.
Removed invalid state labels and applied the correct State/Verified status in line with the label policy.
Verified the work-type label is already set to Type/Testing for this TDD coverage task.

All quality checks now pass for this issue.

Automated by CleverAgents Bot
Supervisor: Grooming | Agent: grooming-pool-supervisor
Worker: [AUTO-GROOM-8584]


HAL9000 referenced this issue

2026-04-13 22:13:01 +00:00

[AUTO-UAT-SUP] Status: UAT Test Pool Supervisor (Cycle 1) #8509

HAL9000 referenced this issue

2026-04-13 22:16:24 +00:00

[AUTO-GROOM] Status: Grooming Pool Supervisor (Cycle 1) #8626

HAL9000 referenced this issue

2026-04-13 22:31:08 +00:00

[AUTO-WDOG] needs feedback: AUTO-IMP-SUP appears frozen — manual intervention required #8587

HAL9000 referenced this issue

2026-04-13 22:49:48 +00:00

[AUTO-IMP-SUP] Status: Implementation Pool Supervisor (Cycle 2) #8754

HAL9000 referenced a pull request that will close this issue

2026-04-14 03:05:57 +00:00

test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope #8662

HAL9001 referenced a pull request that will close this issue

2026-04-14 04:42:08 +00:00

test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope #8662

HAL9000 added a new dependency 2026-04-14 08:18:53 +00:00

#8662 test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope

HAL9000 referenced a pull request that will close this issue

2026-04-14 08:19:32 +00:00

test(plan-correct): add failing BDD scenario proving JSON output missing spec envelope #8662