fix(cli): wrap plan apply --format json output in spec-required JSON envelope #11018
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
overdue
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!11018
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "pr-9817-plan-apply-json"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Fix the
agents plan apply --format jsoncommand to return a spec-compliant JSON envelope instead of a raw plan dictionary. The command now properly wraps the plan data in the required envelope structure withcommand,status,exit_code, anddatafields containing artifacts, changes, validation, sandbox cleanup, and lifecycle information.Changes
lifecycle_apply_planfunction insrc/cleveragents/cli/commands/plan.pyto wrap non-rich format output in spec-required JSON envelope_apply_output_dicthelper function exclusively forplan applycall sitescommand: "plan apply",status: "ok",exit_code: 0, andtimingfieldsdatafield contains:artifacts: count of files changedchanges: insertions and deletions from git diffproject: project identifierapplied_at: timestamp of applicationvalidation: test, lint, and type_check results with durationsandbox_cleanup: worktree, branch, and checkpoint statuslifecycle: phase, state, total_duration, total_cost, decisions_made, child_plansmessages: ["Changes applied"]to response envelopePlan as LifecyclePlanimport to module top-levelTesting
features/plan_apply_json_envelope.featurecovering:_plan_spec_dict, not apply envelope)robot/plan_apply_json_envelope.robotwith helper scriptIssue Reference
Closes #9449
Epic: M3 Decisions + Validations + Invariants (v3.2.0)
8d586dd714da048a826dda048a826d312932fdfcCode Review — PR #11018:
fix(cli): wrap plan apply --format json output in spec-required JSON envelopeVerdict: REQUEST_CHANGES — There are multiple blocking issues across correctness, spec alignment, CI, test quality, and PR process compliance.
CI Status
CI is failing on 5 of the 5 required-for-merge checks:
lint— failing after 1m21sunit_tests— failing after 4m56sintegration_tests— failing after 5m1se2e_tests— failing after 4m20sstatus-check— failing (aggregates all above)All CI gates must be green before a PR can be merged. Please fix all CI failures before re-requesting review.
Blocking Issues Found
1. Spec Deviation —
timingField StructureThe spec (§agents plan apply, line ~13465) requires:
The PR produces:
This is a structural deviation from the spec. The
applied_attimestamp belongs indata.applied_at(which is correctly present), not intiming. Thetimingblock must containstartedandduration_ms. Fix_apply_output_dictto accept astarted_attimestamp andduration_msfrom the call site, and pass them intotiming.2. Spec Deviation —
data.validationField StructureThe spec defines:
The PR produces:
Multiple deviations: key is
test(singular) instead oftests, values are flat strings instead of structured objects, andduration_sis a numeric field (not string"0s"). Both the implementation and the Behave scenarios (which test against the"test"key) need to be corrected to match the spec.3. Spec Deviation — Missing
data.plan_idFieldThe spec's plan apply JSON output includes
"plan_id"in thedatablock:The PR omits
plan_idfrom thedatadict. This field is required by spec and must be added (plan.identity.plan_id).4. Robot Integration Tests Use Undefined Keyword
The
.robotfile callsRun Plan Apply Json Envelope Test <scenario>but this keyword is never defined — it is not incommon.resource, not in the robot file's own*** Keywords ***section (which doesn't exist), and not in any imported resource. This is the root cause of theintegration_testsCI failure.All other robot files in this project use the pattern:
The
.robotfile must be rewritten to useRun Process ${PYTHON} ${HELPER} <scenario> cwd=${WORKSPACE}consistently with project conventions.5. Robot Helper Uses Plain Dicts Instead of LifecyclePlan Domain Objects
The
helper_plan_apply_json_envelope.pyconstructs a plaindictand sets it asservice.apply_plan.return_value. Whenlifecycle_apply_plancalls_apply_output_dict(plan), the plan is a dict, not aLifecyclePlaninstance. Theisinstance(plan, LifecyclePlan)guard in_apply_output_dictwill beFalse, so all robot tests exercise the legacy fallback path only — they never test the real implementation.The robot helper must construct a real
LifecyclePlandomain object (as the Behave steps correctly do). Import and use the actual domain model classes, or the tests are effectively testing dead code.6. Commit Message Type Is Invalid (
bugis not a Conventional Changelog type)bug(cli): ...is not a valid Conventional Changelog type. Valid types are:feat,fix,docs,style,refactor,test,chore,perf,ci,build,revert. A bug fix must usefix. This appears to be an error in the issue's Metadata section that was carried through verbatim. The issue's Metadata should be corrected to prescribefix(cli): ..., and the commit message must be amended to usefix.7. PR Missing Milestone
Linked issue #9449 is assigned to milestone
v3.2.0. Per CONTRIBUTING.md §12, PRs must be assigned the same milestone as their linked issue(s). The PR has no milestone assigned — please set it tov3.2.0.8. PR Missing
Type/LabelThe PR has no
Type/label. Exactly oneType/label is required. This is a bug fix, soType/Bugmust be applied.9. Forgejo Dependency Direction Not Set
Per CONTRIBUTING.md §2 (CRITICAL), the PR must block the linked issue — not the other way around. Set the dependency: PR #11018 → blocks → Issue #9449. Verify that issue #9449 shows this PR under "depends on". Currently no dependency relationship is set at all.
10. Branch Name Does Not Match Issue Metadata
Issue #9449's
## Metadatasection specifiesBranch: fix/plan-apply-json-envelope. The actual PR branch ispr-9817-plan-apply-json. Per CONTRIBUTING.md, the branch used must match theBranchfield in the issue Metadata exactly. This is a policy violation.Minor (Non-Blocking) Observations
_execute_output_dict: The diff changesReturns:toReturns:(adds 2-space indent), which is invalid Google-style docstring format and is likely the root cause of thelintCI failure. Revert this unintended change.changeshardcoded to zeros:data.changesalways reports{"insertions": 0, "deletions": 0}. The spec shows real diff data here. This is an acceptable incremental limitation but should be tracked in a follow-up issue.sandbox_refsas proxy forartifacts: Usinglen(plan.sandbox_refs)as a proxy for files changed is a reasonable fallback, but the relationship between sandbox refs and files changed may not be 1:1. Consider tracking in a follow-up.Summary
The core implementation concept is correct — wrapping
lifecycle_apply_plannon-rich output in a dedicated_apply_output_dicthelper is the right approach. However, the PR has multiple blocking issues that must be resolved:timingstructure,validationstructure, missingdata.plan_id)bug→fix)v3.2.0and labelType/Bugto the PR_execute_output_dictOnce CI is green and all blocking issues are resolved, re-request review.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
@ -0,0 +3,4 @@from __future__ import annotationsimport jsonimport sysBLOCKING — Robot helper uses plain dicts; all tests exercise the legacy fallback path, not the real implementation
_make_plan()builds a plain Pythondictand sets it assvc.apply_plan.return_value. Whenlifecycle_apply_plancalls_apply_output_dict(plan), theisinstance(plan, LifecyclePlan)check returnsFalse(dict is not a LifecyclePlan). The function falls into the legacy fallback:This means every robot test is asserting against the legacy minimal envelope, not the spec-required structured envelope. The tests appear to pass (or would pass) even if the real implementation is completely broken.
Fix: construct a real
LifecyclePlanobject in_make_plan(), exactly as_make_plan()does in the Behave step definitions (plan_apply_json_envelope_steps.py):Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
@ -0,0 +10,4 @@Envelope Has All Required Top-Level Fields[Documentation] Verify that the plan apply JSON envelope contains all required top-level keys.${result}= Run Plan Apply Json Envelope Test verify_top_level_fieldsShould Be Equal As Integers ${result.rc} 0BLOCKING — Undefined Robot keyword causes integration_tests CI failure
Run Plan Apply Json Envelope Testis not defined anywhere in the project. The robot file has no*** Keywords ***section, and the keyword is absent fromcommon.resourceand all imported resources. Every test case in this file will fail withNo keyword with name 'Run Plan Apply Json Envelope Test' found.All other robot files in this project follow this established convention:
Rewrite all test cases to use
Run Process ${PYTHON} ${HELPER} <scenario_name> cwd=${WORKSPACE}, matching howhelper_cli_formats.py,helper_apply_pipeline.py, and other helpers are invoked.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING — Docstring indent regression causes lint failure
This diff changes
Returns:(4-space indent, correct Google-style docstring) toReturns:(6-space indent, incorrect). This appears to be an unintended edit and is likely the root cause of theCI / lintfailure. Please revert this specific change in_execute_output_dict.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING — Spec Deviation:
validationkey names and value structure are wrongThe spec (§agents plan apply) defines the validation sub-object as:
This implementation has three deviations:
"test"(singular) — must be"tests"(plural)"passed") — must be structured objects per spec"duration"as a string"0s"— must be"duration_s"as a numeric floatThe Behave feature file also checks against
"test"— both the implementation and the test assertions must be updated to match the spec.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
BLOCKING — Spec Deviation:
timingfield structure is wrongThe spec (§agents plan apply, line ~13465) requires:
This implementation produces:
applied_atis already present indata.applied_at. Thetimingblock must containstarted(ISO timestamp of when the command began) andduration_ms(elapsed milliseconds). Update_apply_output_dictto accept these as parameters fromlifecycle_apply_plan, which should measure wall-clock time around the service call.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Review complete. REQUEST_CHANGES submitted (Review ID: 8301).
10 blocking issues identified — see the review for full details. Summary of required changes:
timingmust have{started, duration_ms},validationmust usetestskey with structured objects,data.plan_idmust be included.robotfile to useRun Process ${PYTHON} ${HELPER}pattern; fix helper to use realLifecyclePlanobjectsbug(cli)→fix(cli)v3.2.0and labelType/Bug_execute_output_dictAutomated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
[CONTROLLER-DEFER:Gate 1:full_duplicate]
This PR has been deferred for re-evaluation. The controller has stepped back
from processing it. To resume, a human or scope-evaluator must clear the
deferral flag AND re-add the auto/sentinel label.
Decision:
To clear the deferral (SQL):
UPDATE workflows SET deferred_reason=NULL,
deferred_at=NULL,
deferred_target_workflow_id=NULL
WHERE workflow_id = 420;
Audit ID: 155571
Automated by the CleverAgents controller pipeline.
Identity: HAL9000 (pipeline action)
📋 Estimate: tier 1.
8-file PR (+1134/-10) adding a JSON envelope wrapper to plan apply CLI output plus 31 new tests (16 Behave + 15 Robot). Core logic change is bounded to one function in plan.py, but CI has 5 failing gates requiring diagnosis: lint (ruff format on one steps file — trivial), unit_tests (12 failed/8 errored spanning both new and apparently pre-existing feature files), e2e/integration (truncated logs), security (bandit nosec warnings). The unrelated feature file failures (actor_run_signature, plan_service_coverage, tdd_memory_service_entity_persistence) require cross-file investigation to determine if caused by this PR or pre-existing. Multi-file scope with new test infrastructure and multiple CI failures to triage and fix is standard Tier 1 work.
📋 Estimate: tier 1.
Focused CLI change (one function + helper in plan.py) but implementation is demonstrably incomplete: all new plan_apply_json_envelope Behave scenarios fail, existing plan_cli_coverage_boost tests regress, all new Robot integration tests fail, and an E2E test (M6 Full Autonomy) fails suggesting a downstream contract regression. Fix requires debugging the envelope construction logic, reconciling existing test expectations, and investigating the E2E regression — multi-file test context required. Lint failure (ruff format on one step file) is trivial. Actor/plan_service/memory-service failures appear pre-existing. Standard tier-1 debugging and test-fix work.
9260b962a5ff2b627772📋 Estimate: tier 1.
Multi-file change (8 files, +1019/-512) with three active CI failures: 4 Pyright errors accessing a nonexistent private method
_commit_planonPlanLifecycleServiceat 4 call sites in plan.py, 29 failing Behave unit test scenarios across multiple feature files, and 1 failing Robot integration test. The implementer must locate the correct PlanLifecycleService API surface, fix all 4 type errors, repair the failing test scenarios, and pass the integration test — standard cross-file, test-heavy tier-1 work with clear but non-trivial debugging scope.5e25a20de923057589652305758965208988cdde✅ Approved
Reviewed at commit
208988c.Confidence: high.
Claimed by
merge_drive.py(pid 2329255) until2026-06-15T02:37:18.190690+00:00.This claim is advisory and will be released when the cycle ends, or after the TTL by a sibling driver's expired-claim sweep.
208988cdde372ef09153Approved by the controller reviewer stage (workflow 420).