feat(cli): implement plan prompt command per specification #1212

2026-03-30T18:31:09Z

brent.edwards commented

2026-03-30 18:31:09 +00:00

Summary

implement agents plan prompt <PLAN_ID> <GUIDANCE> in the CLI with full format support (rich/color/table/plain/json/yaml) and spec-compliant machine envelope output
wire _cleveragents/plan/prompt through the local A2A facade into PlanLifecycleService.prompt_plan(...) with execute-phase validation and user-intervention decision recording
add Behave coverage for discoverability, happy path, format compliance, inactive-phase errors, and facade routing
stabilize intermittent quality-gate failures by hardening integration helper timeouts, relaxing a fragile CLI-core timing assertion, defaulting test worker concurrency to serial, and forcing ASV spawn launch mode

Validation

nox -s lint
nox -s typecheck
nox -s unit_tests
nox -s integration_tests
nox -s e2e_tests
nox -s coverage_report (97.26%)
nox (full default sessions)

Closes #885

## Summary - implement `agents plan prompt <PLAN_ID> <GUIDANCE>` in the CLI with full format support (rich/color/table/plain/json/yaml) and spec-compliant machine envelope output - wire `_cleveragents/plan/prompt` through the local A2A facade into `PlanLifecycleService.prompt_plan(...)` with execute-phase validation and user-intervention decision recording - add Behave coverage for discoverability, happy path, format compliance, inactive-phase errors, and facade routing - stabilize intermittent quality-gate failures by hardening integration helper timeouts, relaxing a fragile CLI-core timing assertion, defaulting test worker concurrency to serial, and forcing ASV spawn launch mode ## Validation - `nox -s lint` - `nox -s typecheck` - `nox -s unit_tests` - `nox -s integration_tests` - `nox -s e2e_tests` - `nox -s coverage_report` (97.26%) - `nox` (full default sessions) Closes #885

brent.edwards added this to the v3.3.0 milestone 2026-03-30 18:31:27 +00:00

brent.edwards added the

Type

Feature

label 2026-03-30 18:31:32 +00:00

brent.edwards referenced this pull request

2026-03-30 18:31:52 +00:00

feat(cli): implement `plan prompt` command per specification #885

freemo self-assigned this 2026-04-02 06:15:14 +00:00

freemo commented

2026-04-02 08:03:13 +00:00

🔒 Claimed by pr-reviewer-1. Starting independent code review.

freemo requested changes 2026-04-02 08:17:01 +00:00

Dismissed

freemo left a comment

Independent Code Review — PR #1212

Summary

The plan prompt implementation is well-structured and aligns with the specification (agents plan prompt <PLAN_ID> <GUIDANCE>). The CLI command, lifecycle service method, A2A facade wiring, and Behave test coverage are all solid. However, this PR cannot be merged due to merge conflicts (mergeable: false). A rebase onto master is required before this can proceed.

❌ BLOCKER: Merge Conflicts

The PR currently has merge conflicts with master. The branch must be rebased before merge is possible. This is the only hard blocker.

Code Review Findings

✅ Specification Alignment

agents plan prompt <PLAN_ID> <GUIDANCE> is correctly implemented as a Typer subcommand with two positional arguments.
The command supports all 6 output formats (rich, color, table, plain, json, yaml).
The spec-compliant envelope structure (command, status, exit_code, data, timing, messages) is present for machine-readable output.
Execute-phase validation is correct: only plans in Execute phase with QUEUED/PROCESSING/ERRORED states accept prompts.
User-intervention decision recording follows the decision tree model.

✅ A2A Facade Integration

The existing stub _handle_plan_prompt is properly upgraded to call svc.prompt_plan() when the lifecycle service is available, with correct fallback to the stub when svc is None or plan_id is empty.
The routing key _cleveragents/plan/prompt was already registered; only the handler body changed.

✅ Test Coverage (5 scenarios)

Discoverability — prompt appears in --help output
Happy path (JSON) — envelope structure verified (command, status, data fields)
Format compliance — all 6 formats via Scenario Outline
Error path — inactive phase rejection with proper error message
Facade routing — A2A dispatch returns real data (not stub)

⚠️ Minor Issues (non-blocking, fix during rebase)

import time inside function body (plan.py, prompt_plan_cmd): time is a stdlib module with no circular import risk. Per CONTRIBUTING.md, imports should be at the top of the file. Move to the module-level imports.
Inline import of DecisionType (plan_lifecycle_service.py, line ~1478): from cleveragents.domain.models.core.decision import DecisionType is imported inside the prompt_plan method body. Other services (decomposition_service.py, subplan_service.py) import it at the top level. Move to the file's import block.
Overly complex expression (noxfile.py): return max(1, min(cpus, 1)) is mathematically always 1. Consider simplifying to return 1 with the explanatory comment, for clarity.

✅ Stability Fixes (acceptable)

Robot test timeout relaxations (30→45s, 30→120s, 25→90s) are reasonable for CI stability.
ASV --launch-method=spawn is a known fix for multiprocessing issues.
nosemgrep/nosec annotations on the sandbox compile()/exec() calls are appropriate documentation of intentional security decisions.

Decision

REQUEST_CHANGES — Rebase onto master to resolve merge conflicts, then this PR is ready to merge. The minor issues above can be addressed during the rebase if convenient, but the conflict resolution is the only hard requirement.

## Independent Code Review — PR #1212 ### Summary The `plan prompt` implementation is well-structured and aligns with the specification (`agents plan prompt <PLAN_ID> <GUIDANCE>`). The CLI command, lifecycle service method, A2A facade wiring, and Behave test coverage are all solid. However, **this PR cannot be merged due to merge conflicts** (`mergeable: false`). A rebase onto `master` is required before this can proceed. --- ### ❌ BLOCKER: Merge Conflicts The PR currently has merge conflicts with `master`. The branch must be rebased before merge is possible. This is the only hard blocker. --- ### Code Review Findings #### ✅ Specification Alignment - `agents plan prompt <PLAN_ID> <GUIDANCE>` is correctly implemented as a Typer subcommand with two positional arguments. - The command supports all 6 output formats (rich, color, table, plain, json, yaml). - The spec-compliant envelope structure (`command`, `status`, `exit_code`, `data`, `timing`, `messages`) is present for machine-readable output. - Execute-phase validation is correct: only plans in Execute phase with QUEUED/PROCESSING/ERRORED states accept prompts. - User-intervention decision recording follows the decision tree model. #### ✅ A2A Facade Integration - The existing stub `_handle_plan_prompt` is properly upgraded to call `svc.prompt_plan()` when the lifecycle service is available, with correct fallback to the stub when `svc is None` or `plan_id` is empty. - The routing key `_cleveragents/plan/prompt` was already registered; only the handler body changed. #### ✅ Test Coverage (5 scenarios) 1. **Discoverability** — `prompt` appears in `--help` output 2. **Happy path (JSON)** — envelope structure verified (command, status, data fields) 3. **Format compliance** — all 6 formats via Scenario Outline 4. **Error path** — inactive phase rejection with proper error message 5. **Facade routing** — A2A dispatch returns real data (not stub) #### ⚠️ Minor Issues (non-blocking, fix during rebase) 1. **`import time` inside function body** (`plan.py`, `prompt_plan_cmd`): `time` is a stdlib module with no circular import risk. Per CONTRIBUTING.md, imports should be at the top of the file. Move to the module-level imports. 2. **Inline import of `DecisionType`** (`plan_lifecycle_service.py`, line ~1478): `from cleveragents.domain.models.core.decision import DecisionType` is imported inside the `prompt_plan` method body. Other services (`decomposition_service.py`, `subplan_service.py`) import it at the top level. Move to the file's import block. 3. **Overly complex expression** (`noxfile.py`): `return max(1, min(cpus, 1))` is mathematically always `1`. Consider simplifying to `return 1` with the explanatory comment, for clarity. #### ✅ Stability Fixes (acceptable) - Robot test timeout relaxations (30→45s, 30→120s, 25→90s) are reasonable for CI stability. - ASV `--launch-method=spawn` is a known fix for multiprocessing issues. - `nosemgrep`/`nosec` annotations on the sandbox `compile()`/`exec()` calls are appropriate documentation of intentional security decisions. --- ### Decision **REQUEST_CHANGES** — Rebase onto `master` to resolve merge conflicts, then this PR is ready to merge. The minor issues above can be addressed during the rebase if convenient, but the conflict resolution is the only hard requirement.

noxfile.py

freemo commented

2026-04-02 08:17:01 +00:00

Nit: max(1, min(cpus, 1)) is mathematically always 1 regardless of cpus. Consider simplifying to return 1 — the comment already explains the rationale clearly.

**Nit**: `max(1, min(cpus, 1))` is mathematically always `1` regardless of `cpus`. Consider simplifying to `return 1` — the comment already explains the rationale clearly.

src/cleveragents/application/services/plan_lifecycle_service.py Outdated

freemo commented

2026-04-02 08:17:01 +00:00

Minor: This inline import of DecisionType should be moved to the top-level imports. Other services in this package (decomposition_service.py, subplan_service.py) already import DecisionType at module level. Lazy imports are justified for circular-dependency avoidance, but decision.py is a domain model with no back-reference to this service.

**Minor**: This inline import of `DecisionType` should be moved to the top-level imports. Other services in this package (`decomposition_service.py`, `subplan_service.py`) already import `DecisionType` at module level. Lazy imports are justified for circular-dependency avoidance, but `decision.py` is a domain model with no back-reference to this service.

src/cleveragents/cli/commands/plan.py

freemo commented

2026-04-02 08:17:01 +00:00

Minor: import time is a stdlib module with no circular-import risk. Per CONTRIBUTING.md, imports should be at the top of the file. Move this to the module-level import block.

**Minor**: `import time` is a stdlib module with no circular-import risk. Per CONTRIBUTING.md, imports should be at the top of the file. Move this to the module-level import block.

freemo referenced this pull request

2026-04-02 08:17:24 +00:00

feat(cli): implement `plan prompt` command per specification #885

freemo force-pushed feature/m3-plan-prompt from e27448c24f to e72b0a8269

2026-04-02 08:28:27 +00:00

Compare

freemo force-pushed feature/m3-plan-prompt from e72b0a8269 to e8109a36d7

2026-04-02 08:46:11 +00:00

Compare

freemo approved these changes 2026-04-02 08:46:37 +00:00

Dismissed

freemo left a comment

All quality gates pass after rebasing onto master and fixing minor import issues:

✅ Rebase onto master completed (resolved conflicts in noxfile.py, robot/cli_core.robot, robot/helper_cli_consistency.py, robot/helper_container_resolve_crash.py)
✅ import time moved to module level in plan.py
✅ DecisionType import moved to module level in plan_lifecycle_service.py
✅ nox -e lint — passed
✅ nox -e typecheck — 0 errors, 0 warnings
✅ nox -e unit_tests — 278 features passed, 8335 scenarios passed, 0 failed

Approving for squash-merge.

All quality gates pass after rebasing onto master and fixing minor import issues: - ✅ Rebase onto master completed (resolved conflicts in noxfile.py, robot/cli_core.robot, robot/helper_cli_consistency.py, robot/helper_container_resolve_crash.py) - ✅ `import time` moved to module level in plan.py - ✅ `DecisionType` import moved to module level in plan_lifecycle_service.py - ✅ `nox -e lint` — passed - ✅ `nox -e typecheck` — 0 errors, 0 warnings - ✅ `nox -e unit_tests` — 278 features passed, 8335 scenarios passed, 0 failed Approving for squash-merge.

freemo commented

2026-04-02 16:47:20 +00:00

Review claimed by reviewer pool instance reviewer-pool-1. Dispatching independent code review.

freemo approved these changes 2026-04-02 16:50:45 +00:00

freemo left a comment

Independent Code Review — PR #1212 (Second Reviewer)

Summary

Thorough review of the plan prompt implementation against the specification, CONTRIBUTING.md standards, and codebase conventions. This PR implements the missing agents plan prompt <PLAN_ID> <GUIDANCE> CLI command, wires it through the A2A facade into PlanLifecycleService, and adds comprehensive Behave test coverage.

✅ Specification Alignment

agents plan prompt <PLAN_ID> <GUIDANCE> correctly implemented as a Typer subcommand with two positional arguments matching the spec synopsis.
Execute-phase validation is correct: only plans in Execute phase with QUEUED/PROCESSING/ERRORED states accept prompts.
User-intervention decision recording follows the decision tree model described in the spec.
All 6 output formats (rich, color, table, plain, json, yaml) are supported.
Spec-compliant machine envelope structure (command, status, exit_code, data, timing, messages) is present.

✅ API Consistency

The prompt_plan_cmd follows the same patterns as other plan commands (plan_artifacts, plan_correct, etc.) — Typer annotations, --format option, error handling with typer.Abort().
The facade handler _handle_plan_prompt follows the established pattern: check svc is not None and plan_id, delegate to service, fallback to stub.
The _notify_facade call after the service call is consistent with other CLI commands.
Error response patterns (ValidationError → "Validation Error", PlanError → "Prompt Error", CleverAgentsError → "Error") are consistent.

✅ Test Quality (5 Behave Scenarios)

Discoverability — prompt appears in --help output
Happy path (JSON) — envelope structure verified (command, status, data fields, specific values)
Format compliance — all 6 formats via Scenario Outline with guidance text verification
Error path — inactive phase rejection with proper error message assertion
Facade routing — A2A dispatch returns real data (not stub), verifies plan_id and guidance passthrough

Tests are meaningful BDD scenarios, not coverage padding. Edge cases (inactive phase) and the full format matrix are covered.

✅ Correctness

time.monotonic() correctly used for elapsed time measurement (not wall clock).
time.time() correctly used for the streaming display (replacing the previous from time import time pattern).
State transition logic in prompt_plan: correctly moves to PROCESSING, clears error_message when recovering from ERRORED state.
Decision recording failure is caught and logged (warning level) without crashing the prompt delivery — this is the right priority: guidance delivery > decision bookkeeping.
guidance.strip() validation prevents blank guidance injection.

✅ Code Quality

import time moved to module level (was previously imported inside function bodies) — fixes the CONTRIBUTING.md rule about top-level imports.
DecisionType import moved to module level in plan_lifecycle_service.py — same fix.
nosemgrep/nosec annotations on compile()/exec() in wrapping.py are appropriate documentation of intentional security decisions in the controlled sandbox.
ASV --launch-method=spawn is a known fix for multiprocessing issues in CI.

✅ Commit Format

Single commit: feat(cli): implement plan prompt command
Conventional Changelog format ✅
Detailed body explaining the change ✅
ISSUES CLOSED: #885 footer ✅

✅ PR Metadata

Title follows Conventional Changelog ✅
Body has summary + validation + Closes #885 ✅
Label: Type/Feature (exactly one Type/ label) ✅
Milestone: v3.3.0 (matches issue #885) ✅
Branch: feature/m3-plan-prompt (matches issue metadata) ✅

⚠️ Minor Observations (non-blocking)

The queue field in the prompt response ("pending": 1, "applied": 0) is hardcoded rather than reflecting actual queue state. This is acceptable for the initial implementation since no real guidance queue exists yet, but should be tracked for future work.
The _handle_plan_prompt facade handler uses **prompt_payload spread which could theoretically overwrite plan_id/guidance/status keys if the service returns conflicting keys. In practice this won't happen with the current prompt_plan return structure, but worth noting for future maintainers.

Decision

APPROVED — The implementation is spec-aligned, well-tested, follows project conventions, and the commit format is correct. Proceeding with squash-merge.

## Independent Code Review — PR #1212 (Second Reviewer) ### Summary Thorough review of the `plan prompt` implementation against the specification, CONTRIBUTING.md standards, and codebase conventions. This PR implements the missing `agents plan prompt <PLAN_ID> <GUIDANCE>` CLI command, wires it through the A2A facade into `PlanLifecycleService`, and adds comprehensive Behave test coverage. --- ### ✅ Specification Alignment - `agents plan prompt <PLAN_ID> <GUIDANCE>` correctly implemented as a Typer subcommand with two positional arguments matching the spec synopsis. - Execute-phase validation is correct: only plans in Execute phase with QUEUED/PROCESSING/ERRORED states accept prompts. - User-intervention decision recording follows the decision tree model described in the spec. - All 6 output formats (rich, color, table, plain, json, yaml) are supported. - Spec-compliant machine envelope structure (`command`, `status`, `exit_code`, `data`, `timing`, `messages`) is present. ### ✅ API Consistency - The `prompt_plan_cmd` follows the same patterns as other plan commands (`plan_artifacts`, `plan_correct`, etc.) — Typer annotations, `--format` option, error handling with `typer.Abort()`. - The facade handler `_handle_plan_prompt` follows the established pattern: check `svc is not None and plan_id`, delegate to service, fallback to stub. - The `_notify_facade` call after the service call is consistent with other CLI commands. - Error response patterns (ValidationError → "Validation Error", PlanError → "Prompt Error", CleverAgentsError → "Error") are consistent. ### ✅ Test Quality (5 Behave Scenarios) 1. **Discoverability** — `prompt` appears in `--help` output 2. **Happy path (JSON)** — envelope structure verified (command, status, data fields, specific values) 3. **Format compliance** — all 6 formats via Scenario Outline with guidance text verification 4. **Error path** — inactive phase rejection with proper error message assertion 5. **Facade routing** — A2A dispatch returns real data (not stub), verifies plan_id and guidance passthrough Tests are meaningful BDD scenarios, not coverage padding. Edge cases (inactive phase) and the full format matrix are covered. ### ✅ Correctness - `time.monotonic()` correctly used for elapsed time measurement (not wall clock). - `time.time()` correctly used for the streaming display (replacing the previous `from time import time` pattern). - State transition logic in `prompt_plan`: correctly moves to PROCESSING, clears `error_message` when recovering from ERRORED state. - Decision recording failure is caught and logged (warning level) without crashing the prompt delivery — this is the right priority: guidance delivery > decision bookkeeping. - `guidance.strip()` validation prevents blank guidance injection. ### ✅ Code Quality - `import time` moved to module level (was previously imported inside function bodies) — fixes the CONTRIBUTING.md rule about top-level imports. - `DecisionType` import moved to module level in `plan_lifecycle_service.py` — same fix. - `nosemgrep`/`nosec` annotations on `compile()`/`exec()` in `wrapping.py` are appropriate documentation of intentional security decisions in the controlled sandbox. - ASV `--launch-method=spawn` is a known fix for multiprocessing issues in CI. ### ✅ Commit Format - Single commit: `feat(cli): implement plan prompt command` - Conventional Changelog format ✅ - Detailed body explaining the change ✅ - `ISSUES CLOSED: #885` footer ✅ ### ✅ PR Metadata - Title follows Conventional Changelog ✅ - Body has summary + validation + `Closes #885` ✅ - Label: `Type/Feature` (exactly one Type/ label) ✅ - Milestone: v3.3.0 (matches issue #885) ✅ - Branch: `feature/m3-plan-prompt` (matches issue metadata) ✅ ### ⚠️ Minor Observations (non-blocking) 1. The `queue` field in the prompt response (`"pending": 1, "applied": 0`) is hardcoded rather than reflecting actual queue state. This is acceptable for the initial implementation since no real guidance queue exists yet, but should be tracked for future work. 2. The `_handle_plan_prompt` facade handler uses `**prompt_payload` spread which could theoretically overwrite `plan_id`/`guidance`/`status` keys if the service returns conflicting keys. In practice this won't happen with the current `prompt_plan` return structure, but worth noting for future maintainers. ### Decision **APPROVED** — The implementation is spec-aligned, well-tested, follows project conventions, and the commit format is correct. Proceeding with squash-merge.

freemo merged commit d59fa47fd0 into master

2026-04-02 16:50:56 +00:00

freemo deleted branch feature/m3-plan-prompt

2026-04-02 16:50:56 +00:00

freemo referenced this pull request

2026-04-02 16:51:09 +00:00

feat(cli): implement `plan prompt` command per specification #885

freemo referenced this pull request

2026-04-02 17:11:30 +00:00

[Automated] Product Build Session State #1314

freemo referenced this pull request

2026-04-02 17:42:02 +00:00