fix(cli): add missing --format flag to action create + wire real LLM actors into plan executor #963

Merged
freemo merged 3 commits from bugfix/m2-action-format-plan-executor into master 2026-03-16 00:08:09 +00:00

Summary

Two critical bug fixes required for the M2 E2E acceptance test (robot/e2e/m2_acceptance.robot) to pass:

Bug #1: action create missing --format flag (#959)

The action create CLI command was the only action subcommand missing the --format/-f parameter. All other action subcommands (list, show, archive) already accepted --format and routed through _print_action(). Running action create --config action.yaml --format plain failed with a Typer unrecognized-option error.

Fix: Added the fmt parameter to the create() function signature and wired it to _print_action().
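As a rough illustration of the fix, the sketch below shows a create function accepting a format argument and handing the result to a shared renderer. It uses only the standard library rather than Typer. `render_action()` is a hypothetical stand-in for `_print_action()`, whose real signature is not shown in this PR. The dict-based config and the `name`/`actor` fields are also assumptions.

```python
import json


def render_action(action: dict, fmt: str = "plain") -> str:
    """Hypothetical stand-in for _print_action(): render one action record."""
    if fmt == "json":
        return json.dumps(action, indent=2, sort_keys=True)
    if fmt == "plain":
        return "\n".join(f"{key}: {value}" for key, value in action.items())
    raise ValueError(f"unsupported format: {fmt!r}")


def create(config: dict, fmt: str = "plain") -> str:
    """Sketch of a create command that now accepts a format argument.

    The real command reads a YAML file via --config; a dict is used here
    (an assumption) to keep the example dependency-free.
    """
    action = {"name": config["name"], "actor": config["actor"]}
    # Route through the same renderer the other subcommands use.
    return render_action(action, fmt)


if __name__ == "__main__":
    print(create({"name": "demo", "actor": "provider/model"}, fmt="json"))
```

Routing every subcommand through one renderer keeps the set of supported formats in a single place, so a new subcommand only needs to accept the flag and pass it along.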

Bug #2: plan execute always used stub actors (#960)

The plan execute CLI command only performed phase transitions (Strategize → Execute) and never invoked PlanExecutor to drive the strategize or execute actors. In addition, PlanExecutor.__init__ unconditionally created StrategizeStubActor() and ExecuteStubActor(), which parse text locally and return empty changesets, so no real LLM call was ever made.

Fix:

  • Added _get_plan_executor() helper that resolves ProviderRegistry from the DI container and constructs LLMStrategizeActor / LLMExecuteActor for real LLM invocations
  • Updated execute_plan CLI to detect plan phase/state and run the appropriate actor:
    • Strategize/queued → run strategize actor → transition to Execute
    • Strategize/complete → phase transition only (backward compat)
    • Execute/queued → run execute actor → mark complete
  • New llm_actors.py module with LLMStrategizeActor and LLMExecuteActor that resolve provider/model actor names to live LangChain LLM instances
  • PlanExecutor.__init__ now accepts optional strategize_actor and execute_actor parameters
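The phase/state handling listed above can be sketched roughly as follows. This is not the actual `execute_plan` code: the `Plan` dataclass, the callback parameters, and the assumption that a plan lands in `Execute/queued` after the transition are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Plan:
    """Hypothetical minimal plan record with a phase and a state."""
    phase: str
    state: str


def execute_plan(
    plan: Plan,
    run_strategize: Callable[[Plan], None],
    run_execute: Callable[[Plan], None],
) -> List[str]:
    """Run the step that matches the plan's current phase/state."""
    steps: List[str] = []
    key = (plan.phase, plan.state)
    if key == ("Strategize", "queued"):
        run_strategize(plan)
        steps.append("strategize")
        # Assumption: the plan is queued again once it enters Execute.
        plan.phase, plan.state = "Execute", "queued"
        steps.append("transition")
    elif key == ("Strategize", "complete"):
        # Backward-compatible path: transition only, no actor call.
        plan.phase, plan.state = "Execute", "queued"
        steps.append("transition")
    elif key == ("Execute", "queued"):
        run_execute(plan)
        plan.state = "complete"
        steps.append("execute")
    else:
        raise ValueError(f"nothing to do for {plan.phase}/{plan.state}")
    return steps
```

Under these assumptions, calling `execute_plan` twice on a freshly queued plan runs the strategize step first and the execute step second.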

Existing mock-based BDD tests remain backward-compatible via duck-typing fallback.
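A minimal sketch of optional actor injection with a stub fallback is shown below. The class names `PlanExecutor`, `StrategizeStubActor`, and `ExecuteStubActor` come from this PR, but the `run()` method, the return shape, and `RecordingActor` are assumptions made for illustration. The real interfaces may differ.

```python
from typing import Any, List, Optional


class StrategizeStubActor:
    """Stub: does no LLM call and returns an empty changeset."""
    def run(self, prompt: str) -> dict:
        return {"source": "stub", "changeset": []}


class ExecuteStubActor:
    """Stub: does no LLM call and returns an empty changeset."""
    def run(self, prompt: str) -> dict:
        return {"source": "stub", "changeset": []}


class PlanExecutor:
    def __init__(
        self,
        strategize_actor: Optional[Any] = None,
        execute_actor: Optional[Any] = None,
    ) -> None:
        # Fall back to stubs only when no actor is injected.
        self.strategize_actor = (
            strategize_actor if strategize_actor is not None else StrategizeStubActor()
        )
        self.execute_actor = (
            execute_actor if execute_actor is not None else ExecuteStubActor()
        )

    def strategize(self, prompt: str) -> dict:
        # Duck typing: any object exposing run(prompt) is accepted.
        return self.strategize_actor.run(prompt)

    def execute(self, prompt: str) -> dict:
        return self.execute_actor.run(prompt)


class RecordingActor:
    """Hypothetical test double that records the prompts it receives."""
    def __init__(self) -> None:
        self.prompts: List[str] = []

    def run(self, prompt: str) -> dict:
        self.prompts.append(prompt)
        return {"source": "recording", "changeset": []}
```

Because the constructor arguments default to `None`, callers that build `PlanExecutor()` with no arguments keep the stub behavior, while tests or the CLI can pass in any object with a compatible `run()` method.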

Testing

  • Added Behave BDD scenarios for --format plain and --format json on action create
  • Added Behave BDD scenarios testing custom actor injection into PlanExecutor
  • All changes validated against M2 E2E test expectations

Issues Closed

  • Closes #959
  • Closes #960

test(e2e): E2E acceptance criteria for M2 (v3.1.0) — actor compiler and LLM integration
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 20s
CI / build (pull_request) Successful in 19s
CI / security (pull_request) Successful in 35s
CI / typecheck (pull_request) Successful in 36s
CI / e2e_tests (pull_request) Failing after 52s
CI / integration_tests (pull_request) Successful in 2m58s
CI / unit_tests (pull_request) Successful in 3m36s
CI / docker (pull_request) Successful in 35s
CI / coverage (pull_request) Successful in 4m41s
CI / benchmark-regression (pull_request) Failing after 40m8s
d879ba1f96
Add Robot Framework E2E test suite robot/e2e/m2_acceptance.robot exercising
M2 acceptance criteria with zero mocking. Test creates a temp git repo with
sample project files, registers a custom actor via CLI, sets up resource and
project, creates an action referencing the actor, and runs the full plan
lifecycle (use → execute strategize → execute → diff → apply). Validates
actor YAML compilation, skill registry, tool lifecycle, and LLM integration
through real CLI invocations with real provider API keys. Uses flexible
structural assertions and expected_rc=None for LLM-dependent commands.

ISSUES CLOSED: #742
freemo force-pushed bugfix/m2-action-format-plan-executor from d808c010d9
Some checks failed
CI / lint (pull_request) Successful in 26s
CI / typecheck (pull_request) Successful in 43s
CI / security (pull_request) Successful in 1m3s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 55s
CI / build (pull_request) Successful in 24s
CI / e2e_tests (pull_request) Successful in 2m34s
CI / unit_tests (pull_request) Successful in 5m33s
CI / integration_tests (pull_request) Successful in 5m51s
CI / docker (pull_request) Successful in 57s
CI / coverage (pull_request) Successful in 5m58s
CI / benchmark-regression (pull_request) Has been cancelled
to 0a0e2796a5
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 21s
CI / quality (pull_request) Successful in 30s
CI / build (pull_request) Successful in 21s
CI / typecheck (pull_request) Successful in 1m6s
CI / security (pull_request) Successful in 1m8s
CI / e2e_tests (pull_request) Successful in 2m7s
CI / unit_tests (pull_request) Successful in 3m17s
CI / integration_tests (pull_request) Successful in 3m42s
CI / docker (pull_request) Successful in 55s
CI / coverage (pull_request) Successful in 5m58s
CI / benchmark-regression (pull_request) Successful in 38m7s
2026-03-15 23:18:11 +00:00
freemo force-pushed bugfix/m2-action-format-plan-executor from 0a0e2796a5
to dfa05a6909
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 26s
CI / quality (pull_request) Successful in 32s
CI / build (pull_request) Successful in 35s
CI / security (pull_request) Successful in 52s
CI / typecheck (pull_request) Successful in 56s
CI / e2e_tests (pull_request) Successful in 1m36s
CI / unit_tests (pull_request) Successful in 3m25s
CI / integration_tests (pull_request) Successful in 3m55s
CI / docker (pull_request) Successful in 57s
CI / coverage (pull_request) Successful in 6m54s
CI / lint (push) Successful in 14s
CI / quality (push) Successful in 35s
CI / typecheck (push) Successful in 39s
CI / security (push) Successful in 51s
CI / benchmark-regression (push) Has been skipped
CI / build (push) Successful in 21s
CI / e2e_tests (push) Successful in 2m14s
CI / unit_tests (push) Successful in 5m9s
CI / integration_tests (push) Successful in 5m29s
CI / docker (push) Successful in 1m6s
CI / coverage (push) Successful in 6m12s
CI / benchmark-publish (push) Successful in 20m51s
CI / benchmark-regression (pull_request) Successful in 37m16s
2026-03-16 00:00:15 +00:00
freemo scheduled this pull request to auto merge when all checks succeed 2026-03-16 00:00:45 +00:00
freemo merged commit dfa05a6909 into master 2026-03-16 00:08:09 +00:00
freemo deleted branch bugfix/m2-action-format-plan-executor 2026-03-16 00:08:09 +00:00
Reference
cleveragents/cleveragents-core!963