fix(cli): wire real LLM actors into plan executor for production execution #960

Closed
opened 2026-03-15 22:13:51 +00:00 by freemo · 0 comments
Owner

Metadata

  • Commit Message: fix(cli): wire real LLM actors into plan executor for production execution
  • Branch: bugfix/m2-action-format-plan-executor

Background

The _get_plan_executor() function in plan.py creates a PlanExecutor with only the lifecycle service. PlanExecutor.__init__ then unconditionally creates StrategizeStubActor() and ExecuteStubActor(): stub actors that parse the definition_of_done text locally and return empty ChangeSets without invoking any LLM. As a result, plan execute in production always uses stubs instead of real LLM providers.

The real LLM execution infrastructure already exists: ProviderRegistry.create_ai_provider() → LangChainChatProvider → PlanGenerationGraph. The PlanService._resolve_ai_provider_for_actor() resolves actor names (e.g., openai/gpt-4) to real LLM provider instances. But PlanExecutor never uses any of this.
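To make the name-resolution step concrete, here is a minimal sketch of how an actor name such as openai/gpt-4 might be mapped to a provider instance. Everything in it is hypothetical and for illustration only: SketchRegistry, FakeProvider, and the assumption that actor names take the form "<vendor>/<model>" are not taken from the project's ProviderRegistry or PlanService code, whose real signatures are not shown in this issue.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class FakeProvider:
    """Stand-in for a real provider object (hypothetical, illustration only)."""
    vendor: str
    model: str


class SketchRegistry:
    """Hypothetical registry mapping a vendor prefix to a provider factory."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[str], FakeProvider]] = {}

    def register(self, vendor: str, factory: Callable[[str], FakeProvider]) -> None:
        self._factories[vendor] = factory

    def resolve(self, actor_name: str) -> FakeProvider:
        # Assumption: actor names look like "<vendor>/<model>".
        vendor, sep, model = actor_name.partition("/")
        if not sep or not vendor or not model:
            raise ValueError(f"malformed actor name: {actor_name!r}")
        try:
            factory = self._factories[vendor]
        except KeyError:
            raise LookupError(f"no provider registered for {vendor!r}") from None
        return factory(model)


registry = SketchRegistry()
registry.register("openai", lambda model: FakeProvider("openai", model))
provider = registry.resolve("openai/gpt-4")
```

Failing loudly on a malformed or unknown name, rather than silently falling back to a stub at this layer, keeps the fallback decision in one place (the code that builds the executor).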

The M2 E2E acceptance test (PR #793) exercises the full plan lifecycle with real LLM API keys and expects real code generation. With stub actors, the changeset is always empty and no actual LLM calls are made.

Current Behavior

plan execute always uses StrategizeStubActor and ExecuteStubActor, regardless of what actors are configured on the plan's action. No real LLM calls are ever made. The ExecuteStubActor returns an empty ChangeSet with zero entries.

Expected Behavior

plan execute should resolve the plan's strategy_actor and execution_actor names to real LLM providers via the existing ProviderRegistry / ActorService infrastructure, invoke the LLM for strategy generation and code execution, and return results with actual content.

Acceptance Criteria

  • plan execute calls real LLM providers when valid actor names and API keys are configured
  • Strategy phase produces non-empty decisions from the LLM
  • Execute phase produces a ChangeSet with actual file entries from the LLM
  • PlanExecutor accepts optional real actor implementations (no hardcoded stubs)
  • _get_plan_executor() uses the DI container to wire real actors
  • Existing tests continue to pass (stubs remain default when no provider is available)
  • Coverage >= 97% maintained
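The injection criteria above can be sketched as follows. This is a simplified stand-in, not the project's code: the classes reuse the names from this issue, but their bodies, the method names strategize, execute, and run, and the ChangeSet fields are all assumptions made for illustration. RecordingExecuteActor is a hypothetical test double.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Protocol


@dataclass
class ChangeSet:
    """Simplified stand-in; the real ChangeSet fields are not shown in this issue."""
    entries: List[str] = field(default_factory=list)


class StrategizeActor(Protocol):
    def strategize(self, definition_of_done: str) -> List[str]: ...


class ExecuteActor(Protocol):
    def execute(self, decisions: List[str]) -> ChangeSet: ...


class StrategizeStubActor:
    def strategize(self, definition_of_done: str) -> List[str]:
        # Parses the text locally; no LLM is involved.
        return [ln.strip() for ln in definition_of_done.splitlines() if ln.strip()]


class ExecuteStubActor:
    def execute(self, decisions: List[str]) -> ChangeSet:
        # Always returns an empty ChangeSet, mirroring the behavior described above.
        return ChangeSet()


class PlanExecutor:
    def __init__(
        self,
        lifecycle_service: Any,
        strategize_actor: Optional[StrategizeActor] = None,
        execute_actor: Optional[ExecuteActor] = None,
    ) -> None:
        self._lifecycle = lifecycle_service
        # Stubs remain the default only when no actor is injected.
        self._strategize_actor = (
            strategize_actor if strategize_actor is not None else StrategizeStubActor()
        )
        self._execute_actor = (
            execute_actor if execute_actor is not None else ExecuteStubActor()
        )

    def run(self, definition_of_done: str) -> ChangeSet:
        decisions = self._strategize_actor.strategize(definition_of_done)
        return self._execute_actor.execute(decisions)


class RecordingExecuteActor:
    """Hypothetical injected actor standing in for an LLM-backed one."""

    def execute(self, decisions: List[str]) -> ChangeSet:
        return ChangeSet(entries=[f"handled: {d}" for d in decisions])


default_result = PlanExecutor(lifecycle_service=None).run("add a README")
injected_result = PlanExecutor(
    lifecycle_service=None, execute_actor=RecordingExecuteActor()
).run("add a README")
```

With this shape, existing callers that pass only the lifecycle service keep the stub behavior, while new callers can supply real actors without PlanExecutor knowing how they were built.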

Subtasks

  • Modify PlanExecutor.__init__ to accept optional strategize_actor and execute_actor parameters
  • Create LLMStrategizeActor that resolves actor name → ProviderRegistry → real LLM for strategy
  • Create LLMExecuteActor that resolves actor name → ProviderRegistry → real LLM for execution
  • Update _get_plan_executor() in plan.py to use DI container for real actor wiring
  • Tests (Behave): Add BDD scenarios for plan executor with injected actors
  • Tests (Robot): Verify existing Robot integration tests still pass
  • Update CHANGELOG.md
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
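One possible shape for the wiring subtask is sketched below: collect constructor arguments from a container and leave out any actor the container cannot supply, so the executor's stub defaults apply. SketchContainer, try_resolve, build_executor_kwargs, and the string keys are hypothetical; the project's actual DI container API is not shown in this issue and may differ.

```python
from typing import Any, Callable, Dict, Optional


class SketchContainer:
    """Hypothetical DI container: maps a key to a zero-argument factory."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}

    def register(self, key: str, factory: Callable[[], Any]) -> None:
        self._factories[key] = factory

    def try_resolve(self, key: str) -> Optional[Any]:
        # Returns None instead of raising when nothing is registered.
        factory = self._factories.get(key)
        return factory() if factory is not None else None


def build_executor_kwargs(container: SketchContainer) -> Dict[str, Any]:
    """Collect constructor arguments, omitting actors the container cannot supply."""
    kwargs: Dict[str, Any] = {
        "lifecycle_service": container.try_resolve("lifecycle_service")
    }
    for key in ("strategize_actor", "execute_actor"):
        actor = container.try_resolve(key)
        if actor is not None:
            kwargs[key] = actor
    return kwargs


# No actors registered: only the lifecycle service is passed along.
bare = SketchContainer()
bare.register("lifecycle_service", lambda: "lifecycle")
bare_kwargs = build_executor_kwargs(bare)

# One real actor registered: it is forwarded, the other falls back to the default.
wired = SketchContainer()
wired.register("lifecycle_service", lambda: "lifecycle")
wired.register("execute_actor", lambda: "llm-execute-actor")
wired_kwargs = build_executor_kwargs(wired)
```

The resulting dictionary would be unpacked into the executor constructor, for example PlanExecutor(**kwargs), which is what lets existing tests without a configured provider keep running against stubs.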

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.1.0 milestone 2026-03-15 22:13:59 +00:00
Reference
cleveragents/cleveragents-core#960