feat(actor): add plan_subplan tool and decision emission #463

2026-02-27T08:12:01Z

aditya commented

2026-02-27 08:12:01 +00:00

Description

Implements the builtin/plan-subplan tool that strategy actors call to emit
SUBPLAN_SPAWN or SUBPLAN_PARALLEL_SPAWN decisions when decomposing a large
plan into coordinated child plans. Validates the subplan payload, applies
sensible defaults, generates human-readable rationale text, and optionally
persists the emitted Decision via an injected DecisionService.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactoring (no functional changes)
Documentation update
Test improvements
CI/CD changes

Quality Checklist

Code follows the project's coding standards (see CONTRIBUTING.md)
All public/protected methods have argument validation
Static typing is complete (no Any unless justified)
nox -s typecheck passes with no errors
nox -s lint passes with no errors
Unit tests written/updated (Behave scenarios in features/)
Integration tests written/updated (Robot suites in robot/) if applicable
Coverage remains above 97% (nox -s coverage_report)
No security issues introduced (nox -s security_scan)
No dead code introduced (nox -s dead_code)
Documentation updated if behavior changed

Testing

All tests written before implementation (TDD). Tests cover validation,
defaults, serial vs parallel decision types, rationale generation, decision
service injection, registry registration, and ToolRunner execution.

Test Commands Run

nox -s unit_tests       # 20 Behave scenarios, 70 steps — all passed
nox -s typecheck        # 0 errors, 0 warnings
nox -s lint             # 0 errors (ruff, bandit)
nox -s coverage_report  # 100% coverage on subplan_tool.py
python -m robot robot/plan_subplan_tool.robot  # 9 smoke tests — all passed

Related Issues

ISSUES CLOSED #198

Implementation Notes

New files

File	Purpose
`src/cleveragents/tool/builtins/subplan_tool.py`	Core tool — `SubplanPayload`, `make_plan_subplan_spec`, `PLAN_SUBPLAN_SPEC`
`features/plan_subplan_tool.feature`	Behave BDD scenarios (20 scenarios)
`features/steps/plan_subplan_tool_steps.py`	Step definitions with `_StubDecisionService`
`robot/plan_subplan_tool.robot`	Robot Framework smoke suite (9 tests)
`robot/helper_plan_subplan_tool.py`	Robot helper script
`benchmarks/subplan_actor_tool_bench.py`	ASV benchmarks (5 suites)
`examples/actors/strategy_with_subplan.yaml`	Actor YAML example with serial + parallel payload examples

Modified files

src/cleveragents/tool/builtins/__init__.py — added register_subplan_tool, SubplanPayload, make_plan_subplan_spec, ALL_SUBPLAN_TOOLS exports
CHANGELOG.md — added #198 entry

Architecture decisions

Optional DecisionService injection via factory: The tool itself is a
pure function. make_plan_subplan_spec(decision_service=None) creates a
ToolSpec whose handler is a closure bound to the service. In production the
runtime injects the real DecisionService; in tests a list-based stub is used,
keeping 100% of tool logic testable without a database.

SubplanPayload as the single validation layer: All input constraints
(non-empty goal, at least one of resource_scopes/project_ref, max_parallel
1–50, valid merge_strategy enum, valid context_view value) are enforced by a
frozen Pydantic model. Any validation failure returns {"success": False, "error": ...} rather than raising, so the ToolRunner never sees an unhandled exception.

Decision type driven by parallel flag: parallel=False emits
SUBPLAN_SPAWN; parallel=True emits SUBPLAN_PARALLEL_SPAWN. The
dependencies list (sibling subplan ULIDs) is only meaningful for parallel
spawns but is accepted and stored in both cases for forward compatibility.

Rationale auto-generation: _build_rationale() constructs a multi-line
human-readable string from the payload (goal, targeted resources, execution
mode, dependencies, context view override). This text is stored on the
Decision.rationale field and surfaced in plan explain output.

## Description Implements the `builtin/plan-subplan` tool that strategy actors call to emit `SUBPLAN_SPAWN` or `SUBPLAN_PARALLEL_SPAWN` decisions when decomposing a large plan into coordinated child plans. Validates the subplan payload, applies sensible defaults, generates human-readable rationale text, and optionally persists the emitted `Decision` via an injected `DecisionService`. ## Type of Change - [ ] Bug fix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [ ] Refactoring (no functional changes) - [x] Documentation update - [ ] Test improvements - [ ] CI/CD changes ## Quality Checklist - [x] Code follows the project's coding standards (see CONTRIBUTING.md) - [x] All public/protected methods have argument validation - [x] Static typing is complete (no `Any` unless justified) - [x] `nox -s typecheck` passes with no errors - [x] `nox -s lint` passes with no errors - [x] Unit tests written/updated (Behave scenarios in `features/`) - [x] Integration tests written/updated (Robot suites in `robot/`) if applicable - [x] Coverage remains above 97% (`nox -s coverage_report`) - [x] No security issues introduced (`nox -s security_scan`) - [x] No dead code introduced (`nox -s dead_code`) - [x] Documentation updated if behavior changed ## Testing All tests written before implementation (TDD). Tests cover validation, defaults, serial vs parallel decision types, rationale generation, decision service injection, registry registration, and ToolRunner execution. ### Test Commands Run ```bash nox -s unit_tests # 20 Behave scenarios, 70 steps — all passed nox -s typecheck # 0 errors, 0 warnings nox -s lint # 0 errors (ruff, bandit) nox -s coverage_report # 100% coverage on subplan_tool.py python -m robot robot/plan_subplan_tool.robot # 9 smoke tests — all passed ``` ## Related Issues ISSUES CLOSED #198 ## Implementation Notes ### New files | File | Purpose | |---|---| | `src/cleveragents/tool/builtins/subplan_tool.py` | Core tool — `SubplanPayload`, `make_plan_subplan_spec`, `PLAN_SUBPLAN_SPEC` | | `features/plan_subplan_tool.feature` | Behave BDD scenarios (20 scenarios) | | `features/steps/plan_subplan_tool_steps.py` | Step definitions with `_StubDecisionService` | | `robot/plan_subplan_tool.robot` | Robot Framework smoke suite (9 tests) | | `robot/helper_plan_subplan_tool.py` | Robot helper script | | `benchmarks/subplan_actor_tool_bench.py` | ASV benchmarks (5 suites) | | `examples/actors/strategy_with_subplan.yaml` | Actor YAML example with serial + parallel payload examples | ### Modified files - `src/cleveragents/tool/builtins/__init__.py` — added `register_subplan_tool`, `SubplanPayload`, `make_plan_subplan_spec`, `ALL_SUBPLAN_TOOLS` exports - `CHANGELOG.md` — added `#198` entry ### Architecture decisions **Optional `DecisionService` injection via factory:** The tool itself is a pure function. `make_plan_subplan_spec(decision_service=None)` creates a `ToolSpec` whose handler is a closure bound to the service. In production the runtime injects the real `DecisionService`; in tests a list-based stub is used, keeping 100% of tool logic testable without a database. **`SubplanPayload` as the single validation layer:** All input constraints (non-empty goal, at least one of `resource_scopes`/`project_ref`, `max_parallel` 1–50, valid `merge_strategy` enum, valid `context_view` value) are enforced by a frozen Pydantic model. Any validation failure returns `{"success": False, "error": ...}` rather than raising, so the ToolRunner never sees an unhandled exception. **Decision type driven by `parallel` flag:** `parallel=False` emits `SUBPLAN_SPAWN`; `parallel=True` emits `SUBPLAN_PARALLEL_SPAWN`. The `dependencies` list (sibling subplan ULIDs) is only meaningful for parallel spawns but is accepted and stored in both cases for forward compatibility. **Rationale auto-generation:** `_build_rationale()` constructs a multi-line human-readable string from the payload (goal, targeted resources, execution mode, dependencies, context view override). This text is stored on the `Decision.rationale` field and surfaced in `plan explain` output.

aditya added this to the v3.4.0 milestone 2026-02-27 08:12:01 +00:00

aditya added the

Type

Feature

label 2026-02-27 08:12:01 +00:00

aditya added 1 commit 2026-02-27 08:12:02 +00:00

feat(actor): add plan_subplan tool and decision emission

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 16s

Details

CI / build (pull_request) Successful in 19s

Details

CI / quality (pull_request) Successful in 19s

Details

CI / security (pull_request) Successful in 32s

Details

CI / typecheck (pull_request) Successful in 44s

Details

CI / integration_tests (pull_request) Successful in 2m46s

Details

CI / unit_tests (pull_request) Failing after 19m21s

Details

CI / docker (pull_request) Has been skipped

Details

CI / benchmark-regression (pull_request) Successful in 21m3s

Details

CI / coverage (pull_request) Has been cancelled

Details

b888afab71

Add builtin/plan-subplan tool for strategy actors to emit SUBPLAN_SPAWN
or SUBPLAN_PARALLEL_SPAWN decisions when decomposing a plan into child
plans. Implements all acceptance criteria from issue #198:

- SubplanPayload (Pydantic) validates goal, resource_scopes/project_ref
  (at least one required), merge_strategy, max_parallel (1-50), parallel
  flag, dependencies, and context_view override.
- Defaults: merge_strategy=git_three_way, max_parallel=5, parallel=False,
  dependencies=[]. Omitted fields inherit sensible values automatically.
- make_plan_subplan_spec(decision_service=None) factory supports optional
  DecisionService injection for persistent decision recording.
- _build_rationale() generates human-readable rationale text (goal, scope,
  execution mode, dependencies, context_view) surfaced in plan explain.
- register_subplan_tool() added to tool/builtins/__init__.py for bulk
  registration. PLAN_SUBPLAN_SPEC exported as the default ready-to-use spec.
- Actor YAML example (examples/actors/strategy_with_subplan.yaml) with
  annotated serial and parallel spawn payload examples.
- Behave BDD: 20 scenarios, 70 steps covering validation, defaults,
  decision type, rationale, service injection, registry, and ToolRunner.
- Robot Framework: 9 smoke tests via robot/plan_subplan_tool.robot.
- ASV benchmarks: benchmarks/subplan_actor_tool_bench.py (5 suites).
- Coverage: 100% on subplan_tool.py. Lint, typecheck, and security clean.

ISSUES CLOSED: #198

aditya requested review from CoreRasurae 2026-02-27 08:12:13 +00:00

aditya added 1 commit 2026-02-27 08:46:33 +00:00

Merge branch 'master' into feature/m5-subplan-actor

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 16s

Details

CI / build (pull_request) Successful in 18s

Details

CI / quality (pull_request) Successful in 18s

Details

CI / typecheck (pull_request) Successful in 33s

Details

CI / security (pull_request) Successful in 42s

Details

CI / integration_tests (pull_request) Successful in 2m40s

Details

CI / unit_tests (pull_request) Failing after 19m35s

Details

CI / docker (pull_request) Has been skipped

Details

CI / benchmark-regression (pull_request) Successful in 19m36s

Details

CI / coverage (pull_request) Has been cancelled

Details

2b5b73c689

aditya added 1 commit 2026-02-27 09:28:24 +00:00

fix(tests): update actor_examples count to 7 after subplan example

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 13s

Details

CI / build (pull_request) Successful in 16s

Details

CI / quality (pull_request) Successful in 17s

Details

CI / typecheck (pull_request) Successful in 31s

Details

CI / security (pull_request) Successful in 35s

Details

CI / integration_tests (pull_request) Successful in 2m50s

Details

CI / unit_tests (pull_request) Successful in 20m54s

Details

CI / docker (pull_request) Successful in 39s

Details

CI / benchmark-regression (pull_request) Successful in 26m10s

Details

CI / coverage (pull_request) Successful in 48m2s

Details

131be21e80

Refs: #198

aditya force-pushed feature/m5-subplan-actor from 131be21e80 to 4c4bd74d94

2026-02-27 09:36:55 +00:00

Compare

aditya force-pushed feature/m5-subplan-actor from 4c4bd74d94 to 131be21e80

2026-02-27 09:38:36 +00:00

Compare

CoreRasurae commented

2026-02-27 11:31:07 +00:00

Code Review — Bugs & Security Findings

Reviewed commits b888afab (feat) and 131be21e (fix) on feature/m5-subplan-actor against issue #198 acceptance criteria and docs/specification.md.

BUGS

B1 [High] `parent_decision_id` accepted but never returned to caller

File: src/cleveragents/tool/builtins/subplan_tool.py:268-281

The parent_decision_id field is accepted in SubplanPayload (line 149), stored in the Decision object (line 255), but the handler's return dict (lines 268-281) omits it entirely. The calling actor has no way to confirm the parent linkage was recorded.

The corresponding Behave scenario ("parent_decision_id is included in the decision when provided" at features/plan_subplan_tool.feature:112-116) only asserts success=True and decision_type — it never verifies parent_decision_id is present in the result, making the test name misleading and the assertion vacuous.

Fix: Add "parent_decision_id": payload.parent_decision_id to the return dict at line 281, and update the test to assert its value.

B2 [Medium] "Verify all examples are documented" scenario not updated

File: features/actor_examples.feature:665-677

The fix commit (131be21e) correctly updated the "Verify all example files exist" scenario (line 654) to expect 7 files including strategy_with_subplan.yaml. However, the adjacent "Verify all examples are documented" scenario (line 665) was not updated to check that strategy_with_subplan.yaml is referenced in docs/reference/actors_examples.md. Additionally, docs/reference/actors_examples.md does not appear to have been updated to reference the new example.

B3 [Medium] Specification naming discrepancy

Spec: docs/specification.md §Child Plan Spawning Mechanism:

"The local/create-subplan tool is independently registered and referenced by name in the actor graph node."

Implementation: builtin/plan-subplan (subplan_tool.py:76)

These differ in both namespace (local/ vs builtin/) and name (create-subplan vs plan-subplan). Either the specification or the implementation should be updated for consistency.

SECURITY

S1 [High] No ULID format validation on `plan_id`, `dependencies`, `parent_decision_id`

File: src/cleveragents/tool/builtins/subplan_tool.py:115,141-143,149-151

SubplanPayload accepts arbitrary strings for plan_id (only min_length=1), dependencies (no validation), and parent_decision_id (no validation). These fields semantically represent ULIDs.

The Decision model in the same codebase (domain/models/core/decision.py) rigorously validates ULID fields with a regex pattern (^[0-9A-HJKMNP-TV-Z]{26}$) and dedicated @field_validator methods. This inconsistency means:

Malformed plan_id values pass the tool handler, then fail inside Decision(**decision_kwargs), producing a confusing error message about the Decision model rather than about the tool input.
Malformed dependencies entries flow into the rationale text completely unchecked.

Fix: Add @field_validator methods on plan_id, dependencies, and parent_decision_id in SubplanPayload enforcing the same ULID regex pattern used by the Decision model.

S2 [Medium] No path validation on `resource_scopes`

File: src/cleveragents/tool/builtins/subplan_tool.py:119-121

resource_scopes accepts arbitrary strings with no format or sanitization checks. Values like [""], ["../../etc/passwd"], or extremely long strings are accepted and flow into the Decision rationale and downstream processing. Early validation (non-empty strings, no path traversal sequences) would prevent garbage from entering the decision tree.

S3 [Low] Overly broad exception handling masks programming errors

File: src/cleveragents/tool/builtins/subplan_tool.py:226

try:
    payload = SubplanPayload(**inputs)
except Exception as exc:  # catches everything
    return {"success": False, "error": str(exc)}

This catches ValidationError (intended) but also TypeError, AttributeError, KeyError, and any other exception that might indicate a bug in the validation code itself. These bugs would silently become user-facing error messages instead of surfacing as real failures.

Fix: Catch pydantic.ValidationError and ValueError explicitly.

## Code Review — Bugs & Security Findings Reviewed commits `b888afab` (feat) and `131be21e` (fix) on `feature/m5-subplan-actor` against issue #198 acceptance criteria and `docs/specification.md`. --- ### BUGS #### B1 [High] `parent_decision_id` accepted but never returned to caller **File:** `src/cleveragents/tool/builtins/subplan_tool.py:268-281` The `parent_decision_id` field is accepted in `SubplanPayload` (line 149), stored in the `Decision` object (line 255), but the handler's return dict (lines 268-281) **omits it entirely**. The calling actor has no way to confirm the parent linkage was recorded. The corresponding Behave scenario ("parent_decision_id is included in the decision when provided" at `features/plan_subplan_tool.feature:112-116`) only asserts `success=True` and `decision_type` — it never verifies `parent_decision_id` is present in the result, making the test name misleading and the assertion vacuous. **Fix:** Add `"parent_decision_id": payload.parent_decision_id` to the return dict at line 281, and update the test to assert its value. --- #### B2 [Medium] "Verify all examples are documented" scenario not updated **File:** `features/actor_examples.feature:665-677` The fix commit (`131be21e`) correctly updated the "Verify all example files exist" scenario (line 654) to expect 7 files including `strategy_with_subplan.yaml`. However, the adjacent "Verify all examples are documented" scenario (line 665) was **not** updated to check that `strategy_with_subplan.yaml` is referenced in `docs/reference/actors_examples.md`. Additionally, `docs/reference/actors_examples.md` does not appear to have been updated to reference the new example. --- #### B3 [Medium] Specification naming discrepancy **Spec:** `docs/specification.md` §Child Plan Spawning Mechanism: > *"The `local/create-subplan` tool is independently registered and referenced by name in the actor graph node."* **Implementation:** `builtin/plan-subplan` (`subplan_tool.py:76`) These differ in both namespace (`local/` vs `builtin/`) and name (`create-subplan` vs `plan-subplan`). Either the specification or the implementation should be updated for consistency. --- ### SECURITY #### S1 [High] No ULID format validation on `plan_id`, `dependencies`, `parent_decision_id` **File:** `src/cleveragents/tool/builtins/subplan_tool.py:115,141-143,149-151` `SubplanPayload` accepts arbitrary strings for `plan_id` (only `min_length=1`), `dependencies` (no validation), and `parent_decision_id` (no validation). These fields semantically represent ULIDs. The `Decision` model **in the same codebase** (`domain/models/core/decision.py`) rigorously validates ULID fields with a regex pattern (`^[0-9A-HJKMNP-TV-Z]{26}$`) and dedicated `@field_validator` methods. This inconsistency means: - Malformed `plan_id` values pass the tool handler, then fail inside `Decision(**decision_kwargs)`, producing a confusing error message about the Decision model rather than about the tool input. - Malformed `dependencies` entries flow into the rationale text completely unchecked. **Fix:** Add `@field_validator` methods on `plan_id`, `dependencies`, and `parent_decision_id` in `SubplanPayload` enforcing the same ULID regex pattern used by the `Decision` model. --- #### S2 [Medium] No path validation on `resource_scopes` **File:** `src/cleveragents/tool/builtins/subplan_tool.py:119-121` `resource_scopes` accepts arbitrary strings with no format or sanitization checks. Values like `[""]`, `["../../etc/passwd"]`, or extremely long strings are accepted and flow into the Decision rationale and downstream processing. Early validation (non-empty strings, no path traversal sequences) would prevent garbage from entering the decision tree. --- #### S3 [Low] Overly broad exception handling masks programming errors **File:** `src/cleveragents/tool/builtins/subplan_tool.py:226` ```python try: payload = SubplanPayload(**inputs) except Exception as exc: # catches everything return {"success": False, "error": str(exc)} ``` This catches `ValidationError` (intended) but also `TypeError`, `AttributeError`, `KeyError`, and any other exception that might indicate a bug in the validation code itself. These bugs would silently become user-facing error messages instead of surfacing as real failures. **Fix:** Catch `pydantic.ValidationError` and `ValueError` explicitly.

freemo added the

labels 2026-03-01 04:36:09 +00:00

aditya force-pushed feature/m5-subplan-actor from cc72ec0936 to eb4e6f59d9

2026-03-02 09:06:37 +00:00

Compare

aditya added 1 commit 2026-03-02 12:37:52 +00:00

Merge branch 'master' into feature/m5-subplan-actor

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 16s

Details

CI / build (pull_request) Successful in 16s

Details

CI / quality (pull_request) Successful in 23s

Details

CI / security (pull_request) Successful in 35s

Details

CI / typecheck (pull_request) Successful in 36s

Details

CI / unit_tests (pull_request) Successful in 2m17s

Details

CI / integration_tests (pull_request) Successful in 2m50s

Details

CI / docker (pull_request) Successful in 38s

Details

CI / coverage (pull_request) Successful in 3m35s

Details

CI / benchmark-regression (pull_request) Successful in 23m2s

Details

d2f9e4a43b

hamza.khyari commented

2026-03-03 14:01:22 +00:00

P1 — Must Fix

P1-1: Sham rationale test — asserts non-empty, not goal presence

features/steps/plan_subplan_tool_steps.py:302-305

The feature scenario "Rationale is auto-generated for serial subplan" claims to verify the goal is mentioned in the rationale, but the step step_then_rationale_goal only asserts assert rationale (truthy / non-empty). Any non-empty string passes.

@then("the plan_subplan result rationale should mention the goal")
def step_then_rationale_goal(ctx: Context) -> None:
    r = ctx.result
    rationale = r.get("rationale") or ""
    assert rationale, "Expected non-empty rationale"

Fix: Assert the actual goal text appears in the rationale. The When step for this scenario passes goal="Refactor models", so:

def step_then_rationale_goal(ctx: Context) -> None:
    r = ctx.result
    rationale = r.get("rationale") or ""
    goal = r.get("goal") or ""
    assert goal and goal in rationale, (
        f"Expected rationale to contain goal {goal!r}, got: {rationale!r}"
    )

P1-2: source="builtin" silently discarded — ToolSpec has no source field

src/cleveragents/tool/builtins/subplan_tool.py:427

return ToolSpec(
    ...
    source="builtin",  # <-- silently ignored
)

ToolSpec (defined in tool/runtime.py:39) has no source field. Pydantic v2 defaults to extra="ignore", so this kwarg is silently dropped. Any downstream code that tries to inspect spec.source will get AttributeError. If the intent is to tag built-in tools by origin, source must be added to ToolSpec, or this parameter should be removed.

Fix: Either add source: str = Field(default="custom") to ToolSpec, or remove the source="builtin" kwarg.

P2 — Should Fix

P2-1: side_effects value diverges from spec

src/cleveragents/tool/builtins/subplan_tool.py:424

The spec (specification.md:18101) defines side_effects: [spawn_subplan] for the create-subplan tool. The implementation uses ["emit_decision"]. While the field is free-form, using a consistent vocabulary with the spec aids auditability and tool-capability queries.

side_effects=["emit_decision"],  # spec says: [spawn_subplan]

Fix: Use ["spawn_subplan"] or ["spawn_subplan", "emit_decision"] to align with the spec vocabulary.

P2-2: Dead TYPE_CHECKING import guard

src/cleveragents/tool/builtins/subplan_tool.py:81-82

if TYPE_CHECKING:
    pass

The block was set up but contains only pass. Either add the intended type-only imports or remove the dead guard.

P2-3: Non-deterministic context_view enum order in JSON schema

src/cleveragents/tool/builtins/subplan_tool.py:92-94, 396

_VALID_CONTEXT_VIEWS is a frozenset, whose iteration order is non-deterministic. The JSON schema enum at line 396 unpacks it via [*list(_VALID_CONTEXT_VIEWS), None], producing a different enum order on each Python process restart. This can cause spurious schema-diff noise in CI.

_VALID_CONTEXT_VIEWS: frozenset[str] = frozenset(
    {"strategist", "executor", "reviewer", "full"}
)

Fix: Use a tuple or sorted() to produce a deterministic order:

_VALID_CONTEXT_VIEWS: tuple[str, ...] = (
    "executor", "full", "reviewer", "strategist"
)

P3 — Nit

P3-1: Weak assertion on parallel rationale scenario

features/plan_subplan_tool.feature:98-101

The scenario "Rationale for parallel subplan mentions parallel execution" only asserts that the rationale contains "parallel". It doesn't verify success=True or decision_type="subplan_parallel_spawn". Other parallel scenarios do check these, so this isn't a gap in coverage, but the scenario itself is under-asserted for its title.

P3-2: Subplan payload fields not structurally persisted on Decision

src/cleveragents/tool/builtins/subplan_tool.py:298-309

The Decision object does not carry resource_scopes, merge_strategy, max_parallel, dependencies, or context_view in structured fields — they only appear in the free-text rationale. This means the plan explain tree can display them (via rationale text) but they're not queryable for programmatic use. Worth a follow-up ticket if structured querying is needed.

Summary

Severity	Count
P1 (Must Fix)	2
P2 (Should Fix)	3
P3 (Nit)	2

### P1 — Must Fix **P1-1: Sham rationale test — asserts non-empty, not goal presence** `features/steps/plan_subplan_tool_steps.py:302-305` The feature scenario "Rationale is auto-generated for serial subplan" claims to verify the goal is mentioned in the rationale, but the step `step_then_rationale_goal` only asserts `assert rationale` (truthy / non-empty). Any non-empty string passes. ```python @then("the plan_subplan result rationale should mention the goal") def step_then_rationale_goal(ctx: Context) -> None: r = ctx.result rationale = r.get("rationale") or "" assert rationale, "Expected non-empty rationale" ``` **Fix:** Assert the actual goal text appears in the rationale. The `When` step for this scenario passes `goal="Refactor models"`, so: ```python def step_then_rationale_goal(ctx: Context) -> None: r = ctx.result rationale = r.get("rationale") or "" goal = r.get("goal") or "" assert goal and goal in rationale, ( f"Expected rationale to contain goal {goal!r}, got: {rationale!r}" ) ``` **P1-2: `source="builtin"` silently discarded — `ToolSpec` has no `source` field** `src/cleveragents/tool/builtins/subplan_tool.py:427` ```python return ToolSpec( ... source="builtin", # <-- silently ignored ) ``` `ToolSpec` (defined in `tool/runtime.py:39`) has no `source` field. Pydantic v2 defaults to `extra="ignore"`, so this kwarg is silently dropped. Any downstream code that tries to inspect `spec.source` will get `AttributeError`. If the intent is to tag built-in tools by origin, `source` must be added to `ToolSpec`, or this parameter should be removed. **Fix:** Either add `source: str = Field(default="custom")` to `ToolSpec`, or remove the `source="builtin"` kwarg. --- ### P2 — Should Fix **P2-1: `side_effects` value diverges from spec** `src/cleveragents/tool/builtins/subplan_tool.py:424` The spec (`specification.md:18101`) defines `side_effects: [spawn_subplan]` for the create-subplan tool. The implementation uses `["emit_decision"]`. While the field is free-form, using a consistent vocabulary with the spec aids auditability and tool-capability queries. ```python side_effects=["emit_decision"], # spec says: [spawn_subplan] ``` **Fix:** Use `["spawn_subplan"]` or `["spawn_subplan", "emit_decision"]` to align with the spec vocabulary. **P2-2: Dead `TYPE_CHECKING` import guard** `src/cleveragents/tool/builtins/subplan_tool.py:81-82` ```python if TYPE_CHECKING: pass ``` The block was set up but contains only `pass`. Either add the intended type-only imports or remove the dead guard. **P2-3: Non-deterministic `context_view` enum order in JSON schema** `src/cleveragents/tool/builtins/subplan_tool.py:92-94, 396` `_VALID_CONTEXT_VIEWS` is a `frozenset`, whose iteration order is non-deterministic. The JSON schema `enum` at line 396 unpacks it via `[*list(_VALID_CONTEXT_VIEWS), None]`, producing a different enum order on each Python process restart. This can cause spurious schema-diff noise in CI. ```python _VALID_CONTEXT_VIEWS: frozenset[str] = frozenset( {"strategist", "executor", "reviewer", "full"} ) ``` **Fix:** Use a `tuple` or `sorted()` to produce a deterministic order: ```python _VALID_CONTEXT_VIEWS: tuple[str, ...] = ( "executor", "full", "reviewer", "strategist" ) ``` --- ### P3 — Nit **P3-1: Weak assertion on parallel rationale scenario** `features/plan_subplan_tool.feature:98-101` The scenario "Rationale for parallel subplan mentions parallel execution" only asserts that the rationale contains `"parallel"`. It doesn't verify `success=True` or `decision_type="subplan_parallel_spawn"`. Other parallel scenarios do check these, so this isn't a gap in coverage, but the scenario itself is under-asserted for its title. **P3-2: Subplan payload fields not structurally persisted on Decision** `src/cleveragents/tool/builtins/subplan_tool.py:298-309` The `Decision` object does not carry `resource_scopes`, `merge_strategy`, `max_parallel`, `dependencies`, or `context_view` in structured fields — they only appear in the free-text `rationale`. This means the `plan explain` tree can display them (via rationale text) but they're not queryable for programmatic use. Worth a follow-up ticket if structured querying is needed. --- ### Summary | Severity | Count | |---|---| | P1 (Must Fix) | 2 | | P2 (Should Fix) | 3 | | P3 (Nit) | 2 |

aditya added 1 commit 2026-03-03 14:43:52 +00:00

fix(actor): address second round code review on plan_subplan tool

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 14s

Details

CI / build (pull_request) Successful in 15s

Details

CI / quality (pull_request) Successful in 17s

Details

CI / typecheck (pull_request) Successful in 31s

Details

CI / security (pull_request) Successful in 45s

Details

CI / unit_tests (pull_request) Successful in 1m42s

Details

CI / docker (pull_request) Successful in 39s

Details

CI / integration_tests (pull_request) Successful in 3m3s

Details

CI / coverage (pull_request) Successful in 4m13s

Details

CI / benchmark-regression (pull_request) Has been cancelled

Details

c4cda5d6d8

- Fix rationale test to assert goal text appears in rationale rather than
  only checking non-empty (P1-1)
- Align side_effects value to specification: use ["spawn_subplan"] instead
  of ["emit_decision"] per docs/specification.md §18259 (P2-1)
- Remove dead TYPE_CHECKING import guard that contained only pass (P2-2)
- Replace frozenset with tuple for _VALID_CONTEXT_VIEWS to ensure
  deterministic JSON schema enum ordering and eliminate CI noise (P2-3)
- Simplify validation error message and schema unpacking now that
  context_view values are in a pre-sorted tuple

aditya added 1 commit 2026-03-03 14:57:58 +00:00

Merge branch 'master' into feature/m5-subplan-actor

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 15s

Details

CI / build (pull_request) Successful in 17s

Details

CI / quality (pull_request) Successful in 18s

Details

CI / security (pull_request) Successful in 31s

Details

CI / typecheck (pull_request) Successful in 36s

Details

CI / unit_tests (pull_request) Successful in 2m6s

Details

CI / docker (pull_request) Successful in 38s

Details

CI / integration_tests (pull_request) Successful in 2m50s

Details

CI / coverage (pull_request) Successful in 4m9s

Details

CI / benchmark-regression (pull_request) Has been cancelled

Details

5f0bc97d46

# Conflicts:
#	CHANGELOG.md

hamza.khyari approved these changes 2026-03-03 17:35:49 +00:00

Dismissed

freemo commented

2026-03-04 01:04:35 +00:00

Closing as duplicate of #198 (same title, same scope). #198 is assigned to @aditya with active branch feature/m5-subplan-actor.

Closing as duplicate of #198 (same title, same scope). #198 is assigned to @aditya with active branch `feature/m5-subplan-actor`.

freemo closed this pull request

2026-03-04 01:04:35 +00:00

aditya reopened this pull request

2026-03-05 06:29:04 +00:00

aditya added 1 commit 2026-03-05 06:37:23 +00:00

Merge branch 'master' into feature/m5-subplan-actor

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 14s

Details

CI / build (pull_request) Successful in 15s

Details

CI / quality (pull_request) Successful in 17s

Details

CI / security (pull_request) Successful in 33s

Details

CI / typecheck (pull_request) Successful in 35s

Details

CI / unit_tests (pull_request) Successful in 2m3s

Details

CI / docker (pull_request) Successful in 42s

Details

CI / integration_tests (pull_request) Successful in 3m0s

Details

CI / coverage (pull_request) Successful in 4m26s

Details

CI / benchmark-regression (pull_request) Successful in 27m53s

Details

7ab5d8661a

aditya dismissed hamza.khyari's review 2026-03-05 06:37:23 +00:00

Reason:

New commits pushed, approval review dismissed automatically according to repository settings

aditya commented

2026-03-05 07:09:17 +00:00

Not duplicate. This is THE #198 PR. I'm assigned @aditya, on feature/m5-subplan-actor
Already approved.

Not duplicate. This is THE #198 PR. I'm assigned @aditya, on `feature/m5-subplan-actor` Already approved.

aditya merged commit fff6bb3833 into master

2026-03-05 07:12:22 +00:00

aditya deleted branch feature/m5-subplan-actor

2026-03-05 07:12:22 +00:00

aditya referenced this issue from a commit

2026-03-05 07:12:23 +00:00

Merge pull request 'feat(actor): add plan_subplan tool and decision emission' (#463) from feature/m5-subplan-actor into master

Sign in to join this conversation.

4 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: cleveragents/cleveragents-core#463

feat(actor): add plan_subplan tool and decision emission #463

Description

Type of Change

Quality Checklist

Testing

Test Commands Run

Related Issues

Implementation Notes

New files

Modified files

Architecture decisions

Code Review — Bugs & Security Findings

BUGS

B1 [High] parent_decision_id accepted but never returned to caller

B2 [Medium] "Verify all examples are documented" scenario not updated

B3 [Medium] Specification naming discrepancy

SECURITY

S1 [High] No ULID format validation on plan_id, dependencies, parent_decision_id

S2 [Medium] No path validation on resource_scopes

S3 [Low] Overly broad exception handling masks programming errors

P1 — Must Fix

P2 — Should Fix

P3 — Nit

Summary

B1 [High] `parent_decision_id` accepted but never returned to caller

S1 [High] No ULID format validation on `plan_id`, `dependencies`, `parent_decision_id`

S2 [Medium] No path validation on `resource_scopes`