feat(subplan): execute and merge subplans #444

Merged
CoreRasurae merged 1 commit from feature/m4-subplan-execution into master 2026-03-01 09:57:17 +00:00

Summary

Implements subplan execution scheduling and merge strategy infrastructure for the plan lifecycle system.

Changes

  • SubplanExecutionService: Scheduler supporting sequential, parallel, and dependency-ordered execution modes with configurable concurrency limits and fail-fast behavior
  • SubplanMergeService: Four merge strategies (git three-way, sequential apply, fail on conflict, last wins) for combining sandbox outputs from completed subplans
  • BDD Tests: 21 Behave scenarios covering all execution modes, merge strategies, failure handling, and retry logic
  • Robot Tests: 11 integration test cases for subplan execution and merge summaries
  • ASV Benchmarks: Performance benchmarks for scheduler overhead
  • Documentation: Reference documentation at docs/reference/subplans.md
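To make two of the simpler strategy names concrete, here is a minimal sketch of what "last wins" and "fail on conflict" could look like. It assumes each subplan output is a plain mapping from file path to content; the function names and the data model are hypothetical and are not taken from SubplanMergeService.

```python
def merge_last_wins(outputs: list[dict[str, str]]) -> dict[str, str]:
    """Later outputs overwrite earlier ones for the same path (assumed semantics)."""
    merged: dict[str, str] = {}
    for output in outputs:
        merged.update(output)
    return merged


def merge_fail_on_conflict(outputs: list[dict[str, str]]) -> dict[str, str]:
    """Reject the merge if two outputs disagree on a path (assumed semantics)."""
    merged: dict[str, str] = {}
    for output in outputs:
        for path, content in output.items():
            if path in merged and merged[path] != content:
                raise ValueError(f"conflicting content for {path}")
            merged[path] = content
    return merged
```

The order of the input list matters for the first function only, which is why the ordering of outputs handed to the merge step needs to be well defined.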

Motivation

Subplan execution is a core capability for breaking down complex plans into manageable units of work. The scheduling infrastructure enables sequential, parallel, and dependency-aware execution while the merge system combines results safely using configurable strategies.

Key Design Decisions

  • Split into two services (SubplanExecutionService + SubplanMergeService) following SRP
  • Parallel execution uses asyncio.gather with semaphore-based concurrency limiting
  • Dependency-ordered mode performs topological sort and executes independent subplans concurrently
  • All services use dependency injection for testability
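For readers unfamiliar with the pattern, the combination of asyncio.gather and a semaphore can be sketched as follows. This is an illustration only: fake_subplan, run_parallel, and the state dictionary are hypothetical stand-ins, not the service's actual code.

```python
import asyncio

# Hypothetical bookkeeping so the concurrency limit can be observed.
state = {"active": 0, "peak": 0}


async def fake_subplan(name: str) -> str:
    """Stand-in for executing one subplan."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # simulated work
    state["active"] -= 1
    return name


async def run_parallel(names: list[str], max_concurrency: int) -> list[str]:
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(name: str) -> str:
        # Waits here while max_concurrency subplans are already in flight.
        async with semaphore:
            return await fake_subplan(name)

    # gather returns results in input order, not completion order.
    return await asyncio.gather(*(guarded(name) for name in names))
```

Calling asyncio.run(run_parallel(["a", "b", "c"], max_concurrency=2)) starts all three wrappers immediately, but only two ever run their body at the same time.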

Closes #184

CoreRasurae added this to the v3.3.0 milestone 2026-02-25 22:23:53 +00:00
CoreRasurae force-pushed feature/m4-subplan-execution from cad1678bc5
All checks were successful
CI / lint (pull_request) Successful in 23s
CI / typecheck (pull_request) Successful in 1m3s
CI / security (pull_request) Successful in 50s
CI / quality (pull_request) Successful in 30s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 27s
CI / integration_tests (pull_request) Successful in 4m53s
CI / unit_tests (pull_request) Successful in 30m28s
CI / docker (pull_request) Successful in 1m1s
CI / benchmark-regression (pull_request) Successful in 23m22s
CI / coverage (pull_request) Successful in 1h18m14s
to 4805e399af
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 14s
CI / quality (pull_request) Successful in 17s
CI / typecheck (pull_request) Successful in 30s
CI / security (pull_request) Successful in 48s
CI / integration_tests (pull_request) Successful in 3m13s
CI / unit_tests (pull_request) Successful in 8m19s
CI / docker (pull_request) Successful in 1m2s
CI / benchmark-regression (pull_request) Successful in 20m35s
CI / coverage (pull_request) Successful in 30m32s
2026-02-27 15:27:31 +00:00
brent.edwards left a comment

Review Summary (commit 4805e399af97b46f3350a05b)

Scoped to the subplan execution/merge commit. The core service design is clean and the docs/tests are thorough, but there are a few behavioral gaps that will surprise callers.

CI status isn’t visible via the API on my side. Please confirm required checks per docs/development/ci-cd.md are green (lint, typecheck, security, quality, unit_tests, integration_tests, coverage, build, docker).

Findings

P1:must-fix — timeout_per_subplan_seconds is defined in SubplanConfig and documented, but never enforced in SubplanExecutionService. There’s no per‑subplan timeout in sequential or parallel modes. Implement timeouts (e.g., future.result(timeout=...) / watchdog) or remove the field/docs.

  • Code: src/cleveragents/application/services/subplan_execution_service.py
  • Model: src/cleveragents/domain/models/core/plan.py
  • Docs: docs/reference/subplans.md
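Since the PR description says parallel mode is built on asyncio.gather, one possible way to enforce the timeout is asyncio.wait_for, the asyncio counterpart of future.result(timeout=...). The sketch below is only an assumption about how that could look; the Status enum and run_with_timeout are hypothetical names.

```python
import asyncio
from enum import Enum


class Status(Enum):  # hypothetical status values
    COMPLETED = "completed"
    TIMED_OUT = "timed_out"


async def run_with_timeout(subplan_coro, timeout_seconds: float) -> Status:
    """Await one subplan, giving up after timeout_seconds."""
    try:
        # wait_for cancels the wrapped coroutine when the timeout expires.
        await asyncio.wait_for(subplan_coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return Status.TIMED_OUT
    return Status.COMPLETED
```

The same wrapper would work in sequential mode (await each call in turn) and in parallel mode (pass the wrapped coroutines to gather).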

P2:should-fix — sequential_apply is documented as “completion order,” but in parallel mode the outputs are merged in original subplan order because results are re-ordered by the input list. If completion order is intended, capture completion timestamps and sort outputs before merge.

  • Code: src/cleveragents/application/services/subplan_execution_service.py (_execute_parallel, execute_all)
  • Merge: src/cleveragents/application/services/subplan_merge_service.py
  • Docs: docs/reference/subplans.md
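A rough sketch of the timestamp approach, assuming an asyncio-based runner: each subplan records a monotonic finish time, and the outputs are re-sorted by that time before they reach the merge step. run_and_stamp and outputs_in_completion_order are hypothetical names, and the sleep stands in for real work.

```python
import asyncio
import time


async def run_and_stamp(name: str, delay: float) -> tuple[str, float]:
    await asyncio.sleep(delay)  # stand-in for executing the subplan
    return name, time.monotonic()  # record when this subplan finished


async def outputs_in_completion_order(delays: dict[str, float]) -> list[str]:
    stamped = await asyncio.gather(
        *(run_and_stamp(name, delay) for name, delay in delays.items())
    )
    # gather preserved input order; re-sort by finish time before merging.
    return [name for name, _ in sorted(stamped, key=lambda item: item[1])]
```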

P2:should-fix — In parallel fail_fast, cancelled futures surface as CancelledError and are marked ERRORED rather than CANCELLED. This makes status reporting inconsistent with sequential mode and with the “stop others” semantics.

  • Code: src/cleveragents/application/services/subplan_execution_service.py
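One way to keep the two outcomes apart, sketched under the assumption that results are collected with gather(return_exceptions=True): check for CancelledError before the generic exception branch. The Status enum, classify, run_fail_fast, and demo are hypothetical names used only for illustration.

```python
import asyncio
from enum import Enum


class Status(Enum):  # hypothetical status values
    COMPLETED = "completed"
    ERRORED = "errored"
    CANCELLED = "cancelled"


def classify(outcome: object) -> Status:
    # CancelledError must be tested before the generic exception branch.
    if isinstance(outcome, asyncio.CancelledError):
        return Status.CANCELLED
    if isinstance(outcome, BaseException):
        return Status.ERRORED
    return Status.COMPLETED


async def run_fail_fast(coros: dict) -> dict[str, Status]:
    tasks = {name: asyncio.ensure_future(coro) for name, coro in coros.items()}
    _, pending = await asyncio.wait(
        list(tasks.values()), return_when=asyncio.FIRST_EXCEPTION
    )
    for task in pending:  # stop the others after the first failure
        task.cancel()
    outcomes = await asyncio.gather(*tasks.values(), return_exceptions=True)
    return {name: classify(outcome) for name, outcome in zip(tasks, outcomes)}


async def demo() -> dict[str, Status]:
    async def ok():
        return "done"

    async def boom():
        await asyncio.sleep(0.01)
        raise RuntimeError("subplan failed")

    async def slow():
        await asyncio.sleep(5)

    return await run_fail_fast({"ok": ok(), "boom": boom(), "slow": slow()})
```

In demo, the failing subplan is reported as ERRORED and the one that was still running as CANCELLED, which matches the "stop others" semantics.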

P3:should-fix — The commit message claims dependency‑ordered mode executes independent subplans concurrently, but the implementation runs them sequentially after topological sort. Either update the description/docs or implement concurrent execution for ready nodes.

  • Code: src/cleveragents/application/services/subplan_execution_service.py
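If the second option is chosen, the standard library's graphlib.TopologicalSorter already exposes the "ready nodes" idea. The following is a minimal sketch, not the service's code: run_in_waves and fake_subplan are hypothetical, and the dependency graph maps each subplan to the set of subplans it depends on.

```python
import asyncio
from graphlib import TopologicalSorter


async def fake_subplan(name: str) -> None:
    await asyncio.sleep(0)  # stand-in for executing the subplan


async def run_in_waves(deps: dict[str, set[str]], run=fake_subplan) -> list[list[str]]:
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on cyclic dependencies
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all subplans whose deps are done
        await asyncio.gather(*(run(name) for name in ready))
        sorter.done(*ready)  # unblocks their dependents
        waves.append(ready)
    return waves
```

This version waits for a whole wave before starting the next one. A finer-grained scheduler would call done() as each subplan finishes, at the cost of more bookkeeping.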

Positive Notes

  • Service API and merge strategy abstraction are clean and easy to test.
  • Failure handling and retry logic are thoughtfully separated.
  • Documentation covers execution modes and merge outcomes well.
CoreRasurae force-pushed feature/m4-subplan-execution from 4805e399af
to 1bbec743ba
All checks were successful
CI / quality (pull_request) Successful in 28s
CI / lint (pull_request) Successful in 46s
CI / security (pull_request) Successful in 48s
CI / benchmark-publish (pull_request) Has been skipped
CI / typecheck (pull_request) Successful in 57s
CI / build (pull_request) Successful in 22s
CI / integration_tests (pull_request) Successful in 4m41s
CI / unit_tests (pull_request) Successful in 8m29s
CI / docker (pull_request) Successful in 39s
CI / benchmark-regression (pull_request) Successful in 18m34s
CI / coverage (pull_request) Successful in 1h29m43s
2026-02-27 23:14:50 +00:00
brent.edwards approved these changes 2026-02-27 23:27:28 +00:00
Dismissed
brent.edwards left a comment

I (not an LLM) checked that the review comments were met. Great work!

CoreRasurae force-pushed feature/m4-subplan-execution from 1bbec743ba
to 9f6fb667c2
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 20s
CI / security (pull_request) Successful in 33s
CI / typecheck (pull_request) Successful in 34s
CI / integration_tests (pull_request) Successful in 2m45s
CI / unit_tests (pull_request) Successful in 13m14s
CI / docker (pull_request) Successful in 38s
CI / benchmark-regression (pull_request) Successful in 24m49s
CI / coverage (pull_request) Successful in 49m5s
CI / lint (push) Successful in 14s
CI / quality (push) Successful in 15s
CI / build (push) Successful in 14s
CI / security (push) Successful in 28s
CI / typecheck (push) Successful in 30s
CI / benchmark-regression (push) Has been skipped
CI / integration_tests (push) Successful in 2m43s
CI / unit_tests (push) Successful in 13m9s
CI / docker (push) Successful in 39s
CI / benchmark-publish (push) Successful in 14m49s
CI / coverage (push) Successful in 47m53s
2026-03-01 09:07:35 +00:00
CoreRasurae dismissed brent.edwards's review 2026-03-01 09:07:35 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

CoreRasurae scheduled this pull request to auto merge when all checks succeed 2026-03-01 09:07:55 +00:00
CoreRasurae deleted branch feature/m4-subplan-execution 2026-03-01 09:57:17 +00:00
Reference
cleveragents/cleveragents-core!444