Epic: Subplan Execution & Concurrency Fixes (M4) #8049

Open
opened 2026-04-13 01:31:49 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: epic: subplan execution and concurrency fixes (M4)
  • Branch name: epic/subplan-execution-concurrency-fixes-m4

Background and Context

The v3.3.0 milestone (M4: Corrections + Subplans + Checkpoints) requires that plans can spawn child subplans during execution, with parallel execution and proper fail_fast cancellation. UAT and bug-hunt testing have revealed critical concurrency and data integrity issues in the subplan execution system.

This Epic groups all critical subplan execution, concurrency, and data integrity bugs that must be fixed before v3.3.0 can be considered complete.

Expected Behavior

  • Subplans execute in parallel without data races on shared state
  • fail_fast cancellation stops all running subplans and does not leave queued subplans marked complete
  • CircuitBreaker in ParallelStrategyExecutor is thread-safe
  • Sequential merge strategy applies each diff to the result of the previous diff
  • Graph executor detects cycles when following chained edges
  • Copy-on-write rollback removes stale files
  • Cross-plan correction service rolls back completed actions
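
The fail_fast behavior above can be sketched as follows, assuming subplans run on a `concurrent.futures.ThreadPoolExecutor`. The function `run_subplans_fail_fast`, the `stop` event, the `_SKIPPED` sentinel, and the status strings are hypothetical and are not the SubplanExecutionService API. Python cannot forcibly stop a running thread, so this sketch cancels queued futures, asks running subplans to stop cooperatively, and reports subplans that never ran as cancelled rather than complete.

```python
import threading
from concurrent.futures import FIRST_EXCEPTION, ThreadPoolExecutor, wait

_SKIPPED = object()  # sentinel: subplan was dequeued after a failure (hypothetical)


def run_subplans_fail_fast(subplans, max_workers=4):
    """Run subplans in parallel and stop scheduling work after the first failure.

    Each subplan is a callable that accepts a threading.Event and should poll
    it to stop early (an assumed contract for this sketch).
    Returns one status string per subplan, in submission order.
    """
    stop = threading.Event()

    def guarded(subplan):
        # A worker may dequeue this task before the caller cancels it,
        # so check the flag before doing any work.
        if stop.is_set():
            return _SKIPPED
        try:
            return subplan(stop)
        except Exception:
            stop.set()  # signal running subplans before the future completes
            raise

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(guarded, sp) for sp in subplans]
        wait(futures, return_when=FIRST_EXCEPTION)
        if stop.is_set():
            for f in futures:
                f.cancel()  # no effect on running or finished futures

    statuses = []
    for f in futures:
        if f.cancelled() or (f.exception() is None and f.result() is _SKIPPED):
            statuses.append("cancelled")
        elif f.exception() is not None:
            statuses.append("failed")
        else:
            statuses.append("complete")
    return statuses
```

Setting the flag inside the worker, rather than only in the caller, closes the window in which a freed worker could pick up a queued subplan before the caller reacts to the failure.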

Acceptance Criteria

  • SubplanExecutionService fail_fast properly cancels all running futures
  • CorrectionService is thread-safe with proper locking
  • CircuitBreaker in ParallelStrategyExecutor uses thread-safe state
  • subplan_merge_service._sequential_apply() correctly updates base_content between iterations
  • graph_executor._follow_chained_edges() has cycle detection guard
  • copy_on_write.rollback() properly removes stale files
  • cross_plan_correction_service._rollback_completed_actions() performs actual rollback
  • All child issues are closed and merged
  • Subplan execution passes all M4 acceptance criteria
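
The `_sequential_apply()` criterion above amounts to threading the merged content through the loop. A minimal sketch, assuming each diff is a simple `(old, new)` replacement pair; `apply_diff` and `sequential_apply` are illustrative names, and the real diff format in `subplan_merge_service` is not shown in this issue.

```python
def apply_diff(content, diff):
    """Apply one (old, new) replacement; a stand-in for a real patch step."""
    old, new = diff
    if old not in content:
        raise ValueError(f"diff does not apply: {old!r} not found")
    return content.replace(old, new, 1)


def sequential_apply(base_content, diffs):
    """Apply diffs in order, each one to the output of the previous one."""
    for diff in diffs:
        # Rebinding base_content is the step that #7499 reports as missing:
        # without it, every diff is applied to the original content.
        base_content = apply_diff(base_content, diff)
    return base_content
```

With this shape, a later diff can depend on text introduced by an earlier one, which is what distinguishes a sequential merge from applying every diff to the same base.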

Subtasks

  • Fix SubplanExecutionService fail_fast to cancel running futures (#7582)
  • Add thread safety to CorrectionService (#7583)
  • Fix CircuitBreaker thread safety in ParallelStrategyExecutor (#7516)
  • Fix subplan_merge_service._sequential_apply() base_content update (#7499)
  • Add cycle guard to graph_executor._follow_chained_edges() (#7498)
  • Fix copy_on_write.rollback() stale file handling (#7491)
  • Implement cross_plan_correction_service._rollback_completed_actions() (#7485)
  • Fix CheckpointService.create_workspace_snapshot() frozen metadata mutation (#7522)
  • Fix parallel fail_fast leaving queued subplans in complete state (#7394)
  • Run full M4 subplan acceptance criteria suite against fixes
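
The cycle guard subtask (#7498) can be illustrated with a visited set. This sketch assumes the chain is a mapping from each node to at most one successor; `follow_chained_edges` here is a free function written for illustration, not the signature of `graph_executor._follow_chained_edges()`.

```python
def follow_chained_edges(edges, start):
    """Follow single-successor edges from start and return the path taken.

    Raises ValueError instead of looping forever when the chain revisits
    a node.
    """
    path = [start]
    visited = {start}
    node = start
    while node in edges:
        node = edges[node]
        if node in visited:
            trail = " -> ".join(map(str, path + [node]))
            raise ValueError(f"cycle detected: {trail}")
        visited.add(node)
        path.append(node)
    return path
```

Raising on the first repeated node keeps the failure explicit; a real executor might instead log the cycle and stop following the chain.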

Child Issues

  • #7582 BUG-HUNT: SubplanExecutionService fail_fast continues waiting for futures
  • #7583 BUG-HUNT: CorrectionService has no thread safety
  • #7516 BUG-HUNT: CircuitBreaker in ParallelStrategyExecutor is not thread-safe
  • #7499 BUG-HUNT: subplan_merge_service _sequential_apply never updates base_content
  • #7498 BUG-HUNT: graph_executor _follow_chained_edges has no cycle guard
  • #7491 BUG-HUNT: copy_on_write rollback uses rmtree with stale files
  • #7485 BUG-HUNT: cross_plan_correction_service _rollback_completed_actions is hollow
  • #7522 BUG-HUNT: CheckpointService mutates frozen Checkpoint metadata
  • #7394 UAT: Parallel fail_fast leaves queued subplans in complete state
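
For #7491, the issue title does not say how stale files survive a rollback, so the sketch below only states the intended postcondition: after rollback, the workspace contains exactly what the snapshot contains. It assumes the snapshot is a plain directory copy; the `rollback(workspace, snapshot)` signature is hypothetical and may differ from `copy_on_write.rollback()`.

```python
import shutil
from pathlib import Path


def rollback(workspace: Path, snapshot: Path) -> None:
    """Make workspace mirror snapshot, deleting files created after it.

    Assumes no path changed type (file vs. directory) since the snapshot.
    """
    keep = {p.relative_to(snapshot) for p in snapshot.rglob("*")}
    # Collect paths first, then delete deepest-first so that stale
    # directories are already empty when they are removed.
    current = sorted(workspace.rglob("*"), key=lambda p: len(p.parts), reverse=True)
    for path in current:
        if path.relative_to(workspace) not in keep:
            if path.is_dir() and not path.is_symlink():
                path.rmdir()
            else:
                path.unlink()
    # Restore the snapshot contents over whatever remains.
    shutil.copytree(snapshot, workspace, dirs_exist_ok=True)
```

Deleting only the paths missing from the snapshot avoids removing the workspace root itself, which matters if other components hold a handle to that directory.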

Parent Legendary

Depends on: Legendary #375 (Execution Pipeline & Decisions)

Definition of Done

This Epic is complete when all child issues are closed and merged, and the subplan execution, concurrency, and data integrity fixes pass all M4 acceptance criteria.


Automated by CleverAgents Bot
Supervisor: Epic Planning | Agent: epic-planning-pool-supervisor

HAL9000 added this to the v3.3.0 milestone 2026-04-13 01:33:31 +00:00
Author
Owner

Triage Decision: VERIFIED — MoSCoW/Must Have

Valid epic for v3.3.0. Subplan execution and concurrency fixes are core to the parallel execution acceptance criteria for v3.3.0.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#8049