UAT: LockService not integrated into PlanLifecycleService or SubplanService — lock enforcement missing during plan transitions #3995

Open
opened 2026-04-06 08:22:51 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/lock-service-plan-lifecycle-integration
  • Commit Message: fix(concurrency): wire LockService into PlanLifecycleService transitions
  • Milestone: (none — see backlog note below)
  • Parent Epic: #362

Bug Description

The LockService exists and is fully implemented (src/cleveragents/application/services/lock_service.py) but is not wired into PlanLifecycleService or SubplanService. This means plan-level and project-level advisory locks are never acquired or released during plan lifecycle transitions, defeating the purpose of the lock system.

Expected Behavior (from spec and issue #327)

Per docs/reference/concurrency.md (Integration Points section):

PlanLifecycleService: Transitions acquire a plan-level lock before mutating phase/state and release it after persistence.

Per issue #327 acceptance criteria:

Ensure locks are enforced in PlanLifecycleService transitions and SubplanService scheduling.

The LockService should be:

  1. Injected into PlanLifecycleService.__init__() as a dependency
  2. Called with acquire() before any plan state transition (strategize, execute, apply, cancel, etc.)
  3. Called with release() after the transition completes (or on error)
  4. Called with renew() during long-running phases
  5. Called with release_all_for_owner() on graceful shutdown
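
Items 1 to 3 above can be sketched roughly as follows. This is only an illustration under stated assumptions: the issue does not show the real LockService signatures, so InMemoryLockService, the (resource, owner) arguments, LockConflictError, the holder() helper, and the transition() method are all hypothetical stand-ins rather than the project's actual API.

```python
import threading


class LockConflictError(RuntimeError):
    """Hypothetical error raised when another owner already holds the lock."""


class InMemoryLockService:
    """Illustrative stand-in only; NOT the real LockService API."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = {}  # resource -> owner

    def acquire(self, resource, owner):
        with self._guard:
            holder = self._held.get(resource)
            if holder is not None and holder != owner:
                raise LockConflictError(f"{resource} is locked by {holder}")
            self._held[resource] = owner

    def release(self, resource, owner):
        with self._guard:
            # Only the current holder may release the lock.
            if self._held.get(resource) == owner:
                del self._held[resource]

    def holder(self, resource):
        with self._guard:
            return self._held.get(resource)


class PlanLifecycleSketch:
    """Hypothetical shape of a transition guarded by a plan-level lock."""

    def __init__(self, lock_service, owner):
        self._locks = lock_service  # item 1: injected dependency
        self._owner = owner

    def transition(self, plan, new_phase, persist):
        resource = f"plan:{plan['id']}"
        self._locks.acquire(resource, self._owner)  # item 2: before mutating
        try:
            plan["phase"] = new_phase
            persist(plan)
        finally:
            # item 3: released after persistence, and also on error
            self._locks.release(resource, self._owner)
```

Putting release() in a finally block means a failed persistence step cannot leave the plan locked until the lock goes stale. A second owner calling transition() on the same plan while the lock is held would get LockConflictError before any state is mutated.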

Actual Behavior

Searching the entire codebase for LockService usage:

  • src/cleveragents/application/services/lock_service.py — the implementation itself
  • src/cleveragents/cli/commands/system.py — only used for count_stale_locks() in the agents diagnostics health check

PlanLifecycleService (src/cleveragents/application/services/plan_lifecycle_service.py) has zero references to LockService, lock_service, acquire, or release (in the lock context).

SubplanService (src/cleveragents/application/services/subplan_service.py) similarly has zero references to LockService.

Impact

  • Multiple concurrent processes can modify the same plan at the same time; this race condition can corrupt plan data
  • The locks table exists in the database but is never populated during plan execution
  • The docs/reference/concurrency.md documentation describes integration that doesn't exist in the code
  • Issue #327 was closed as complete but the core acceptance criterion ("Ensure locks are enforced in PlanLifecycleService transitions") was not implemented

Steps to Reproduce

  1. Start two concurrent processes that both attempt to transition the same plan
  2. Observe that both succeed without any lock conflict error
  3. Verify the locks table remains empty during plan execution
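
The reproduction steps can be modelled in miniature with two threads standing in for the two processes. This is a hedged sketch, not the project's test harness: run_two_workers, LockTable, and the try_acquire/release callables are hypothetical names, and the in-memory dict only imitates the locks table.

```python
import threading


class LockTable:
    """Hypothetical in-memory imitation of the locks table."""

    def __init__(self):
        self.rows = {}  # resource -> owner
        self._guard = threading.Lock()

    def try_acquire(self, resource, owner):
        with self._guard:
            if resource in self.rows:
                return False
            self.rows[resource] = owner
            return True

    def release(self, resource, owner):
        with self._guard:
            if self.rows.get(resource) == owner:
                del self.rows[resource]


def run_two_workers(try_acquire, release):
    """Let two workers attempt the same plan transition at once.

    Returns a dict mapping worker name to "ok" or "conflict".
    """
    outcomes = {}
    both_attempted = threading.Barrier(2)

    def worker(name):
        got = try_acquire("plan:42", name)
        outcomes[name] = "ok" if got else "conflict"
        # Keep any acquired lock held until the peer has also tried.
        both_attempted.wait(timeout=5)
        if got:
            release("plan:42", name)

    threads = [
        threading.Thread(target=worker, args=(name,))
        for name in ("proc-a", "proc-b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes


# Behavior described in this issue: nothing is ever acquired, so both succeed.
unlocked = run_two_workers(lambda resource, owner: True,
                           lambda resource, owner: None)

# Behavior expected once locks are enforced: exactly one worker wins.
table = LockTable()
locked = run_two_workers(table.try_acquire, table.release)
```

In the first run both workers report "ok", matching step 2. In the second run one worker reports "conflict", which is the outcome the fix should produce for real concurrent processes.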

Code Locations

  • Missing integration: src/cleveragents/application/services/plan_lifecycle_service.py
  • Missing integration: src/cleveragents/application/services/subplan_service.py
  • Existing implementation: src/cleveragents/application/services/lock_service.py
  • Documentation describing expected behavior: docs/reference/concurrency.md (lines 88-93)

Subtasks

  • Inject LockService into PlanLifecycleService.__init__() as an optional dependency (with fallback to no-op for backward compatibility)
  • Acquire plan-level lock before each phase transition in PlanLifecycleService
  • Release plan-level lock after each phase transition (in finally block)
  • Add lock renewal for long-running Execute phase
  • Inject LockService into SubplanService for scheduling coordination
  • Add BDD scenarios to features/concurrency.feature verifying lock enforcement during plan transitions
  • Verify coverage >= 97%
  • Run nox and fix any errors
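
Two of the subtasks above, the optional dependency with a no-op fallback and lock renewal during a long-running Execute phase, might look something like the sketch below. Everything here is an assumption made for illustration: NoOpLockService, LockRenewer, run_execute_phase, the "plan:example" resource key, and the 30-second default interval are hypothetical, and the real renew() signature may differ.

```python
import threading


class NoOpLockService:
    """Hypothetical fallback used when no LockService is injected."""

    def acquire(self, resource, owner):
        return True

    def renew(self, resource, owner):
        return True

    def release(self, resource, owner):
        return None


class LockRenewer:
    """Calls renew() at a fixed interval on a background thread until stopped."""

    def __init__(self, lock_service, resource, owner, interval_s):
        self._locks = lock_service
        self._resource = resource
        self._owner = owner
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout and True once stop is set.
        while not self._stop.wait(self._interval_s):
            self._locks.renew(self._resource, self._owner)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join()
        return False  # never swallow exceptions from the guarded work


def run_execute_phase(work, lock_service=None, owner="worker", interval_s=30.0):
    """Run work() under a plan lock that is renewed while work() is running."""
    locks = lock_service if lock_service is not None else NoOpLockService()
    resource = "plan:example"  # hypothetical resource key
    locks.acquire(resource, owner)
    try:
        with LockRenewer(locks, resource, owner, interval_s):
            return work()
    finally:
        locks.release(resource, owner)
```

The renewal interval would need to be shorter than whatever expiry the real lock uses, otherwise the lock could be treated as stale in the middle of the phase. Callers that pass no lock service keep their current behavior, which is the backward-compatibility goal of the first subtask.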

Definition of Done

  • All subtasks completed
  • PlanLifecycleService acquires/releases plan locks on every state transition
  • SubplanService uses locks for scheduling coordination
  • BDD tests verify lock enforcement during plan transitions
  • All nox stages pass
  • Coverage >= 97%

Closes: #327 (re-opens the missing acceptance criterion)


Backlog note: This issue was discovered during autonomous operation
on milestone v3.3.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-09 03:12:13 +00:00
Blocks
#362 Epic: Security & Safety Hardening
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#3995