BUG-HUNT: [consistency] Redundant plan fetching in try_auto_run #1285

Open
opened 2026-04-02 08:48:50 +00:00 by freemo · 1 comment
Owner

Bug Report: [consistency] — Redundant plan fetching in try_auto_run

Severity Assessment

  • Impact: Low. The code is inefficient and harder to reason about. It could potentially mask bugs related to in-memory state not being persisted correctly, as it immediately re-fetches the state from the source of truth.
  • Likelihood: High. This code path is executed every time try_auto_run is called.
  • Priority: Low

Location

  • File: src/cleveragents/application/services/plan_lifecycle_service.py
  • Function/Class: PlanLifecycleService.try_auto_run
  • Lines: 1815-1816 and 1832-1833

Description

In the try_auto_run method, after calling a phase completion method (e.g., self.complete_strategize(plan_id)), the returned plan object is immediately discarded and re-fetched from the database/cache with plan = self.get_plan(plan_id).

This is redundant. The phase completion methods already return the updated plan object. The immediate re-fetch is unnecessary and makes the data flow confusing. If the completion method were to perform an in-memory modification without committing, this pattern would hide that bug by overwriting the modified object with a fresh copy.

Evidence

# Relevant code snippet showing the issue
# In try_auto_run:
# ...
            # complete_strategize() calls auto_progress() which may
            # transition the plan to Execute/QUEUED.
            plan = self.complete_strategize(plan_id) # plan object is returned...
            plan = self.get_plan(plan_id) # ...and then immediately overwritten

# ...

            # complete_execute() calls auto_progress() which may
            # transition the plan to Apply/QUEUED.
            plan = self.complete_execute(plan_id) # plan object is returned...
            plan = self.get_plan(plan_id) # ...and then immediately overwritten

Expected Behavior

The code should use the plan object returned by the phase completion methods directly. The redundant get_plan calls should be removed.

Actual Behavior

The code performs unnecessary database/cache lookups, making it less efficient and harder to understand.

Suggested Fix

Remove the redundant get_plan calls:

# In try_auto_run:
# ...
            plan = self.complete_strategize(plan_id)
            # plan = self.get_plan(plan_id) # REMOVE THIS LINE

# ...

            plan = self.complete_execute(plan_id)
            # plan = self.get_plan(plan_id) # REMOVE THIS LINE

Category

consistency


Metadata

  • Branch: fix/consistency-try-auto-run-redundant-plan-fetch
  • Commit Message: fix(consistency): remove redundant get_plan calls after phase completion in try_auto_run
  • Milestone: v3.3.0
  • Parent Epic: #362

Subtasks

  • Write a TDD issue-capture Behave scenario tagged @tdd_issue, @tdd_expected_fail that asserts the plan object returned by complete_strategize is used directly (i.e., get_plan is not called immediately after)
  • Write a TDD issue-capture Behave scenario tagged @tdd_issue, @tdd_expected_fail that asserts the plan object returned by complete_execute is used directly (i.e., get_plan is not called immediately after)
  • Remove the redundant plan = self.get_plan(plan_id) line at line 1816 in plan_lifecycle_service.py
  • Remove the redundant plan = self.get_plan(plan_id) line at line 1833 in plan_lifecycle_service.py
  • Remove the @tdd_expected_fail tags from both TDD scenarios once the fix is in place and verify they pass
  • Run nox -e typecheck to confirm no type regressions
  • Run nox -e unit_tests to confirm all Behave tests pass
  • Run nox -e coverage_report to confirm coverage remains ≥ 97%
  • Run nox (all default sessions) and fix any errors

Definition of Done

  • Both @tdd_expected_fail TDD capture scenarios are committed first and demonstrate the redundant get_plan calls
  • The redundant plan = self.get_plan(plan_id) call after complete_strategize is removed
  • The redundant plan = self.get_plan(plan_id) call after complete_execute is removed
  • Both TDD scenarios pass without @tdd_expected_fail after the fix
  • No regressions in existing try_auto_run or plan lifecycle tests
  • All nox stages pass (nox -e lint, nox -e typecheck, nox -e unit_tests, nox -e integration_tests, nox -e coverage_report)
  • Coverage >= 97%
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
## Bug Report: [consistency] — Redundant plan fetching in `try_auto_run` ### Severity Assessment - **Impact**: Low. The code is inefficient and harder to reason about. It could potentially mask bugs related to in-memory state not being persisted correctly, as it immediately re-fetches the state from the source of truth. - **Likelihood**: High. This code path is executed every time `try_auto_run` is called. - **Priority**: Low ### Location - **File**: `src/cleveragents/application/services/plan_lifecycle_service.py` - **Function/Class**: `PlanLifecycleService.try_auto_run` - **Lines**: 1815-1816 and 1832-1833 ### Description In the `try_auto_run` method, after calling a phase completion method (e.g., `self.complete_strategize(plan_id)`), the returned `plan` object is immediately discarded and re-fetched from the database/cache with `plan = self.get_plan(plan_id)`. This is redundant. The phase completion methods already return the updated plan object. The immediate re-fetch is unnecessary and makes the data flow confusing. If the completion method were to perform an in-memory modification without committing, this pattern would hide that bug by overwriting the modified object with a fresh copy. ### Evidence ```python # Relevant code snippet showing the issue # In try_auto_run: # ... # complete_strategize() calls auto_progress() which may # transition the plan to Execute/QUEUED. plan = self.complete_strategize(plan_id) # plan object is returned... plan = self.get_plan(plan_id) # ...and then immediately overwritten # ... # complete_execute() calls auto_progress() which may # transition the plan to Apply/QUEUED. plan = self.complete_execute(plan_id) # plan object is returned... plan = self.get_plan(plan_id) # ...and then immediately overwritten ``` ### Expected Behavior The code should use the `plan` object returned by the phase completion methods directly. The redundant `get_plan` calls should be removed. ### Actual Behavior The code performs unnecessary database/cache lookups, making it less efficient and harder to understand. ### Suggested Fix Remove the redundant `get_plan` calls: ```python # In try_auto_run: # ... plan = self.complete_strategize(plan_id) # plan = self.get_plan(plan_id) # REMOVE THIS LINE # ... plan = self.complete_execute(plan_id) # plan = self.get_plan(plan_id) # REMOVE THIS LINE ``` ### Category consistency --- ## Metadata - **Branch**: `fix/consistency-try-auto-run-redundant-plan-fetch` - **Commit Message**: `fix(consistency): remove redundant get_plan calls after phase completion in try_auto_run` - **Milestone**: v3.3.0 - **Parent Epic**: #362 ## Subtasks - [ ] Write a TDD issue-capture Behave scenario tagged `@tdd_issue`, `@tdd_expected_fail` that asserts the `plan` object returned by `complete_strategize` is used directly (i.e., `get_plan` is not called immediately after) - [ ] Write a TDD issue-capture Behave scenario tagged `@tdd_issue`, `@tdd_expected_fail` that asserts the `plan` object returned by `complete_execute` is used directly (i.e., `get_plan` is not called immediately after) - [ ] Remove the redundant `plan = self.get_plan(plan_id)` line at line 1816 in `plan_lifecycle_service.py` - [ ] Remove the redundant `plan = self.get_plan(plan_id)` line at line 1833 in `plan_lifecycle_service.py` - [ ] Remove the `@tdd_expected_fail` tags from both TDD scenarios once the fix is in place and verify they pass - [ ] Run `nox -e typecheck` to confirm no type regressions - [ ] Run `nox -e unit_tests` to confirm all Behave tests pass - [ ] Run `nox -e coverage_report` to confirm coverage remains ≥ 97% - [ ] Run `nox` (all default sessions) and fix any errors ## Definition of Done - [ ] Both `@tdd_expected_fail` TDD capture scenarios are committed first and demonstrate the redundant `get_plan` calls - [ ] The redundant `plan = self.get_plan(plan_id)` call after `complete_strategize` is removed - [ ] The redundant `plan = self.get_plan(plan_id)` call after `complete_execute` is removed - [ ] Both TDD scenarios pass without `@tdd_expected_fail` after the fix - [ ] No regressions in existing `try_auto_run` or plan lifecycle tests - [ ] All nox stages pass (`nox -e lint`, `nox -e typecheck`, `nox -e unit_tests`, `nox -e integration_tests`, `nox -e coverage_report`) - [ ] Coverage >= 97% - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done.
freemo added this to the v3.3.0 milestone 2026-04-02 08:50:52 +00:00
Author
Owner

Dependency Note: This issue is a child of Epic #362 (Security & Safety Hardening). Per the project's ticket hierarchy, this issue blocks #362 — the Epic cannot be considered complete until this child issue is resolved and merged.

**Dependency Note:** This issue is a child of Epic #362 (Security & Safety Hardening). Per the project's ticket hierarchy, this issue **blocks** #362 — the Epic cannot be considered complete until this child issue is resolved and merged.
freemo self-assigned this 2026-04-02 18:45:26 +00:00
freemo removed this from the v3.3.0 milestone 2026-04-07 02:31:46 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#1285
No description provided.