bug(cli): plan correct auto-resolve fails in isolated E2E environments #1025

Closed
opened 2026-03-17 04:23:46 +00:00 by hamza.khyari · 8 comments
Member

Metadata

  • Commit Message: fix(cli): plan correct active-plan resolution in isolated environments
  • Branch: fix/plan-correct-resolve

Background

The plan correct CLI command uses _resolve_active_plan_id() to find the current plan when --plan is omitted. In isolated E2E environments (where CLEVERAGENTS_HOME is a temp directory), this resolution fails even when a plan in Execute/COMPLETE state exists in the database.

Current Behavior

Running plan correct <decision_id> --mode revert --guidance ... without --plan in an isolated E2E subprocess environment fails with:

Error: No active plan found. Specify --plan <PLAN_ID> explicitly.

Adding --plan <plan_id> explicitly resolves the issue.

Expected Behavior

_resolve_active_plan_id() should find the plan in Execute/COMPLETE state when called from isolated subprocess invocations with CLEVERAGENTS_HOME set.
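
For illustration, an "isolated subprocess invocation" of the kind described above might be set up roughly as follows. This is a minimal sketch, not the project's E2E harness: `run_isolated` is a hypothetical helper, and a Python one-liner stands in for the real CLI entry point, whose executable name is not given in this issue.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_isolated(cmd: list[str], home: Path, cwd: Path) -> subprocess.CompletedProcess:
    """Run cmd in a child process with CLEVERAGENTS_HOME pointing at `home`.

    Hypothetical helper: the child inherits the parent environment, except
    that CLEVERAGENTS_HOME is overridden and the working directory is `cwd`.
    """
    env = dict(os.environ)
    env["CLEVERAGENTS_HOME"] = str(home)
    return subprocess.run(
        cmd, env=env, cwd=cwd, capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as cwd:
        # Stand-in for the real CLI call: echo the variable the child sees.
        probe = [sys.executable, "-c", "import os; print(os.environ['CLEVERAGENTS_HOME'])"]
        result = run_isolated(probe, Path(home), Path(cwd))
        print(result.stdout.strip())
```

The point of the sketch is that the child's working directory and its `CLEVERAGENTS_HOME` are set independently, so any lookup keyed to the working directory can miss data stored under the home directory.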

Acceptance Criteria

  • [x] plan correct auto-resolves the active plan in isolated E2E environments
  • [x] Or: document that --plan is required in subprocess/non-interactive contexts and update test accordingly

Subtasks

  • [x] Investigate why _resolve_active_plan_id() fails to find plans in isolated environments (check workspace/session context resolution)
  • [x] Fix the resolution logic or document the --plan requirement
  • [x] Remove --plan ${plan_id} workaround from m4_acceptance.robot if auto-resolve is fixed
  • [x] Tests (Robot): Verify plan correct works without --plan in E2E environment
  • [x] Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
hamza.khyari added this to the v3.3.0 milestone 2026-03-17 04:31:40 +00:00
Owner

TDD workflow initiated for this bug:

  • Created TDD issue #1035 to write a tagged test capturing the plan correct auto-resolve failure.
  • Dependency: this issue is blocked by #1035.
  • TDD assigned to @hamza.khyari.

PM triage — Day 37

Owner

Assigned to @CoreRasurae for bug fix based on developer expertise (plan correction auto-resolve — Luis). State changed from Unverified to Verified. This bug and its TDD counterpart (#1035) are top priority per project policy — bugs always take precedence over feature work.

Owner

Planning Agent — Discussion Review

TDD pipeline status:

| Stage | Issue | Assignee | Status |
|-------|-------|----------|--------|
| TDD test | #1035 | @hurui200320 | Open |
| Bug fix | #1025 (this) | @CoreRasurae | Blocked by #1035 |

Assignment note: Day 37 triage originally assigned TDD to @hamza.khyari, but the latest assignment gives it to @hurui200320. Bug fix assigned to @CoreRasurae (plan correction auto-resolve expertise). Both assignments are reasonable.

The 5-point estimate reflects the complexity of debugging auto-resolve failures in isolated E2E environments — this likely involves environment setup, path resolution, or state initialization issues.

No disputes. Workflow proceeding per CONTRIBUTING.md §Bug Fix Workflow.

freemo self-assigned this 2026-03-23 03:30:00 +00:00
Owner

Action required: Create Forgejo dependency link.

This bug issue must have a Forgejo dependency link to TDD issue #1035 (bug is blocked by TDD issue). The Forgejo REST API does not support creating dependency links in Forgejo 14 — this must be done via the web UI:

  1. Open this issue in the browser
  2. In the sidebar, find "Dependencies" and click "Add dependency"
  3. Search for #1035 and add it as a dependency (this issue depends on #1035)

This ensures the correct TDD workflow ordering: #1035 (write the tagged test) must be completed before this issue (implement the fix) can begin.

Member

Implementation journal update (issue #1025):

  • Investigated src/cleveragents/cli/commands/plan.py::_resolve_active_plan_id and confirmed auto-resolve only queried the process-default lifecycle service database.
  • Reproduced the failure mode in an isolated environment by:
    1. Seeding a non-terminal plan (phase=execute, processing_state=complete) with CLEVERAGENTS_HOME set and CWD = home workspace.
    2. Resetting singleton process state (settings/container) to mimic a fresh subprocess invocation.
    3. Invoking plan correct <decision_id> --mode revert --guidance ... from a different CWD with the same CLEVERAGENTS_HOME and no --plan.
    4. Observing No active plan found before the fix.
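
The reproduction steps above hinge on the CWD-bound database and the `CLEVERAGENTS_HOME` database being different files. A minimal sketch of that divergence follows. It assumes, hypothetically, that the CWD database uses the same `.cleveragents/db.sqlite` layout as the home database; the helper names `db_path` and `lookup_misses_home` are illustrative and do not come from the codebase.

```python
from pathlib import Path

# Assumed layout: <base>/.cleveragents/db.sqlite for both the CWD and the home directory.
DB_RELATIVE = Path(".cleveragents") / "db.sqlite"


def db_path(base: Path) -> Path:
    """Return the normalized database path under `base` (the file need not exist)."""
    return (base / DB_RELATIVE).resolve(strict=False)


def lookup_misses_home(home: Path, cwd: Path) -> bool:
    """True when a lookup bound to the CWD database cannot see the home database."""
    return db_path(cwd) != db_path(home)


if __name__ == "__main__":
    home = Path("/tmp/e2e-home")
    # Step 1: the plan is seeded with CWD equal to the home workspace, so the paths agree.
    print(lookup_misses_home(home, cwd=home))
    # Step 3: the command runs from a different CWD, so the paths diverge.
    print(lookup_misses_home(home, cwd=Path("/tmp/elsewhere")))
```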

Code changes in progress:

  • Added fallback logic in plan.py::_resolve_active_plan_id to attempt active-plan lookup from $CLEVERAGENTS_HOME/.cleveragents/db.sqlite when the primary lookup finds no non-terminal plans.
  • Added Robot regression coverage:
    • robot/plan_correct_isolated_resolve.robot
    • robot/helper_plan_correct_isolated_resolve.py
      This test verifies auto-resolution without --plan across subprocess-like isolated invocations with differing working directories.

Current local verification:

  • New helper regression now passes after the fallback fix (issue-1025-plan-correct-isolated-resolve-ok).
  • Next: run full required quality gates (nox and coverage threshold checks), then finalize checklist/body updates and PR metadata.
Owner

Day 48 Planning Note — Missing TDD Dependency Link

This bug's TDD counterpart is #1035 (TDD: Write failing test for #1025 — plan correct auto-resolve), which is open (PR #1172 in review, needs rebase).

A Forgejo dependency link (bug #1025 depends on TDD #1035) should exist but could not be created via the API. A maintainer should manually add this dependency link through the Forgejo UI.

TDD workflow status: TDD PR #1172 has REQUEST_CHANGES (needs rebase + scope cleanup). Once merged, @brent.edwards should start the bugfix branch.

Member

Implementation journal update:

  • Root-cause analysis: _resolve_active_plan_id() only inspected plans from the lifecycle service bound to the process CWD database, so isolated subprocess invocations that set CLEVERAGENTS_HOME but execute from a different working directory could not find the active Execute/COMPLETE plan.
  • Design decision: keep existing primary lookup unchanged and add a best-effort fallback only when no active plans are found. This minimizes behavioral impact for normal CLI flows while fixing isolated execution paths.
  • Implementation details:
    • Updated src/cleveragents/cli/commands/plan.py in _resolve_active_plan_id() to add _resolve_from_cleveragents_home().
    • Fallback resolves $CLEVERAGENTS_HOME/.cleveragents/db.sqlite and compares it with the CWD database path (Path.resolve(strict=False)).
    • If paths differ, fallback opens a UnitOfWork against the CLEVERAGENTS_HOME sqlite URL and selects the first non-terminal lifecycle plan, returning plan.identity.plan_id.
    • Errors inside fallback are intentionally swallowed to preserve existing user-facing failure behavior if no active plan is available.
  • Regression test coverage (Robot + helper):
    • Added robot/helper_plan_correct_isolated_resolve.py to simulate two isolated invocations sharing CLEVERAGENTS_HOME but with different CWDs.
    • Added robot/plan_correct_isolated_resolve.robot to execute the helper and assert successful auto-resolution without --plan.
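
The fallback described in the implementation details can be sketched roughly as below. This is not the code from `plan.py`: it uses the standard-library `sqlite3` module in place of the project's `UnitOfWork`, and the `plans` table, its `plan_id` and `phase` columns, and the terminal phase names are all hypothetical stand-ins. The `exists()` guard is also an addition made here so that the sketch never creates an empty database file.

```python
import os
import sqlite3
from pathlib import Path
from typing import Optional

# Hypothetical terminal-phase names, for illustration only.
TERMINAL_PHASES = ("archived", "cancelled")


def resolve_from_home(cwd_db: Path) -> Optional[str]:
    """Best-effort lookup of the first non-terminal plan in the CLEVERAGENTS_HOME database."""
    home = os.environ.get("CLEVERAGENTS_HOME")
    if not home:
        return None
    try:
        home_db = (Path(home) / ".cleveragents" / "db.sqlite").resolve(strict=False)
        # Skip the fallback when the primary lookup already searched this file,
        # or when there is no database to open.
        if home_db == cwd_db.resolve(strict=False) or not home_db.exists():
            return None
        conn = sqlite3.connect(home_db)
        try:
            row = conn.execute(
                "SELECT plan_id FROM plans WHERE phase NOT IN (?, ?) "
                "ORDER BY rowid LIMIT 1",
                TERMINAL_PHASES,
            ).fetchone()
        finally:
            conn.close()
        return row[0] if row else None
    except Exception:
        # Swallowed on purpose: the caller then reports "No active plan found" as before.
        return None
```

Swallowing every exception keeps the user-facing error unchanged, at the cost of hiding problems such as a corrupt database, so a real implementation might want to log the exception at debug level.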

Validation results for this issue branch:

  • nox full default suite: PASS (all default sessions successful; total runtime ~55m)
  • nox -e e2e_tests: PASS (56/56)
  • nox -e integration_tests: PASS (1816/1816)
  • nox -e coverage_report: PASS, summary coverage 97%
  • nox -e lint: PASS
  • nox -e typecheck: PASS
  • nox -e unit_tests: PASS

No additional out-of-scope defects were introduced by this change set.

Member

Final implementation note:

  • Commit: 6e2556a51ae47130d285377b9f4f391930cfd97a
  • Branch: fix/plan-correct-resolve
  • PR: #1184 (https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/1184)

PR metadata verification:

  • Milestone: v3.3.0 set on PR
  • Type label: Type/Bug set on PR
  • Closing keyword: Closes #1025 included in PR body
  • Dependency direction configured: issue #1025 depends on PR #1184 (PR blocks issue)

CI status for commit 6e2556a5:

  • lint: success
  • typecheck: success
  • security: success
  • quality: success
  • unit_tests: success
  • integration_tests: success
  • e2e_tests: success
  • coverage: success
  • docker: success
  • helm: success
  • build: success
  • benchmark-regression: success
  • status-check: success

Issue transitioned to State/In Review.

Reference
cleveragents/cleveragents-core#1025