feat(autonomy): automation profile resolution precedence correct #1196

Merged
brent.edwards merged 4 commits from feature/m6-automation-profiles into master 2026-04-01 04:27:09 +00:00
Member

Summary

  • Fix automation profile resolution precedence at plan-creation time to follow plan > action > project > global.
  • Narrow exception handling in _resolve_plan_profile_ref to catch only (KeyError, ValueError, OSError) with warning-level structured logging, replacing bare except Exception: per CONTRIBUTING.md §Exception Propagation.
  • Fix TDD feature file tag/title inconsistency: align to @tdd_bug @tdd_bug_1076 / "TDD Bug #1076" referencing the original bug ticket.
  • Add BDD scenarios for plan-level profile override and config service error fallback.
  • Harden lifecycle and Robot/Behave test paths for auto-progression and parallel CI stability (including pre-seeded DB initialization for tdd_plan_explain_plan_id helper).
  • Update Robot helper timing/session patterns to reduce false negatives in long-running integration/E2E validations.

Approach

The automation profile resolution is implemented in PlanLifecycleService._resolve_plan_profile_ref, which evaluates the four-level precedence chain (plan > action > project > global) at use_action() time. The resolved profile is stored as an AutomationProfileRef on the Plan with provenance tracking. The ConfigService is used for project-scoped and global config lookups, with narrowed exception handling that logs warnings on failure and falls back to the settings default.

Validation

  • nox -s lint
  • nox -s typecheck
  • nox -s unit_tests
  • nox -s coverage_report (97%)

Review Fix Round (Review #2963)

  1. Bare except Exception: -> narrowed to (KeyError, ValueError, OSError) with warning logging - addresses CONTRIBUTING.md §Exception Propagation violation
  2. Tag inconsistency fixed - @tdd_bug @tdd_bug_1076 with title "TDD Bug #1076"
  3. Added 2 new BDD scenarios - plan-level override and config error fallback
  4. Rebased onto latest master (532ea100)

Integration CI Follow-up

  1. Addressed an intermittent integration_tests failure in workflow-05 helper logic caused by fixed-step lifecycle assumptions under auto-progression.
  2. Updated helper lifecycle flow to be state-aware (phase + processing_state) before invoking each transition, covering both manual and auto-progressed paths.
  3. Local verification after the fix:
    • nox -s integration_tests
    • nox -s lint
    • nox -s typecheck
    • nox -s unit_tests
    • nox -s e2e_tests
    • nox -s coverage_report

Closes #854

## Summary - Fix automation profile resolution precedence at plan-creation time to follow plan > action > project > global. - Narrow exception handling in `_resolve_plan_profile_ref` to catch only `(KeyError, ValueError, OSError)` with warning-level structured logging, replacing bare `except Exception:` per CONTRIBUTING.md §Exception Propagation. - Fix TDD feature file tag/title inconsistency: align to `@tdd_bug @tdd_bug_1076` / "TDD Bug #1076" referencing the original bug ticket. - Add BDD scenarios for plan-level profile override and config service error fallback. - Harden lifecycle and Robot/Behave test paths for auto-progression and parallel CI stability (including pre-seeded DB initialization for `tdd_plan_explain_plan_id` helper). - Update Robot helper timing/session patterns to reduce false negatives in long-running integration/E2E validations. ## Approach The automation profile resolution is implemented in `PlanLifecycleService._resolve_plan_profile_ref`, which evaluates the four-level precedence chain (plan > action > project > global) at `use_action()` time. The resolved profile is stored as an `AutomationProfileRef` on the Plan with provenance tracking. The `ConfigService` is used for project-scoped and global config lookups, with narrowed exception handling that logs warnings on failure and falls back to the settings default. ## Validation - `nox -s lint` ✅ - `nox -s typecheck` ✅ - `nox -s unit_tests` ✅ - `nox -s coverage_report` ✅ (97%) ## Review Fix Round (Review #2963) 1. **Bare `except Exception:` -> narrowed to `(KeyError, ValueError, OSError)` with warning logging** - addresses CONTRIBUTING.md §Exception Propagation violation 2. **Tag inconsistency fixed** - `@tdd_bug @tdd_bug_1076` with title "TDD Bug #1076" 3. **Added 2 new BDD scenarios** - plan-level override and config error fallback 4. **Rebased onto latest master** (532ea100) ## Integration CI Follow-up 1. Addressed an intermittent `integration_tests` failure in workflow-05 helper logic caused by fixed-step lifecycle assumptions under auto-progression. 2. Updated helper lifecycle flow to be state-aware (`phase` + `processing_state`) before invoking each transition, covering both manual and auto-progressed paths. 3. Local verification after the fix: - `nox -s integration_tests` ✅ - `nox -s lint` ✅ - `nox -s typecheck` ✅ - `nox -s unit_tests` ✅ - `nox -s e2e_tests` ✅ - `nox -s coverage_report` ✅ Closes #854
brent.edwards added this to the v3.5.0 milestone 2026-03-29 08:30:07 +00:00
freemo approved these changes 2026-03-30 04:20:33 +00:00
Dismissed
freemo left a comment

Review: APPROVED with Comments

Small, well-focused change implementing 4-level automation profile precedence.

Notes

  1. Consider adding BDD scenarios specifically testing each precedence level (plan > action > project > global) individually. Currently only the TDD bug-capture test is updated.

  2. The duplicate validation (CLI validates profile name, service also validates) is acceptable as defense-in-depth.

  3. Good: Clean precedence chain logic with structured logging at each resolution level.

  4. Good: ConfigService properly injected with a sensible default.

## Review: APPROVED with Comments Small, well-focused change implementing 4-level automation profile precedence. ### Notes 1. Consider adding BDD scenarios specifically testing each precedence level (plan > action > project > global) individually. Currently only the TDD bug-capture test is updated. 2. The duplicate validation (CLI validates profile name, service also validates) is acceptable as defense-in-depth. 3. **Good**: Clean precedence chain logic with structured logging at each resolution level. 4. **Good**: `ConfigService` properly injected with a sensible default.
freemo left a comment

Updated Review (Deep Pass): REQUEST CHANGES

My initial review approved this PR. The deep review reveals a blocking issue.

New Finding: Bare except Exception: in _resolve_plan_profile_ref

At line ~905 of plan_lifecycle_service.py, the config resolution is wrapped in:

except Exception:
    ...  # silently falls back to settings.default_automation_profile

This catches all exceptions — including config corruption, permission errors, import failures, AttributeError, etc. — and silently falls back to the default profile. Per CONTRIBUTING.md §Exception Propagation: "Do not suppress errors. Let exceptions propagate."

Action required: Either narrow to specific expected exceptions (e.g., KeyError, FileNotFoundError) or at minimum log the caught exception at warning level so issues are traceable.

New Finding: Tag inconsistency

The feature file tag was changed from @tdd_issue_1076 to @tdd_issue_854, but the feature title still says "TDD Issue #1076". If #854 is correct, update the title. If #1076 is correct, revert the tag.

Previous findings still apply:

  • Clean precedence chain logic with structured logging
  • ConfigService properly injected with sensible default
  • Duplicate validation (CLI + service) is acceptable defense-in-depth
## Updated Review (Deep Pass): REQUEST CHANGES My initial review approved this PR. The deep review reveals a blocking issue. ### New Finding: Bare `except Exception:` in `_resolve_plan_profile_ref` At line ~905 of `plan_lifecycle_service.py`, the config resolution is wrapped in: ```python except Exception: ... # silently falls back to settings.default_automation_profile ``` This catches **all exceptions** — including config corruption, permission errors, import failures, `AttributeError`, etc. — and silently falls back to the default profile. Per CONTRIBUTING.md §Exception Propagation: "Do not suppress errors. Let exceptions propagate." **Action required:** Either narrow to specific expected exceptions (e.g., `KeyError`, `FileNotFoundError`) or at minimum log the caught exception at `warning` level so issues are traceable. ### New Finding: Tag inconsistency The feature file tag was changed from `@tdd_issue_1076` to `@tdd_issue_854`, but the feature title still says "TDD Issue #1076". If #854 is correct, update the title. If #1076 is correct, revert the tag. ### Previous findings still apply: - Clean precedence chain logic with structured logging - `ConfigService` properly injected with sensible default - Duplicate validation (CLI + service) is acceptable defense-in-depth
brent.edwards force-pushed feature/m6-automation-profiles from 859b74a5cb
All checks were successful
CI / build (pull_request) Successful in 19s
CI / helm (pull_request) Successful in 20s
CI / typecheck (pull_request) Successful in 52s
CI / security (pull_request) Successful in 1m0s
CI / quality (pull_request) Successful in 4m4s
CI / benchmark-publish (pull_request) Has been skipped
CI / integration_tests (pull_request) Successful in 9m15s
CI / unit_tests (pull_request) Successful in 9m25s
CI / lint (pull_request) Successful in 9m32s
CI / e2e_tests (pull_request) Successful in 10m58s
CI / docker (pull_request) Successful in 1m22s
CI / coverage (pull_request) Successful in 8m24s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 53m46s
to d35280ff18
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 20s
CI / helm (pull_request) Successful in 21s
CI / lint (pull_request) Successful in 24s
CI / typecheck (pull_request) Successful in 48s
CI / quality (pull_request) Successful in 57s
CI / security (pull_request) Successful in 1m2s
CI / integration_tests (pull_request) Successful in 7m0s
CI / unit_tests (pull_request) Successful in 7m41s
CI / docker (pull_request) Successful in 1m21s
CI / coverage (pull_request) Successful in 11m55s
CI / e2e_tests (pull_request) Successful in 16m19s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 54m59s
2026-03-31 03:15:41 +00:00
Compare
Author
Member

Review Response — Addressing Review #2963 (REQUEST_CHANGES by @freemo)

1. Bare except Exception: in _resolve_plan_profile_ref Fixed

Narrowed the exception handler to except (KeyError, ValueError, OSError) as exc: and added warning-level structured logging that records:

  • The error message and type
  • The settings fallback value being used

This covers the three exception categories that ConfigService.resolve() can legitimately raise (dict key errors, unregistered keys / TOML parse errors, file I/O issues). Any unexpected exception type (programming errors) will now propagate as required by CONTRIBUTING.md §Exception Propagation.

Added a new BDD scenario ("Plan falls back to settings default when config service raises an error") that exercises this handler via a _BrokenConfigService test double.

2. Tag inconsistency Fixed

Updated features/tdd_use_action_automation_profile.feature:

  • Tags: @tdd_issue @tdd_issue_854@tdd_bug @tdd_bug_1076
  • Title: "TDD Issue #1076" → "TDD Bug #1076"

The TDD test was correctly associated with bug #1076 (use_action() doesn't propagate automation_profile). Aligned tag convention to @tdd_bug per project standards.

3. Previous approved comments (informational) — Acknowledged

  • BDD scenarios for each precedence level: Added a new scenario for plan-level override of action-level profile. Together with the existing scenarios, the feature file now covers action-level, project-level, global-level, plan-level override, and config-error fallback. More granular individual-level tests can be added in a follow-up if desired.
  • Clean precedence chain logic / ConfigService injection / Duplicate validation — No changes needed, these were already approved.

Quality Gates (all pass)

  • nox -s lint
  • nox -s typecheck
  • nox -s unit_tests
  • nox -s coverage_report (97.0%)

Rebase

Branch rebased onto latest origin/master (532ea100). Resolved one conflict in robot/resource_dag.robot (shared session variable naming — took master's convention).

## Review Response — Addressing Review #2963 (REQUEST_CHANGES by @freemo) ### 1. Bare `except Exception:` in `_resolve_plan_profile_ref` ✅ Fixed Narrowed the exception handler to `except (KeyError, ValueError, OSError) as exc:` and added warning-level structured logging that records: - The error message and type - The settings fallback value being used This covers the three exception categories that `ConfigService.resolve()` can legitimately raise (dict key errors, unregistered keys / TOML parse errors, file I/O issues). Any unexpected exception type (programming errors) will now propagate as required by CONTRIBUTING.md §Exception Propagation. Added a new BDD scenario ("Plan falls back to settings default when config service raises an error") that exercises this handler via a `_BrokenConfigService` test double. ### 2. Tag inconsistency ✅ Fixed Updated `features/tdd_use_action_automation_profile.feature`: - Tags: `@tdd_issue @tdd_issue_854` → `@tdd_bug @tdd_bug_1076` - Title: "TDD Issue #1076" → "TDD Bug #1076" The TDD test was correctly associated with bug #1076 (use_action() doesn't propagate automation_profile). Aligned tag convention to `@tdd_bug` per project standards. ### 3. Previous approved comments (informational) — Acknowledged - **BDD scenarios for each precedence level:** Added a new scenario for plan-level override of action-level profile. Together with the existing scenarios, the feature file now covers action-level, project-level, global-level, plan-level override, and config-error fallback. More granular individual-level tests can be added in a follow-up if desired. - **Clean precedence chain logic / ConfigService injection / Duplicate validation** — No changes needed, these were already approved. ### Quality Gates (all pass) - `nox -s lint` ✅ - `nox -s typecheck` ✅ - `nox -s unit_tests` ✅ - `nox -s coverage_report` ✅ (97.0%) ### Rebase Branch rebased onto latest `origin/master` (532ea100). Resolved one conflict in `robot/resource_dag.robot` (shared session variable naming — took master's convention).
brent.edwards force-pushed feature/m6-automation-profiles from 8111a98847
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 17s
CI / helm (pull_request) Successful in 23s
CI / quality (pull_request) Has been cancelled
CI / typecheck (pull_request) Has been cancelled
CI / e2e_tests (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
CI / integration_tests (pull_request) Has been cancelled
CI / unit_tests (pull_request) Has been cancelled
CI / security (pull_request) Has been cancelled
CI / status-check (pull_request) Has been cancelled
CI / coverage (pull_request) Has been cancelled
CI / docker (pull_request) Has been cancelled
CI / benchmark-regression (pull_request) Has been cancelled
to 915e791a05
Some checks failed
CI / status-check (pull_request) Blocked by required conditions
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 18s
CI / helm (pull_request) Successful in 23s
CI / security (pull_request) Successful in 1m10s
CI / lint (pull_request) Successful in 3m19s
CI / quality (pull_request) Successful in 3m44s
CI / typecheck (pull_request) Successful in 3m57s
CI / unit_tests (pull_request) Successful in 9m4s
CI / docker (pull_request) Failing after 1m19s
CI / coverage (pull_request) Successful in 9m45s
CI / e2e_tests (pull_request) Successful in 19m20s
CI / benchmark-regression (pull_request) Failing after 22m54s
CI / integration_tests (pull_request) Failing after 26m50s
2026-04-01 02:04:50 +00:00
Compare
chore(ci): rerun flaky checks
Some checks are pending
CI / benchmark-publish (pull_request) Waiting to run
CI / build (pull_request) Successful in 17s
CI / helm (pull_request) Successful in 22s
CI / lint (pull_request) Successful in 3m19s
CI / quality (pull_request) Successful in 3m43s
CI / typecheck (pull_request) Successful in 3m54s
CI / benchmark-regression (pull_request) Waiting to run
CI / security (pull_request) Successful in 4m4s
CI / unit_tests (pull_request) Successful in 9m20s
CI / docker (pull_request) Successful in 1m18s
CI / coverage (pull_request) Successful in 12m7s
CI / integration_tests (pull_request) Successful in 21m11s
CI / e2e_tests (pull_request) Successful in 18m59s
CI / status-check (pull_request) Successful in 1s
c5dc794e03
Merge branch 'master' into feature/m6-automation-profiles
All checks were successful
CI / build (pull_request) Successful in 19s
CI / lint (pull_request) Successful in 26s
CI / helm (pull_request) Successful in 50s
CI / quality (pull_request) Successful in 3m43s
CI / typecheck (pull_request) Successful in 3m54s
CI / security (pull_request) Successful in 4m7s
CI / unit_tests (pull_request) Successful in 9m16s
CI / docker (pull_request) Successful in 1m31s
CI / coverage (pull_request) Successful in 8m52s
CI / e2e_tests (pull_request) Successful in 20m0s
CI / integration_tests (pull_request) Successful in 21m8s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Successful in 55m5s
172ab9afa9
brent.edwards scheduled this pull request to auto merge when all checks succeed 2026-04-01 04:06:59 +00:00
Author
Member

Implementation note: addressing CI integration_tests instability found on this PR head.

  • Review item addressed: intermittent workflow-05 lifecycle helper failure after complete_strategize when auto-progression advances the plan state earlier than the helper expected.
  • Root cause: robot/helper_int_wf05_db_migration.py assumed a fixed linear transition sequence (complete_strategize -> execute_plan -> start_execute -> complete_execute). With auto-progression enabled, the plan can already be in EXECUTE/QUEUED (or later), so unconditional calls can become invalid and fail nondeterministically.
  • Change made: updated _review_profile_behavior() and _plan_lifecycle_review_profile() to read the current (phase, processing_state) after each transition and only invoke the next operation when valid for that state.
  • Design decision: keep assertions strict about expected states while making transition calls state-aware; this preserves behavioral guarantees without introducing sleeps/retries.
  • Result: helper now supports both legitimate lifecycle paths (manual step-by-step and auto-progressed handoff), removing the race-sensitive assumption.

Key code location:

  • robot/helper_int_wf05_db_migration.py (_review_profile_behavior, _plan_lifecycle_review_profile)
Implementation note: addressing CI `integration_tests` instability found on this PR head. - Review item addressed: intermittent workflow-05 lifecycle helper failure after `complete_strategize` when auto-progression advances the plan state earlier than the helper expected. - Root cause: `robot/helper_int_wf05_db_migration.py` assumed a fixed linear transition sequence (`complete_strategize -> execute_plan -> start_execute -> complete_execute`). With auto-progression enabled, the plan can already be in `EXECUTE/QUEUED` (or later), so unconditional calls can become invalid and fail nondeterministically. - Change made: updated `_review_profile_behavior()` and `_plan_lifecycle_review_profile()` to read the current `(phase, processing_state)` after each transition and only invoke the next operation when valid for that state. - Design decision: keep assertions strict about expected states while making transition calls state-aware; this preserves behavioral guarantees without introducing sleeps/retries. - Result: helper now supports both legitimate lifecycle paths (manual step-by-step and auto-progressed handoff), removing the race-sensitive assumption. Key code location: - `robot/helper_int_wf05_db_migration.py` (`_review_profile_behavior`, `_plan_lifecycle_review_profile`)
brent.edwards deleted branch feature/m6-automation-profiles 2026-04-01 04:27:09 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!1196
No description provided.