feat(autonomy): automation profile resolution precedence correct #1196

2026-03-29T08:30:07Z

brent.edwards commented

2026-03-29 08:30:07 +00:00

Summary

Fix automation profile resolution precedence at plan-creation time to follow plan > action > project > global.
Narrow exception handling in _resolve_plan_profile_ref to catch only (KeyError, ValueError, OSError) with warning-level structured logging, replacing bare except Exception: per CONTRIBUTING.md §Exception Propagation.
Fix TDD feature file tag/title inconsistency: align to @tdd_bug @tdd_bug_1076 / "TDD Bug #1076" referencing the original bug ticket.
Add BDD scenarios for plan-level profile override and config service error fallback.
Harden lifecycle and Robot/Behave test paths for auto-progression and parallel CI stability (including pre-seeded DB initialization for tdd_plan_explain_plan_id helper).
Update Robot helper timing/session patterns to reduce false negatives in long-running integration/E2E validations.

Approach

The automation profile resolution is implemented in PlanLifecycleService._resolve_plan_profile_ref, which evaluates the four-level precedence chain (plan > action > project > global) at use_action() time. The resolved profile is stored as an AutomationProfileRef on the Plan with provenance tracking. The ConfigService is used for project-scoped and global config lookups, with narrowed exception handling that logs warnings on failure and falls back to the settings default.

Validation

nox -s lint ✅
nox -s typecheck ✅
nox -s unit_tests ✅
nox -s coverage_report ✅ (97%)

Review Fix Round (Review #2963)

Bare except Exception: -> narrowed to (KeyError, ValueError, OSError) with warning logging - addresses CONTRIBUTING.md §Exception Propagation violation
Tag inconsistency fixed - @tdd_bug @tdd_bug_1076 with title "TDD Bug #1076"
Added 2 new BDD scenarios - plan-level override and config error fallback
Rebased onto latest master (532ea100)

Integration CI Follow-up

Addressed an intermittent integration_tests failure in workflow-05 helper logic caused by fixed-step lifecycle assumptions under auto-progression.
Updated helper lifecycle flow to be state-aware (phase + processing_state) before invoking each transition, covering both manual and auto-progressed paths.
Local verification after the fix:
- nox -s integration_tests ✅
- nox -s lint ✅
- nox -s typecheck ✅
- nox -s unit_tests ✅
- nox -s e2e_tests ✅
- nox -s coverage_report ✅

Closes #854

## Summary - Fix automation profile resolution precedence at plan-creation time to follow plan > action > project > global. - Narrow exception handling in `_resolve_plan_profile_ref` to catch only `(KeyError, ValueError, OSError)` with warning-level structured logging, replacing bare `except Exception:` per CONTRIBUTING.md §Exception Propagation. - Fix TDD feature file tag/title inconsistency: align to `@tdd_bug @tdd_bug_1076` / "TDD Bug #1076" referencing the original bug ticket. - Add BDD scenarios for plan-level profile override and config service error fallback. - Harden lifecycle and Robot/Behave test paths for auto-progression and parallel CI stability (including pre-seeded DB initialization for `tdd_plan_explain_plan_id` helper). - Update Robot helper timing/session patterns to reduce false negatives in long-running integration/E2E validations. ## Approach The automation profile resolution is implemented in `PlanLifecycleService._resolve_plan_profile_ref`, which evaluates the four-level precedence chain (plan > action > project > global) at `use_action()` time. The resolved profile is stored as an `AutomationProfileRef` on the Plan with provenance tracking. The `ConfigService` is used for project-scoped and global config lookups, with narrowed exception handling that logs warnings on failure and falls back to the settings default. ## Validation - `nox -s lint` ✅ - `nox -s typecheck` ✅ - `nox -s unit_tests` ✅ - `nox -s coverage_report` ✅ (97%) ## Review Fix Round (Review #2963) 1. **Bare `except Exception:` -> narrowed to `(KeyError, ValueError, OSError)` with warning logging** - addresses CONTRIBUTING.md §Exception Propagation violation 2. **Tag inconsistency fixed** - `@tdd_bug @tdd_bug_1076` with title "TDD Bug #1076" 3. **Added 2 new BDD scenarios** - plan-level override and config error fallback 4. **Rebased onto latest master** (532ea100) ## Integration CI Follow-up 1. Addressed an intermittent `integration_tests` failure in workflow-05 helper logic caused by fixed-step lifecycle assumptions under auto-progression. 2. Updated helper lifecycle flow to be state-aware (`phase` + `processing_state`) before invoking each transition, covering both manual and auto-progressed paths. 3. Local verification after the fix: - `nox -s integration_tests` ✅ - `nox -s lint` ✅ - `nox -s typecheck` ✅ - `nox -s unit_tests` ✅ - `nox -s e2e_tests` ✅ - `nox -s coverage_report` ✅ Closes #854

brent.edwards added this to the v3.5.0 milestone 2026-03-29 08:30:07 +00:00

brent.edwards added the

Type

Feature

label 2026-03-29 08:30:31 +00:00

brent.edwards added a new dependency 2026-03-29 08:31:19 +00:00

#854 feat(autonomy): automation profile resolution precedence correct (plan > action > global)

brent.edwards referenced this pull request

2026-03-29 08:31:37 +00:00

feat(autonomy): automation profile resolution precedence correct (plan > action > global) #854

brent.edwards referenced this pull request

2026-03-29 08:46:35 +00:00

feat(autonomy): automation profile resolution precedence correct (plan > action > global) #854

freemo approved these changes 2026-03-30 04:20:33 +00:00

Dismissed

freemo left a comment

Review: APPROVED with Comments

Small, well-focused change implementing 4-level automation profile precedence.

Notes

Consider adding BDD scenarios specifically testing each precedence level (plan > action > project > global) individually. Currently only the TDD bug-capture test is updated.
The duplicate validation (CLI validates profile name, service also validates) is acceptable as defense-in-depth.
Good: Clean precedence chain logic with structured logging at each resolution level.
Good: ConfigService properly injected with a sensible default.

## Review: APPROVED with Comments Small, well-focused change implementing 4-level automation profile precedence. ### Notes 1. Consider adding BDD scenarios specifically testing each precedence level (plan > action > project > global) individually. Currently only the TDD bug-capture test is updated. 2. The duplicate validation (CLI validates profile name, service also validates) is acceptable as defense-in-depth. 3. **Good**: Clean precedence chain logic with structured logging at each resolution level. 4. **Good**: `ConfigService` properly injected with a sensible default.

freemo requested changes 2026-03-30 04:48:49 +00:00

freemo left a comment

Updated Review (Deep Pass): REQUEST CHANGES

My initial review approved this PR. The deep review reveals a blocking issue.

New Finding: Bare `except Exception:` in `_resolve_plan_profile_ref`

At line ~905 of plan_lifecycle_service.py, the config resolution is wrapped in:

except Exception:
    ...  # silently falls back to settings.default_automation_profile

This catches all exceptions — including config corruption, permission errors, import failures, AttributeError, etc. — and silently falls back to the default profile. Per CONTRIBUTING.md §Exception Propagation: "Do not suppress errors. Let exceptions propagate."

Action required: Either narrow to specific expected exceptions (e.g., KeyError, FileNotFoundError) or at minimum log the caught exception at warning level so issues are traceable.

New Finding: Tag inconsistency

The feature file tag was changed from @tdd_issue_1076 to @tdd_issue_854, but the feature title still says "TDD Issue #1076". If #854 is correct, update the title. If #1076 is correct, revert the tag.

Previous findings still apply:

Clean precedence chain logic with structured logging
ConfigService properly injected with sensible default
Duplicate validation (CLI + service) is acceptable defense-in-depth

## Updated Review (Deep Pass): REQUEST CHANGES My initial review approved this PR. The deep review reveals a blocking issue. ### New Finding: Bare `except Exception:` in `_resolve_plan_profile_ref` At line ~905 of `plan_lifecycle_service.py`, the config resolution is wrapped in: ```python except Exception: ... # silently falls back to settings.default_automation_profile ``` This catches **all exceptions** — including config corruption, permission errors, import failures, `AttributeError`, etc. — and silently falls back to the default profile. Per CONTRIBUTING.md §Exception Propagation: "Do not suppress errors. Let exceptions propagate." **Action required:** Either narrow to specific expected exceptions (e.g., `KeyError`, `FileNotFoundError`) or at minimum log the caught exception at `warning` level so issues are traceable. ### New Finding: Tag inconsistency The feature file tag was changed from `@tdd_issue_1076` to `@tdd_issue_854`, but the feature title still says "TDD Issue #1076". If #854 is correct, update the title. If #1076 is correct, revert the tag. ### Previous findings still apply: - Clean precedence chain logic with structured logging - `ConfigService` properly injected with sensible default - Duplicate validation (CLI + service) is acceptable defense-in-depth

freemo requested review from freemo 2026-03-30 05:14:05 +00:00

freemo requested review from CoreRasurae 2026-03-30 05:14:06 +00:00

brent.edwards referenced this pull request

2026-03-30 12:04:35 +00:00

feat(autonomy): automation profile resolution precedence correct (plan > action > global) #854

brent.edwards referenced this pull request

2026-03-30 13:09:19 +00:00

feat(autonomy): automation profile resolution precedence correct (plan > action > global) #854

brent.edwards force-pushed feature/m6-automation-profiles from 859b74a5cb to d35280ff18

2026-03-31 03:15:41 +00:00

Compare

brent.edwards commented

2026-03-31 03:16:17 +00:00

Review Response — Addressing Review #2963 (REQUEST_CHANGES by @freemo)

1. Bare `except Exception:` in `_resolve_plan_profile_ref` ✅ Fixed

Narrowed the exception handler to except (KeyError, ValueError, OSError) as exc: and added warning-level structured logging that records:

The error message and type
The settings fallback value being used

This covers the three exception categories that ConfigService.resolve() can legitimately raise (dict key errors, unregistered keys / TOML parse errors, file I/O issues). Any unexpected exception type (programming errors) will now propagate as required by CONTRIBUTING.md §Exception Propagation.

Added a new BDD scenario ("Plan falls back to settings default when config service raises an error") that exercises this handler via a _BrokenConfigService test double.

2. Tag inconsistency ✅ Fixed

Updated features/tdd_use_action_automation_profile.feature:

Tags: @tdd_issue @tdd_issue_854 → @tdd_bug @tdd_bug_1076
Title: "TDD Issue #1076" → "TDD Bug #1076"

The TDD test was correctly associated with bug #1076 (use_action() doesn't propagate automation_profile). Aligned tag convention to @tdd_bug per project standards.

3. Previous approved comments (informational) — Acknowledged

BDD scenarios for each precedence level: Added a new scenario for plan-level override of action-level profile. Together with the existing scenarios, the feature file now covers action-level, project-level, global-level, plan-level override, and config-error fallback. More granular individual-level tests can be added in a follow-up if desired.
Clean precedence chain logic / ConfigService injection / Duplicate validation — No changes needed, these were already approved.

Quality Gates (all pass)

nox -s lint ✅
nox -s typecheck ✅
nox -s unit_tests ✅
nox -s coverage_report ✅ (97.0%)

Rebase

Branch rebased onto latest origin/master (532ea100). Resolved one conflict in robot/resource_dag.robot (shared session variable naming — took master's convention).

## Review Response — Addressing Review #2963 (REQUEST_CHANGES by @freemo) ### 1. Bare `except Exception:` in `_resolve_plan_profile_ref` ✅ Fixed Narrowed the exception handler to `except (KeyError, ValueError, OSError) as exc:` and added warning-level structured logging that records: - The error message and type - The settings fallback value being used This covers the three exception categories that `ConfigService.resolve()` can legitimately raise (dict key errors, unregistered keys / TOML parse errors, file I/O issues). Any unexpected exception type (programming errors) will now propagate as required by CONTRIBUTING.md §Exception Propagation. Added a new BDD scenario ("Plan falls back to settings default when config service raises an error") that exercises this handler via a `_BrokenConfigService` test double. ### 2. Tag inconsistency ✅ Fixed Updated `features/tdd_use_action_automation_profile.feature`: - Tags: `@tdd_issue @tdd_issue_854` → `@tdd_bug @tdd_bug_1076` - Title: "TDD Issue #1076" → "TDD Bug #1076" The TDD test was correctly associated with bug #1076 (use_action() doesn't propagate automation_profile). Aligned tag convention to `@tdd_bug` per project standards. ### 3. Previous approved comments (informational) — Acknowledged - **BDD scenarios for each precedence level:** Added a new scenario for plan-level override of action-level profile. Together with the existing scenarios, the feature file now covers action-level, project-level, global-level, plan-level override, and config-error fallback. More granular individual-level tests can be added in a follow-up if desired. - **Clean precedence chain logic / ConfigService injection / Duplicate validation** — No changes needed, these were already approved. ### Quality Gates (all pass) - `nox -s lint` ✅ - `nox -s typecheck` ✅ - `nox -s unit_tests` ✅ - `nox -s coverage_report` ✅ (97.0%) ### Rebase Branch rebased onto latest `origin/master` (532ea100). Resolved one conflict in `robot/resource_dag.robot` (shared session variable naming — took master's convention).

brent.edwards force-pushed feature/m6-automation-profiles from 8111a98847 to 915e791a05

2026-04-01 02:04:50 +00:00

Compare

brent.edwards added 1 commit 2026-04-01 03:43:34 +00:00

chore(ci): rerun flaky checks

CI / benchmark-publish (pull_request) Waiting to run

Details

CI / build (pull_request) Successful in 17s

Details

CI / helm (pull_request) Successful in 22s

Details

CI / lint (pull_request) Successful in 3m19s

Details

CI / quality (pull_request) Successful in 3m43s

Details

CI / typecheck (pull_request) Successful in 3m54s

Details

CI / benchmark-regression (pull_request) Waiting to run

Details

CI / security (pull_request) Successful in 4m4s

Details

CI / unit_tests (pull_request) Successful in 9m20s

Details

CI / docker (pull_request) Successful in 1m18s

Details

CI / coverage (pull_request) Successful in 12m7s

Details

CI / integration_tests (pull_request) Successful in 21m11s

Details

CI / e2e_tests (pull_request) Successful in 18m59s

Details

CI / status-check (pull_request) Successful in 1s

Details

c5dc794e03

brent.edwards added 1 commit 2026-04-01 04:05:57 +00:00

Merge branch 'master' into feature/m6-automation-profiles

CI / build (pull_request) Successful in 19s

Details

CI / lint (pull_request) Successful in 26s

Details

CI / helm (pull_request) Successful in 50s

Details

CI / quality (pull_request) Successful in 3m43s

Details

CI / typecheck (pull_request) Successful in 3m54s

Details

CI / security (pull_request) Successful in 4m7s

Details

CI / unit_tests (pull_request) Successful in 9m16s

Details

CI / docker (pull_request) Successful in 1m31s

Details

CI / coverage (pull_request) Successful in 8m52s

Details

CI / e2e_tests (pull_request) Successful in 20m0s

Details

CI / integration_tests (pull_request) Successful in 21m8s

Details

CI / status-check (pull_request) Successful in 1s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / benchmark-regression (pull_request) Successful in 55m5s

Details

172ab9afa9

brent.edwards removed a dependency 2026-04-01 04:06:50 +00:00

#854 feat(autonomy): automation profile resolution precedence correct (plan > action > global)

brent.edwards scheduled this pull request to auto merge when all checks succeed 2026-04-01 04:06:59 +00:00

brent.edwards commented

2026-04-01 04:16:48 +00:00

Implementation note: addressing CI integration_tests instability found on this PR head.

Review item addressed: intermittent workflow-05 lifecycle helper failure after complete_strategize when auto-progression advances the plan state earlier than the helper expected.
Root cause: robot/helper_int_wf05_db_migration.py assumed a fixed linear transition sequence (complete_strategize -> execute_plan -> start_execute -> complete_execute). With auto-progression enabled, the plan can already be in EXECUTE/QUEUED (or later), so unconditional calls can become invalid and fail nondeterministically.
Change made: updated _review_profile_behavior() and _plan_lifecycle_review_profile() to read the current (phase, processing_state) after each transition and only invoke the next operation when valid for that state.
Design decision: keep assertions strict about expected states while making transition calls state-aware; this preserves behavioral guarantees without introducing sleeps/retries.
Result: helper now supports both legitimate lifecycle paths (manual step-by-step and auto-progressed handoff), removing the race-sensitive assumption.

Key code location:

robot/helper_int_wf05_db_migration.py (_review_profile_behavior, _plan_lifecycle_review_profile)

Implementation note: addressing CI `integration_tests` instability found on this PR head. - Review item addressed: intermittent workflow-05 lifecycle helper failure after `complete_strategize` when auto-progression advances the plan state earlier than the helper expected. - Root cause: `robot/helper_int_wf05_db_migration.py` assumed a fixed linear transition sequence (`complete_strategize -> execute_plan -> start_execute -> complete_execute`). With auto-progression enabled, the plan can already be in `EXECUTE/QUEUED` (or later), so unconditional calls can become invalid and fail nondeterministically. - Change made: updated `_review_profile_behavior()` and `_plan_lifecycle_review_profile()` to read the current `(phase, processing_state)` after each transition and only invoke the next operation when valid for that state. - Design decision: keep assertions strict about expected states while making transition calls state-aware; this preserves behavioral guarantees without introducing sleeps/retries. - Result: helper now supports both legitimate lifecycle paths (manual step-by-step and auto-progressed handoff), removing the race-sensitive assumption. Key code location: - `robot/helper_int_wf05_db_migration.py` (`_review_profile_behavior`, `_plan_lifecycle_review_profile`)

brent.edwards merged commit 34dcf5226f into master

2026-04-01 04:27:09 +00:00

brent.edwards deleted branch feature/m6-automation-profiles

2026-04-01 04:27:09 +00:00

brent.edwards referenced this issue from a commit

2026-04-01 04:27:11 +00:00

feat(autonomy): automation profile resolution precedence correct (#1196)

freemo referenced this pull request

2026-04-02 16:58:16 +00:00

[Automated] Product Build Session State #1314

freemo referenced this pull request

2026-04-02 19:05:30 +00:00