refactor(autonomy): rename automation profile task flags to match specification field names #902

Closed
opened 2026-03-13 23:47:58 +00:00 by freemo · 4 comments
Owner

Metadata

  • Commit Message: refactor(autonomy): rename automation profile task flags to spec names
  • Branch: feature/m5-automation-profile-fields

Background and Context

The specification defines 11 task-type confidence threshold flags on each automation profile. The code implements 11 threshold fields but with completely different names — using phase-transition semantics instead of the spec's task-type semantics.

| #  | Spec field name    | Code field name (actual)   |
|----|--------------------|----------------------------|
| 1  | decompose_task     | auto_strategize            |
| 2  | create_tool        | auto_execute               |
| 3  | select_tool        | auto_apply                 |
| 4  | edit_code          | auto_decisions_strategize  |
| 5  | execute_command    | auto_decisions_execute     |
| 6  | create_file        | auto_validation_fix        |
| 7  | delete_content     | auto_strategy_revision     |
| 8  | access_network     | auto_reversion_from_apply  |
| 9  | install_dependency | auto_child_plans           |
| 10 | modify_config      | auto_retry_transient       |
| 11 | approve_plan       | auto_checkpoint_restore    |

The SafetyProfile sub-model matches the spec exactly (all 8 fields).

The code's phase-transition approach (auto_strategize, auto_execute, auto_apply) may be a valid implementation strategy, but the field names must align with the spec for YAML config compatibility and documentation accuracy. All 8 built-in profiles (manual, review, supervised, cautious, trusted, auto, ci, full-auto) need their threshold values remapped.
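The rename table above can be captured as a plain lookup. The following is a minimal sketch, not code from the repository: `rename_legacy_thresholds` is a hypothetical helper for converting an already-loaded legacy config dict, and only the eleven name pairs come from this issue.

```python
# Spec field name -> legacy code field name, copied from the table in this issue.
SPEC_TO_LEGACY = {
    "decompose_task": "auto_strategize",
    "create_tool": "auto_execute",
    "select_tool": "auto_apply",
    "edit_code": "auto_decisions_strategize",
    "execute_command": "auto_decisions_execute",
    "create_file": "auto_validation_fix",
    "delete_content": "auto_strategy_revision",
    "access_network": "auto_reversion_from_apply",
    "install_dependency": "auto_child_plans",
    "modify_config": "auto_retry_transient",
    "approve_plan": "auto_checkpoint_restore",
}

# Inverted view, used when converting old configs to the spec names.
LEGACY_TO_SPEC = {legacy: spec for spec, legacy in SPEC_TO_LEGACY.items()}


def rename_legacy_thresholds(config: dict) -> dict:
    """Return a copy of `config` with legacy keys renamed; values are untouched.

    Hypothetical helper for illustration only.
    """
    renamed = {}
    for key, value in config.items():
        new_key = LEGACY_TO_SPEC.get(key, key)
        if new_key in renamed:
            # Both the legacy and the spec name were present for one threshold.
            raise ValueError(f"duplicate threshold after rename: {new_key}")
        renamed[new_key] = value
    return renamed
```

Because the threshold values pass through unchanged, a helper like this only moves values to their new keys; it does not reinterpret them.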

Spec reference: Automation Profile section, ~lines 28216-28432

Acceptance Criteria

  • All 11 task flag field names match the specification exactly
  • All 8 built-in profiles have correct threshold values under the new field names
  • YAML automation profile configs use the spec field names
  • agents automation-profile show displays the spec field names
  • Profile resolution logic (plan > action > project > global) still works correctly
  • AutonomyController references updated field names
  • Database migration (if persisted) renames the columns
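For reference, the resolution order named in the criteria (plan > action > project > global) amounts to a first-match lookup. This is a sketch under assumptions: the function name, the scope keys, and the dict-of-overrides shape are hypothetical and do not describe the actual resolver.

```python
from typing import Mapping, Optional

# Highest precedence first, as stated in the acceptance criteria.
PRECEDENCE = ("plan", "action", "project", "global")


def resolve_profile_name(overrides: Mapping[str, Optional[str]]) -> str:
    """Return the profile name from the highest-precedence scope that sets one.

    `overrides` maps a scope name to a profile name, or to None when unset.
    """
    for scope in PRECEDENCE:
        name = overrides.get(scope)
        if name is not None:
            return name
    raise LookupError("no automation profile configured at any scope")
```

With `{"global": "manual", "project": "trusted"}` the project-level value wins, since project outranks global and no plan or action override is set.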

Subtasks

  • Rename all 11 fields in the AutomationProfile Pydantic model
  • Update all 8 built-in profile definitions with correct field names and values
  • Update YAML config schema for automation profiles
  • Update automation_profile.py CLI output formatting
  • Update AutonomyController / any service that references these fields
  • Update Alembic migration if columns are persisted
  • Tests (Behave): Verify profiles load correctly with new names
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.5.0 milestone 2026-03-13 23:48:40 +00:00
Member

Implementation Notes

Summary

Renamed all 11 task-type confidence threshold fields in the AutomationProfile model from phase-transition semantics to the specification's task-type semantics. All 8 built-in profiles, the CLI formatting, YAML schema, services, Alembic migration, and all Behave/Robot tests were updated to match.

Files Modified (42 files)

  • Core: automation_profile.py, escalation.py, autonomy_controller.py, plan_lifecycle_service.py, CLI commands
  • Infrastructure: models.py (ORM columns), repositories.py (domain mapping)
  • Schema/Config: automation_profile.schema.yaml, all 9 YAML example profiles
  • Alembic: New migration m5_001_rename_automation_profile_fields.py
  • Tests: 15 Behave test files, 5 Robot test files, 3 benchmark files

Design Decisions

  1. Kept original migration intact: a6_001_automation_profiles retains old column names; new m5_001 migration performs the rename using batch_alter_table.
  2. Avoided false positives: Carefully avoided renaming auto_apply in unrelated contexts (ProjectSettings.auto_apply, PlanConfig.auto_apply, MigrationRunner.auto_apply) which use the same name but refer to different boolean settings.
  3. Preserved threshold values: All 8 built-in profiles retain exactly the same numerical threshold values; only the field names changed.
  4. Updated CLI section headers: Changed from phase-transition groupings to unified "Task-Type Confidence Thresholds" section.

Quality Gate Results

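| Stage             | Result                                  |
|-------------------|-----------------------------------------|
| lint              | PASSED                                  |
| typecheck         | PASSED (0 errors)                       |
| unit_tests        | PASSED (462 features, 12,288 scenarios) |
| integration_tests | PASSED (1,699 tests)                    |
| coverage          | 98% (exceeds 97% threshold)             |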

PR

PR #1147: feature/m5-automation-profile-fields → master
Commit: 2d03b809

Member

Applied review-driven fixes for #902 on feature/m5-automation-profile-fields and validated with nox.

What was updated:

  • Hardened AutomationProfile.from_config unknown-field handling so legacy top-level keys are rejected instead of silently defaulting.
  • Added additionalProperties: false to docs/schema/automation_profile.schema.yaml for top-level profile validation consistency.
  • Added BDD regression coverage for legacy threshold keys in automation-profile add config handling.
  • Aligned docs/spec references from legacy auto_* threshold wording to the renamed task-type threshold fields (where relevant to automation profile thresholds).
  • Updated CHANGELOG.md with an Unreleased entry for these fixes.
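The first item above (rejecting legacy keys instead of silently defaulting) can be illustrated with a strict constructor. This is a sketch only: `ThresholdsSketch` is a hypothetical stand-in, not the real AutomationProfile model. It is trimmed to three of the eleven spec fields, and the 1.0 defaults are placeholders rather than values from any built-in profile.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ThresholdsSketch:
    # Three of the eleven spec field names; defaults are placeholder values.
    decompose_task: float = 1.0
    select_tool: float = 1.0
    edit_code: float = 1.0

    @classmethod
    def from_config(cls, config: dict) -> "ThresholdsSketch":
        """Build from a config dict, failing loudly on any unrecognised key."""
        allowed = {f.name for f in fields(cls)}
        unknown = sorted(set(config) - allowed)
        if unknown:
            # A legacy name such as auto_apply lands here rather than being
            # ignored while select_tool quietly keeps its default.
            raise ValueError(
                f"unknown automation profile keys: {', '.join(unknown)}"
            )
        return cls(**config)
```

Under this approach a config that still says `auto_apply: 0.5` fails at load time, which is the behaviour the fix describes; lenient parsing would have left `select_tool` at its default without any warning.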

Validation (per repo constraints):

  • TEST_PROCESSES=9 nox -s lint
  • TEST_PROCESSES=9 nox -s typecheck
  • TEST_PROCESSES=9 nox -s unit_tests
  • TEST_PROCESSES=9 nox -s integration_tests

Notes:

  • Coverage sessions were intentionally not run in this pass.
  • No new commit was created in this step.
Member

Applied a follow-up validation/fix pass for #902 on local branch feature/m5-automation-profile-fields and cross-checked against docs/specification.md.

Validated review findings and outcomes:

  1. Docs schema/runtime parity for guards: valid, fixed.
    • Added top-level guards object schema to docs/schema/automation_profile.schema.yaml (while preserving additionalProperties: false at top level).
  2. Missing regression guard for schema/runtime parity on guards: valid, fixed.
    • Added BDD scenario and step coverage:
      • features/automation_profile_cli_coverage.feature
      • features/steps/automation_profile_cli_coverage_steps.py
    • New scenario validates the same guarded profile config against both docs schema (jsonschema) and runtime model (AutomationProfile.from_config).
  3. Migration header consistency: valid, fixed.
    • Aligned migration docstring header Revises: with actual metadata in:
      • alembic/versions/m5_001_rename_automation_profile_fields.py

Documentation and changelog updates:

  • Updated CHANGELOG.md Unreleased entry for #902 to include these follow-up fixes.

Quality gates run (as requested, no coverage session):

  • TEST_PROCESSES=9 nox -s lint
  • TEST_PROCESSES=9 nox -s typecheck
  • TEST_PROCESSES=9 nox -s unit_tests
  • TEST_PROCESSES=9 nox -s integration_tests

No reviewed items were deferred in this pass; all validated findings were applied without conflicting with issue scope or specification intent.

Member

Post-Review Fix Pass (Review #2)

Applied fixes from the code review on PR #1147, validated against docs/specification.md and issue scope.

Fixes Applied

| ID  | Severity | Fix |
|-----|----------|-----|
| M-1 | Medium | Restored categorised CLI automation-profile show output (Phase Transitions / Decision Automation / Self-Repair / Execution Controls) to match specification output examples. The prior commit had flattened all thresholds into a single "Task-Type Confidence Thresholds" section. |
| M-2 | Medium | Added missing access_network field to specification show output examples (Rich, Plain, JSON, YAML variants). This was pre-existing but within scope since the output sections were already being modified. |
| M-4 | Medium | Added 4 missing fields (create_file, install_dependency, modify_config, approve_plan) to ADR-017 profile fields table, bringing it to all 11 spec-defined fields. |
| M-5 | Medium | Extended repository roundtrip test (repositories_uncovered_lines_steps.py) to assert all 11 threshold fields after upsert (was only asserting 3). |
| H-4 | High | Aligned automation_profiles.md threshold descriptions with the specification's Automatable Tasks table (§28327–28339). |
| H-5 | High | Added spec section references (§ Automatable Tasks) in phase_reversion.md, error_recovery.md, and plan_execute.md to provide field naming context. |

Findings Not Applied (with justification)

| ID       | Severity | Reason |
|----------|----------|--------|
| H-2, H-3 | High | The semantic confusion in ADR-006 and ADR-017 (task-type names describing phase-transition behaviors) is inherent in the specification's design; the spec explicitly maps these names to those behaviors. Per CONTRIBUTING.md "Specification-First Development", the code/docs align with the spec. The ADR descriptions are correct. |
| M-3 | Medium | Schema descriptions already match the spec's "Description" column. Adding runtime behavior to the schema would make it verbose and diverge from the spec format. |
| L-1 | Low | Test-only: legacy validator tests 3/11 names but the dict-lookup detection logic is exercised by any single key. Low risk. |
| L-2 | Low | Test-only: CI profile values assumed but not asserted. Does not affect correctness. |
| L-5 | Low | ADR-007 text matches the spec's description of edit_code. No change needed. |

Quality Gates

  • TEST_PROCESSES=9 nox -s lint — PASS
  • TEST_PROCESSES=9 nox -s typecheck — PASS (0 errors)
  • TEST_PROCESSES=9 nox -s unit_tests — PASS (463 features, 12,293 scenarios)
  • TEST_PROCESSES=9 nox -s integration_tests — PASS (1,699 tests)

Commit

Amended commit 4dbb6ef9 on feature/m5-automation-profile-fields.

Reference
cleveragents/cleveragents-core#902