feat(autonomy): guard enforcement works (denylist, budget caps, tool call limits) #1204

Merged
brent.edwards merged 4 commits from feature/m6-guard-enforcement into master 2026-04-01 01:46:37 +00:00
Member

Summary

  • Enforced autonomy guard behavior for denylist/allowlist checks, budget caps, tool-call limits, and write/apply approval gates.
  • Added scope-aware guard evaluation (plan vs subplan) using GuardScope(StrEnum) for type-safe scope handling.
  • Extracted remediation guidance strings as module-level constants (REMEDIATION_DENYLIST, REMEDIATION_ALLOWLIST, etc.) for consistency and testability.
  • Added Behave coverage to validate guard remediation messaging and subplan-scoped tool-call limit behavior.
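The checks listed above can be pictured with a minimal sketch. It is not the code in automation_guard.py: `GuardPolicy`, `GuardDecision`, and `evaluate_guard` are hypothetical names, and the ordering of checks (denylist, allowlist, tool-call limit, budget cap) is an assumption made for illustration. Approval gates and plan/subplan scoping are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GuardPolicy:
    """Hypothetical policy shape; field names are illustrative only."""
    denylist: frozenset = frozenset()
    allowlist: Optional[frozenset] = None  # None means no allowlist is configured
    max_tool_calls: Optional[int] = None   # None means unlimited
    budget_cap: Optional[float] = None     # None means unlimited


@dataclass(frozen=True)
class GuardDecision:
    allowed: bool
    reason: str = ""


def evaluate_guard(policy: GuardPolicy, tool: str, calls_so_far: int,
                   spent: float, cost: float) -> GuardDecision:
    """Return the first failing check as a deny decision, otherwise allow."""
    if tool in policy.denylist:
        return GuardDecision(False, f"tool '{tool}' is on the denylist")
    if policy.allowlist is not None and tool not in policy.allowlist:
        return GuardDecision(False, f"tool '{tool}' is not on the allowlist")
    if policy.max_tool_calls is not None and calls_so_far >= policy.max_tool_calls:
        return GuardDecision(False, f"tool-call limit of {policy.max_tool_calls} reached")
    if policy.budget_cap is not None and spent + cost > policy.budget_cap:
        return GuardDecision(False, f"budget cap of {policy.budget_cap} would be exceeded")
    return GuardDecision(True)


# Example policy: one denied tool, at most two calls, a budget of 1.0 units.
policy = GuardPolicy(denylist=frozenset({"shell"}), max_tool_calls=2, budget_cap=1.0)
denied = evaluate_guard(policy, "shell", calls_so_far=0, spent=0.0, cost=0.1)
allowed = evaluate_guard(policy, "search", calls_so_far=1, spent=0.5, cost=0.1)
```

Returning a decision object with a reason, rather than raising, keeps the deny text available for the remediation messaging that the Behave scenarios validate.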

Review Fix Round (v2)

Addressed review #2910 by @freemo:

  1. Reverted resource_dag.robot — Unrelated Robot SQLite/cycle-detection changes removed from this commit; will be submitted as a separate PR.
  2. scope parameter → GuardScope enum — Created GuardScope(StrEnum) in automation_guard.py with PLAN and SUBPLAN members. Updated check_guard() signature and all call sites.
  3. Removed redundant scope_label — Now uses scope.value directly.
  4. Extracted remediation constants — Guidance strings moved to module-level constants in automation_guard.py.

Validation

  • nox -s lint — passed
  • nox -s typecheck — passed (0 errors, 0 warnings)
  • nox -s unit_tests — passed (508 features, 12989 scenarios, 0 failures)
  • nox -s coverage_report — passed (97% coverage)
  • Rebased onto latest master (532ea100)

Closes #853

brent.edwards added this to the v3.5.0 milestone 2026-03-29 20:30:42 +00:00
brent.edwards force-pushed feature/m6-guard-enforcement from ca701e2ca7
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / unit_tests (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
CI / typecheck (pull_request) Has been cancelled
CI / quality (pull_request) Has been cancelled
CI / security (pull_request) Has been cancelled
CI / build (pull_request) Has been cancelled
CI / integration_tests (pull_request) Has been cancelled
CI / e2e_tests (pull_request) Has been cancelled
CI / coverage (pull_request) Has been cancelled
CI / benchmark-regression (pull_request) Has been cancelled
CI / docker (pull_request) Has been cancelled
CI / helm (pull_request) Has been cancelled
CI / status-check (pull_request) Has been cancelled
to 47d86c77aa
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / typecheck (pull_request) Successful in 53s
CI / lint (pull_request) Successful in 3m24s
CI / build (pull_request) Successful in 15s
CI / helm (pull_request) Successful in 23s
CI / quality (pull_request) Successful in 3m48s
CI / security (pull_request) Successful in 4m4s
CI / integration_tests (pull_request) Successful in 6m5s
CI / unit_tests (pull_request) Successful in 6m24s
CI / docker (pull_request) Successful in 1m22s
CI / e2e_tests (pull_request) Successful in 8m34s
CI / coverage (pull_request) Successful in 8m40s
CI / status-check (pull_request) Successful in 2s
CI / benchmark-regression (pull_request) Successful in 52m25s
2026-03-29 20:38:24 +00:00
freemo left a comment

Review: REQUEST CHANGES

Issue 1: Unrelated Robot Fix Bundled

The resource_dag.robot changes (replacing create_engine with StaticPool and fixing cycle detection test to use distinct resource types) are unrelated to the guard enforcement feature. Per CONTRIBUTING.md §Atomic Commits: "Do not mix concerns. Never bundle cosmetic changes with functional changes in the same commit."

Please split the Robot SQLite fix into a separate commit.

Issue 2: scope Parameter Should Be an Enum

check_guard() accepts scope as a raw str validated with if scope not in {"plan", "subplan"}. Per CONTRIBUTING.md §Type Safety: "Prefer static typing whenever the language supports it." This should be a proper Enum:

from enum import Enum

class GuardScope(str, Enum):
    PLAN = "plan"
    SUBPLAN = "subplan"

This provides type safety, IDE autocompletion, and prevents typos at call sites.

Minor Notes

  • The scope_label variable is redundant — after validation, it's always equal to scope itself.
  • Remediation guidance strings are hardcoded inline in f-strings. Consider extracting them as constants for consistency and testability.
  • The core guard enforcement logic (denylist, budget caps, tool-call limits) is well-implemented with clear deny reasons.
brent.edwards force-pushed feature/m6-guard-enforcement from 47d86c77aa
to 45c0242ef5
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 20s
CI / helm (pull_request) Successful in 23s
CI / lint (pull_request) Successful in 3m19s
CI / quality (pull_request) Successful in 3m42s
CI / typecheck (pull_request) Successful in 3m55s
CI / security (pull_request) Successful in 4m5s
CI / unit_tests (pull_request) Successful in 4m10s
CI / docker (pull_request) Successful in 1m44s
CI / integration_tests (pull_request) Successful in 6m58s
CI / coverage (pull_request) Successful in 8m57s
CI / e2e_tests (pull_request) Successful in 17m34s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Failing after 38m12s
2026-03-31 03:54:32 +00:00
Author
Member

@freemo — All four items from review #2910 have been addressed. Here's a summary:

Issue 1: Unrelated Robot Fix Bundled

Reverted robot/resource_dag.robot to its master state. The StaticPool migration and distinct resource type changes are no longer in this commit. They'll need a separate PR/issue.

Issue 2: scope Parameter Should Be an Enum

Created GuardScope(StrEnum) in automation_guard.py with PLAN and SUBPLAN members. Updated check_guard() signature to accept GuardScope instead of str, and updated all call sites (service, Behave steps). The runtime if scope not in {...} validation is no longer needed since Pyright enforces the type statically, and StrEnum construction raises ValueError for invalid values.
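To illustrate the point about construction-time validation, here is a minimal sketch rather than the code from automation_guard.py. `GuardScope` mirrors the two members named above; `parse_scope` and `is_valid_scope` are hypothetical helpers, and the fallback import is only there so the sketch also runs on interpreters older than Python 3.11, where `StrEnum` is not in the standard library.

```python
try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:  # fallback so the sketch runs on older interpreters
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class GuardScope(StrEnum):
    PLAN = "plan"
    SUBPLAN = "subplan"


def parse_scope(raw: str) -> GuardScope:
    """Hypothetical boundary helper: convert raw input to GuardScope once."""
    # Enum lookup by value raises ValueError for anything but "plan"/"subplan".
    return GuardScope(raw)


def is_valid_scope(raw: str) -> bool:
    """Report whether a raw string names a known scope."""
    try:
        parse_scope(raw)
    except ValueError:
        return False
    return True
```

With the conversion done once at the edge, code that receives a `GuardScope` can rely on the type checker and does not need a membership test of its own.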

Minor Notes

  • scope_label removed — now uses scope.value directly throughout the guard messages.
  • Remediation constants extracted — Six module-level constants (REMEDIATION_DENYLIST, REMEDIATION_ALLOWLIST, REMEDIATION_TOOL_CALL_LIMIT, REMEDIATION_BUDGET, REMEDIATION_WRITE_APPROVAL, REMEDIATION_APPLY_APPROVAL) in automation_guard.py.
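The testability benefit of the extracted constants can be sketched as follows. Only the constant names come from this PR; the message text, the `deny_reason` formatter, and the bracketed scope prefix are invented for illustration and may not match automation_guard.py.

```python
# Constant names are from the PR; the wording below is hypothetical.
REMEDIATION_DENYLIST = "Remove the tool from the denylist or choose another tool."
REMEDIATION_TOOL_CALL_LIMIT = "Raise the tool-call limit or split the work into smaller steps."


def deny_reason(scope_value: str, detail: str, remediation: str) -> str:
    """Hypothetical formatter: scope value, what failed, then the guidance."""
    return f"[{scope_value}] {detail} {remediation}"


message = deny_reason("subplan", "Tool-call limit of 5 reached.", REMEDIATION_TOOL_CALL_LIMIT)

# A test can compare against the constant instead of copying the sentence,
# so rewording the guidance does not break the assertion.
assert message.endswith(REMEDIATION_TOOL_CALL_LIMIT)
```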

Validation

All quality gates pass. Branch rebased onto latest master (532ea100). PR is now mergeable.

Merge branch 'master' into feature/m6-guard-enforcement
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 20s
CI / helm (pull_request) Successful in 44s
CI / lint (pull_request) Successful in 3m19s
CI / quality (pull_request) Successful in 3m52s
CI / typecheck (pull_request) Successful in 4m1s
CI / security (pull_request) Successful in 4m13s
CI / integration_tests (pull_request) Successful in 7m6s
CI / unit_tests (pull_request) Successful in 7m45s
CI / docker (pull_request) Successful in 1m30s
CI / coverage (pull_request) Successful in 9m52s
CI / e2e_tests (pull_request) Successful in 17m13s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Has been cancelled
1e1eb18327
Merge branch 'master' into feature/m6-guard-enforcement
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 18s
CI / helm (pull_request) Successful in 26s
CI / lint (pull_request) Successful in 3m20s
CI / quality (pull_request) Successful in 3m43s
CI / typecheck (pull_request) Successful in 3m57s
CI / security (pull_request) Successful in 4m7s
CI / unit_tests (pull_request) Successful in 8m59s
CI / docker (pull_request) Successful in 1m21s
CI / coverage (pull_request) Successful in 12m35s
CI / e2e_tests (pull_request) Successful in 17m57s
CI / integration_tests (pull_request) Successful in 24m45s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Has been cancelled
d163f12659
Merge branch 'master' into feature/m6-guard-enforcement
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 15s
CI / helm (pull_request) Successful in 32s
CI / quality (pull_request) Successful in 53s
CI / security (pull_request) Successful in 1m12s
CI / lint (pull_request) Successful in 3m18s
CI / typecheck (pull_request) Successful in 3m55s
CI / unit_tests (pull_request) Successful in 6m5s
CI / docker (pull_request) Successful in 1m22s
CI / coverage (pull_request) Successful in 9m2s
CI / e2e_tests (pull_request) Successful in 20m9s
CI / integration_tests (pull_request) Successful in 24m31s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 55m24s
967c665bf9
brent.edwards scheduled this pull request to auto merge when all checks succeed 2026-04-01 00:42:50 +00:00
brent.edwards deleted branch feature/m6-guard-enforcement 2026-04-01 01:46:38 +00:00
Reference
cleveragents/cleveragents-core!1204