feat(automation): add autonomy guardrails and audit trail #204

Closed
opened 2026-02-22 23:40:06 +00:00 by freemo · 2 comments
Owner

Metadata

  • Commit Message: feat(automation): add autonomy guardrails and audit trail
  • Branch: feature/m6-autonomy-guards

Background

Autonomy guardrails enforce max steps, tool budget, and required confirmations during plan execution. An audit trail persists guardrail enforcement events to plan metadata, providing a complete record of autonomy-related decisions and blocks.

Acceptance Criteria

  • Add autonomy guardrails (max steps, tool budget, required confirmations) and persist audit trail to plan metadata.
  • Update docs/reference/automation_profiles.md with audit trail fields and enforcement behavior.

Definition of Done

This issue is complete when:

  • All subtasks below are completed and checked off.
  • A Git commit is created where the first line of the commit message matches
    the Commit Message in Metadata exactly, followed by a blank line, then
    additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in
    Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and
    merged before this issue is marked done.

Subtasks

  • Add autonomy guardrails (max steps, tool budget, required confirmations) and persist audit trail to plan metadata.
  • Update docs/reference/automation_profiles.md with audit trail fields and enforcement behavior.
  • Tests (Behave): Add scenarios for guardrail enforcement and audit trail recording.
  • Tests (Robot): Add Robot test for autonomy guardrail CLI flags.
  • Tests (ASV): Add benchmarks/autonomy_guardrails_bench.py for enforcement overhead.
  • Verify coverage >=97% via nox -s coverage_report.
  • Run nox (all default sessions, including benchmark), fix any errors.
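
The ASV subtask above asks for a benchmark of enforcement overhead. The following is a minimal sketch of what such a suite might look like, not the contents of `benchmarks/autonomy_guardrails_bench.py`. It relies only on ASV's documented conventions (`params`, `param_names`, `setup`, and `time_`-prefixed methods). The `check_step_limit` function here is a hypothetical stand-in so the block runs on its own; the real suite would presumably import the project's service instead.

```python
# Hypothetical stand-in for the real guardrail check, so this sketch is
# self-contained. The actual benchmark would import the project's service.
def check_step_limit(current_step: int, max_steps: int) -> bool:
    """Return True while the step count is within the configured limit."""
    return current_step <= max_steps


class GuardrailCheckSuite:
    """ASV times every method whose name starts with ``time_``."""

    # ASV runs each benchmark once per parameter value.
    params = [10, 1_000]
    param_names = ["n_checks"]

    def setup(self, n_checks: int) -> None:
        # Build inputs outside the timed region so only the checks are measured.
        self.max_steps = n_checks
        self.steps = list(range(1, n_checks + 1))

    def time_check_step_limit(self, n_checks: int) -> None:
        # Timed region: one guardrail check per simulated plan step.
        for step in self.steps:
            check_step_limit(step, self.max_steps)
```

Parameterizing on the number of checks makes it easier to see whether overhead grows linearly with plan length.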

Section: #### M6: Autonomy Hardening + Server Stubs (Day 30)
Status: In Review (PR #443)

freemo added this to the v3.5.0 milestone 2026-02-22 23:40:06 +00:00
Author
Owner

Expected completion updated (Day 15 rebaseline): Day 35 / 2026-03-15 (previously Day 30 / 2026-03-10)

freemo added the due date 2026-03-09 2026-02-23 18:41:40 +00:00
Member

Implementation Notes

Design Decisions

  1. Domain model design: Created AutonomyGuardrails model with max_steps, tool_budget, and required_confirmations fields. GuardrailAuditEntry captures each enforcement event with timestamp, event type, result, and context. GuardrailAuditTrail aggregates entries.

  2. Service architecture: AutonomyGuardrailService provides three main enforcement checks:

    • check_step_limit() — Validates current step count against max_steps threshold
    • check_tool_budget() — Validates cumulative tool cost against budget cap
    • check_confirmation_required() — Checks if operation requires human confirmation
  3. Audit trail persistence: Guardrail enforcement events are persisted to plan metadata JSON, providing a complete record of all autonomy-related decisions and blocks.

  4. Integration pattern: Service is designed to be called before each tool invocation in the execution pipeline.
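
The design decisions above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `autonomy_guardrails.py` or `autonomy_guardrail_service.py`: the class, field, and method names come from the notes, but every signature, the result vocabulary (`allowed`, `blocked`, `confirmation_required`), the `guardrail_audit_trail` metadata key, and the `persist` and `run_demo` helpers are hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AutonomyGuardrails:
    max_steps: int
    tool_budget: float
    # Assumed shape: names of operations that need human sign-off.
    required_confirmations: frozenset[str] = frozenset()


@dataclass
class GuardrailAuditEntry:
    timestamp: str
    event_type: str
    result: str  # assumed vocabulary, see lead-in
    context: dict


@dataclass
class GuardrailAuditTrail:
    entries: list[GuardrailAuditEntry] = field(default_factory=list)


class AutonomyGuardrailService:
    def __init__(self, guardrails: AutonomyGuardrails) -> None:
        self.guardrails = guardrails
        self.trail = GuardrailAuditTrail()

    def _record(self, event_type: str, result: str, **context) -> None:
        # Every check appends one entry, whether it allowed or blocked.
        now = datetime.now(timezone.utc).isoformat()
        self.trail.entries.append(
            GuardrailAuditEntry(now, event_type, result, context)
        )

    def check_step_limit(self, current_step: int) -> bool:
        ok = current_step <= self.guardrails.max_steps
        self._record("step_limit", "allowed" if ok else "blocked",
                     current_step=current_step)
        return ok

    def check_tool_budget(self, cumulative_cost: float) -> bool:
        ok = cumulative_cost <= self.guardrails.tool_budget
        self._record("tool_budget", "allowed" if ok else "blocked",
                     cumulative_cost=cumulative_cost)
        return ok

    def check_confirmation_required(self, operation: str) -> bool:
        required = operation in self.guardrails.required_confirmations
        self._record("confirmation",
                     "confirmation_required" if required else "allowed",
                     operation=operation)
        return required

    def persist(self, plan_metadata: dict) -> dict:
        # Hypothetical key; the real metadata layout is not shown in the notes.
        plan_metadata["guardrail_audit_trail"] = [
            asdict(entry) for entry in self.trail.entries
        ]
        return plan_metadata


def run_demo() -> dict:
    """Run the checks before each (simulated) tool invocation."""
    service = AutonomyGuardrailService(AutonomyGuardrails(
        max_steps=2, tool_budget=1.0,
        required_confirmations=frozenset({"delete_file"}),
    ))
    spent = 0.0
    calls = [("read_file", 0.4), ("delete_file", 0.4), ("read_file", 0.4)]
    for step, (tool, cost) in enumerate(calls, start=1):
        if not service.check_step_limit(step):
            break  # step limit reached: stop the plan
        if not service.check_tool_budget(spent + cost):
            break  # budget would be exceeded: stop the plan
        if service.check_confirmation_required(tool):
            continue  # a real pipeline would pause for a human here
        spent += cost  # the tool would be invoked at this point
    return service.persist({})


if __name__ == "__main__":
    print(json.dumps(run_demo(), indent=2))
```

Recording an entry for allowed checks as well as blocked ones is one way to read "a complete record of all autonomy-related decisions and blocks"; a trail that logs only blocks would be smaller but would not show what was permitted.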

Key Code Locations

  • src/cleveragents/domain/models/core/autonomy_guardrails.py — Domain models (245 lines)
  • src/cleveragents/application/services/autonomy_guardrail_service.py — Service layer (303 lines)
  • features/autonomy_guardrails.feature — 35 BDD scenarios (199 lines)
  • robot/autonomy_guardrails.robot — 8 Robot integration tests
  • benchmarks/autonomy_guardrails_bench.py — 6 ASV benchmark suites
  • docs/reference/automation_profiles.md — Updated with guardrail documentation

Test Results

  • Coverage: 97.2% (threshold: 97%)
  • All nox sessions pass (lint, format, typecheck, security_scan, dead_code, unit_tests, integration_tests, docs, build, benchmark, coverage_report)
  • Pyright: 0 errors, 0 warnings

Commit

Branch: feature/m6-autonomy-guards
Commit: 8a6f07d feat(automation): add autonomy guardrails and audit trail
PR: #443

Blocks
#397 Epic: Server & Autonomy Infrastructure
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#204