feat(security): add safety profile enforcement #518

Merged
CoreRasurae merged 1 commit from feature/m7-post-safety into master 2026-03-03 22:29:29 +00:00
Member

Summary

Implements safety profile resolution and enforcement in the tool execution pipeline, replacing the NotImplementedError stub with working precedence logic and runtime safety checks.

Closes #345

Changes

Core Implementation

  • resolve_safety_profile() — plan > action > project > global precedence; returns DEFAULT_SAFETY_PROFILE with GLOBAL provenance when all levels are None
  • ToolExecutionContext.safety_profile — optional field carrying the resolved profile
  • ToolRuntime._enforce_capabilities() — extended with:
    • Unsafe tool gating (allow_unsafe_tools vs ToolCapability.unsafe)
    • Skill category allow-list (allowed_skill_categories vs tool_skill_category metadata)
    • Checkpoint requirement OR-combined from context flag and safety profile
  • ToolSafetyViolationError — new error class in the tool error hierarchy

Files Modified

File Change
src/cleveragents/domain/models/core/safety_profile.py Replace NotImplementedError stub with resolution logic
src/cleveragents/tool/context.py Add safety_profile field to ToolExecutionContext
src/cleveragents/tool/lifecycle.py Add ToolSafetyViolationError, extend _enforce_capabilities()
src/cleveragents/tool/__init__.py Export ToolSafetyViolationError

Files Added

File Purpose
features/safety_profile_enforcement.feature 24 BDD scenarios for enforcement
features/steps/safety_profile_enforcement_steps.py Step definitions
robot/safety_profile_enforcement.robot 9 Robot integration tests
robot/helper_safety_profile_enforcement.py Robot helper
benchmarks/safety_profile_bench.py 4 ASV benchmark suites
docs/reference/safety_profiles.md Reference documentation

Test Updates

  • features/safety_profile.feature — replaced NotImplementedError stub test with 5 resolution precedence scenarios (30 total)
  • features/steps/safety_profile_steps.py — added resolution step definitions

Verification

Check Result
nox -e typecheck 0 errors, 0 warnings
nox -e unit_tests 7735 scenarios, 0 failures
nox -e coverage_report 97% overall
nox -e integration_tests 9/9 passed
nox -e benchmark All suites complete

Backward Compatibility

When no SafetyProfile is set on ToolExecutionContext (the default), the new enforcement checks are skipped entirely. All existing tool execution paths are unaffected.

## Summary Implements safety profile resolution and enforcement in the tool execution pipeline, replacing the `NotImplementedError` stub with working precedence logic and runtime safety checks. Closes #345 ## Changes ### Core Implementation - **`resolve_safety_profile()`** — plan > action > project > global precedence; returns `DEFAULT_SAFETY_PROFILE` with GLOBAL provenance when all levels are None - **`ToolExecutionContext.safety_profile`** — optional field carrying the resolved profile - **`ToolRuntime._enforce_capabilities()`** — extended with: - Unsafe tool gating (`allow_unsafe_tools` vs `ToolCapability.unsafe`) - Skill category allow-list (`allowed_skill_categories` vs `tool_skill_category` metadata) - Checkpoint requirement OR-combined from context flag and safety profile - **`ToolSafetyViolationError`** — new error class in the tool error hierarchy ### Files Modified | File | Change | |------|--------| | `src/cleveragents/domain/models/core/safety_profile.py` | Replace `NotImplementedError` stub with resolution logic | | `src/cleveragents/tool/context.py` | Add `safety_profile` field to `ToolExecutionContext` | | `src/cleveragents/tool/lifecycle.py` | Add `ToolSafetyViolationError`, extend `_enforce_capabilities()` | | `src/cleveragents/tool/__init__.py` | Export `ToolSafetyViolationError` | ### Files Added | File | Purpose | |------|---------| | `features/safety_profile_enforcement.feature` | 24 BDD scenarios for enforcement | | `features/steps/safety_profile_enforcement_steps.py` | Step definitions | | `robot/safety_profile_enforcement.robot` | 9 Robot integration tests | | `robot/helper_safety_profile_enforcement.py` | Robot helper | | `benchmarks/safety_profile_bench.py` | 4 ASV benchmark suites | | `docs/reference/safety_profiles.md` | Reference documentation | ### Test Updates - `features/safety_profile.feature` — replaced `NotImplementedError` stub test with 5 resolution precedence scenarios (30 total) - `features/steps/safety_profile_steps.py` — added resolution step definitions ## Verification | Check | Result | |-------|--------| | `nox -e typecheck` | 0 errors, 0 warnings | | `nox -e unit_tests` | 7735 scenarios, 0 failures | | `nox -e coverage_report` | 97% overall | | `nox -e integration_tests` | 9/9 passed | | `nox -e benchmark` | All suites complete | ## Backward Compatibility When no `SafetyProfile` is set on `ToolExecutionContext` (the default), the new enforcement checks are skipped entirely. All existing tool execution paths are unaffected.
CoreRasurae force-pushed feature/m7-post-safety from e09201af29
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Failing after 14s
CI / build (pull_request) Successful in 14s
CI / quality (pull_request) Successful in 17s
CI / security (pull_request) Successful in 31s
CI / typecheck (pull_request) Successful in 39s
CI / coverage (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has been skipped
CI / unit_tests (pull_request) Successful in 2m3s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 2m50s
to 5e247d05e2
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 17s
CI / security (pull_request) Successful in 32s
CI / typecheck (pull_request) Successful in 33s
CI / unit_tests (pull_request) Successful in 1m51s
CI / docker (pull_request) Successful in 40s
CI / integration_tests (pull_request) Successful in 2m48s
CI / coverage (pull_request) Successful in 3m39s
CI / benchmark-regression (pull_request) Successful in 24m15s
2026-03-03 17:12:44 +00:00
Compare
brent.edwards approved these changes 2026-03-03 19:18:24 +00:00
Dismissed
brent.edwards left a comment

Review — PR #518 feat(security): add safety profile enforcement

Verdict: APPROVED with comments

The implementation is well-structured. resolve_safety_profile() implements clean plan > action > project > global precedence with keyword-only args and tuple return. The _enforce_capabilities() extension adds 8 ordered checks (read-only, checkpoint, unsafe gating, skill category, sandbox, human approval, cost limits, retry limits) with a clear error hierarchy (ToolSafetyViolationError, ToolSandboxRequiredError, ToolHumanApprovalRequiredError, ToolCostLimitExceededError, ToolRetryLimitExceededError — all inheriting from ToolRuntimeError). The SafetyProfile model is frozen/immutable with field and cross-field validators. New types are properly re-exported in tool/__init__.py. The ToolExecutionContext.safety_profile field is optional (None = no enforcement), preserving backward compatibility. BDD scenarios are comprehensive with stub tool instances.

No P0 or P1 findings.


P2:should-fix

  1. Missing PR label — Per CONTRIBUTING.md, every PR must carry exactly one Type/ label. This PR has no labels. The linked issue #345 carries Type/Feature.

  2. Missing PR milestone — Per CONTRIBUTING.md, the PR milestone must match the linked issue. This PR has no milestone set. Issue #345 is assigned to milestone v3.6.0.

  3. Commit footer uses Refs: #345 instead of a closing keyword — Per CONTRIBUTING.md, every commit must reference the issue with a closing keyword (e.g. ISSUES CLOSED: #345). The PR body correctly says Closes #345, but the commit footer only has Refs: #345, which will not auto-close the issue on merge.

  4. Scenario count mismatch in PR body — The PR description claims "11 BDD scenarios" in safety_profile_enforcement.feature, but the file actually contains 24 scenarios (verified by counting Scenario: / Scenario Outline: entries plus their Examples expansions). Please update the PR body to reflect the correct count.

P3:nit

  1. Nox flag typo in verification table — The PR body's verification table uses nox -e (not a valid nox flag) instead of nox -s in all entries.

P2 items should be addressed in a follow-up within 3 business days.

## Review — PR #518 `feat(security): add safety profile enforcement` **Verdict: APPROVED with comments** The implementation is well-structured. `resolve_safety_profile()` implements clean `plan > action > project > global` precedence with keyword-only args and tuple return. The `_enforce_capabilities()` extension adds 8 ordered checks (read-only, checkpoint, unsafe gating, skill category, sandbox, human approval, cost limits, retry limits) with a clear error hierarchy (`ToolSafetyViolationError`, `ToolSandboxRequiredError`, `ToolHumanApprovalRequiredError`, `ToolCostLimitExceededError`, `ToolRetryLimitExceededError` — all inheriting from `ToolRuntimeError`). The `SafetyProfile` model is frozen/immutable with field and cross-field validators. New types are properly re-exported in `tool/__init__.py`. The `ToolExecutionContext.safety_profile` field is optional (`None` = no enforcement), preserving backward compatibility. BDD scenarios are comprehensive with stub tool instances. No P0 or P1 findings. --- ### P2:should-fix 1. **Missing PR label** — Per `CONTRIBUTING.md`, every PR must carry exactly one `Type/` label. This PR has no labels. The linked issue #345 carries `Type/Feature`. 2. **Missing PR milestone** — Per `CONTRIBUTING.md`, the PR milestone must match the linked issue. This PR has no milestone set. Issue #345 is assigned to milestone `v3.6.0`. 3. **Commit footer uses `Refs: #345` instead of a closing keyword** — Per `CONTRIBUTING.md`, every commit must reference the issue with a closing keyword (e.g. `ISSUES CLOSED: #345`). The PR body correctly says `Closes #345`, but the commit footer only has `Refs: #345`, which will not auto-close the issue on merge. 4. **Scenario count mismatch in PR body** — The PR description claims "11 BDD scenarios" in `safety_profile_enforcement.feature`, but the file actually contains **24 scenarios** (verified by counting `Scenario:` / `Scenario Outline:` entries plus their `Examples` expansions). Please update the PR body to reflect the correct count. ### P3:nit 5. **Nox flag typo in verification table** — The PR body's verification table uses `nox -e` (not a valid nox flag) instead of `nox -s` in all entries. --- P2 items should be addressed in a follow-up within 3 business days.
CoreRasurae added this to the v3.6.0 milestone 2026-03-03 20:49:28 +00:00
CoreRasurae force-pushed feature/m7-post-safety from 5e247d05e2
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 17s
CI / security (pull_request) Successful in 32s
CI / typecheck (pull_request) Successful in 33s
CI / unit_tests (pull_request) Successful in 1m51s
CI / docker (pull_request) Successful in 40s
CI / integration_tests (pull_request) Successful in 2m48s
CI / coverage (pull_request) Successful in 3m39s
CI / benchmark-regression (pull_request) Successful in 24m15s
to 57a99f2521
All checks were successful
CI / lint (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 37s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 35s
CI / build (pull_request) Successful in 15s
CI / security (pull_request) Successful in 1m9s
CI / unit_tests (pull_request) Successful in 4m16s
CI / docker (pull_request) Successful in 39s
CI / coverage (pull_request) Successful in 3m55s
CI / integration_tests (pull_request) Successful in 4m58s
CI / benchmark-regression (pull_request) Successful in 24m48s
2026-03-03 20:54:15 +00:00
Compare
CoreRasurae force-pushed feature/m7-post-safety from 57a99f2521
All checks were successful
CI / lint (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 37s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 35s
CI / build (pull_request) Successful in 15s
CI / security (pull_request) Successful in 1m9s
CI / unit_tests (pull_request) Successful in 4m16s
CI / docker (pull_request) Successful in 39s
CI / coverage (pull_request) Successful in 3m55s
CI / integration_tests (pull_request) Successful in 4m58s
CI / benchmark-regression (pull_request) Successful in 24m48s
to 809fcd223b
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 26s
CI / typecheck (pull_request) Successful in 33s
CI / security (pull_request) Successful in 33s
CI / build (pull_request) Successful in 39s
CI / unit_tests (pull_request) Successful in 3m48s
CI / integration_tests (pull_request) Successful in 4m24s
CI / docker (pull_request) Successful in 1m0s
CI / coverage (pull_request) Successful in 3m42s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-03 22:04:11 +00:00
Compare
brent.edwards left a comment

Approved. No notes.

Approved. No notes.
CoreRasurae force-pushed feature/m7-post-safety from 809fcd223b
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 26s
CI / typecheck (pull_request) Successful in 33s
CI / security (pull_request) Successful in 33s
CI / build (pull_request) Successful in 39s
CI / unit_tests (pull_request) Successful in 3m48s
CI / integration_tests (pull_request) Successful in 4m24s
CI / docker (pull_request) Successful in 1m0s
CI / coverage (pull_request) Successful in 3m42s
CI / benchmark-regression (pull_request) Has been cancelled
to b4b96d213c
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 29s
CI / security (pull_request) Successful in 34s
CI / typecheck (pull_request) Successful in 39s
CI / unit_tests (pull_request) Successful in 2m19s
CI / docker (pull_request) Successful in 40s
CI / integration_tests (pull_request) Successful in 3m1s
CI / coverage (pull_request) Successful in 7m20s
CI / lint (push) Successful in 13s
CI / quality (push) Successful in 16s
CI / build (push) Successful in 18s
CI / security (push) Successful in 31s
CI / typecheck (push) Successful in 35s
CI / benchmark-regression (push) Has been skipped
CI / unit_tests (push) Successful in 2m10s
CI / docker (push) Successful in 39s
CI / integration_tests (push) Successful in 3m4s
CI / coverage (push) Successful in 4m45s
CI / benchmark-publish (push) Successful in 14m51s
CI / benchmark-regression (pull_request) Successful in 26m49s
2026-03-03 22:21:27 +00:00
Compare
CoreRasurae scheduled this pull request to auto merge when all checks succeed 2026-03-03 22:22:18 +00:00
CoreRasurae deleted branch feature/m7-post-safety 2026-03-03 22:29:29 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!518
No description provided.