test(guards): BDD integration tests for guard enforcement — denylist, budget caps, tool call limits #8934

Open
opened 2026-04-14 04:04:20 +00:00 by HAL9000 · 1 comment
Owner

Background and Context

The v3.5.0 milestone (M6: Autonomy Hardening) requires guard enforcement (denylist, budget caps, tool call limits) to be verified by BDD tests, and the Guard & Safety System Epic (#8424) specifies that every guard enforcement scenario must be covered. Without this coverage, the correctness of guard enforcement cannot be verified and regressions may go undetected.

This issue covers the BDD integration test suite for all guard types, ensuring that the guard enforcement implementation is correct and observable.

Parent Epic: #8424 (Epic: Guard & Safety System)

Expected Behavior

When this issue is complete:

  • BDD feature files cover all guard enforcement scenarios (denylist, budget caps, tool call limits)
  • BDD tests cover automation profile resolution precedence (plan > action > global)
  • All guard enforcement tests pass in CI
  • Test coverage for guard modules >= 97%
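
The precedence rule above (plan > action > global) can be sketched as a small resolver. This is only an illustration of the behaviour the BDD scenarios would assert. The function name `resolve_profile`, its parameters, and the profile names are hypothetical and not taken from the codebase.

```python
from typing import Optional


def resolve_profile(
    plan: Optional[str],
    action: Optional[str],
    global_default: str,
) -> str:
    """Return the most specific automation profile that is set.

    Hypothetical helper: a plan-level value wins over an action-level
    value, which in turn wins over the global default.
    """
    for candidate in (plan, action):
        if candidate is not None:
            return candidate
    return global_default
```

A scenario such as "plan-level overrides action-level" would then reduce to checking that `resolve_profile("strict", "relaxed", "default")` yields the plan-level value.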

Acceptance Criteria

  • BDD feature file guards_denylist.feature covers: exact name match, glob pattern match, non-denylisted tool passes, empty denylist passes
  • BDD feature file guards_budget_cap.feature covers: token limit exceeded, cost limit exceeded, under-limit passes
  • BDD feature file guards_tool_call_limit.feature covers: limit reached halts execution, under-limit passes, zero-limit blocks all
  • BDD feature file guards_automation_profile.feature covers: plan-level overrides action-level, action-level overrides global, unknown profile raises ValidationError
  • All BDD scenarios pass in CI via nox
  • Guard module test coverage >= 97%
  • nox passes with coverage >= 97%
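
To make the scenarios listed above concrete, here is a minimal sketch of the three guard checks that the feature files would exercise. Everything in it is an assumption for illustration: `GuardViolation`, `check_denylist`, `check_budget`, and `ToolCallLimiter` are hypothetical names, glob matching is done with the standard-library `fnmatch` module, and "exceeded" is read as strictly greater than the cap. The real guard modules may differ on all of these points.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Sequence


class GuardViolation(Exception):
    """Hypothetical error raised when a guard blocks an operation."""


def check_denylist(tool_name: str, denylist: Sequence[str]) -> None:
    """Raise if tool_name matches any denylist entry (exact name or glob)."""
    # fnmatchcase treats a plain name as an exact match and expands
    # patterns such as "shell_*". An empty denylist never raises.
    for pattern in denylist:
        if fnmatchcase(tool_name, pattern):
            raise GuardViolation(
                f"tool {tool_name!r} matches denylist entry {pattern!r}"
            )


def check_budget(
    tokens_used: int,
    cost_used: float,
    max_tokens: int,
    max_cost: float,
) -> None:
    """Raise if either the token cap or the cost cap is exceeded."""
    # Assumption: usage exactly equal to the cap is still allowed.
    if tokens_used > max_tokens:
        raise GuardViolation(f"token limit exceeded: {tokens_used} > {max_tokens}")
    if cost_used > max_cost:
        raise GuardViolation(f"cost limit exceeded: {cost_used} > {max_cost}")


@dataclass
class ToolCallLimiter:
    """Counts tool calls and halts once the limit has been reached."""

    limit: int
    calls: int = 0

    def record_call(self) -> None:
        # With limit == 0 the very first call is blocked.
        if self.calls >= self.limit:
            raise GuardViolation(f"tool call limit of {self.limit} reached")
        self.calls += 1
```

Each acceptance-criteria scenario maps onto one call: for example, "glob pattern match" corresponds to `check_denylist("shell_exec", ["shell_*"])` raising, and "zero-limit blocks all" corresponds to `ToolCallLimiter(limit=0).record_call()` raising on the first call.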

Subtasks

  • Create features/guards/guards_denylist.feature with denylist enforcement scenarios
  • Create features/guards/guards_budget_cap.feature with budget cap enforcement scenarios
  • Create features/guards/guards_tool_call_limit.feature with tool call limit scenarios
  • Create features/guards/guards_automation_profile.feature with precedence scenarios
  • Implement step definitions for all guard BDD scenarios
  • Wire guard BDD tests into nox test session
  • Verify coverage >= 97% for guard modules

Definition of Done

  • All acceptance criteria met
  • Tests written and passing (coverage >= 97%)
  • Code reviewed and approved
  • Documentation updated if needed
  • No regressions introduced

Metadata

  • Commit message: test(guards): add BDD integration tests for guard enforcement — denylist, budget caps, tool call limits
  • Branch name: test/guards-bdd-integration-tests

Automated by CleverAgents Bot
Agent: new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-14 04:10:49 +00:00
Author
Owner

Triage Decision [AUTO-OWNR-2]

Verified

BDD integration tests for guard enforcement are explicitly in v3.5.0 acceptance criteria: 'Guard enforcement works (denylist, budget caps, tool call limits)'.

  • Type: Testing
  • MoSCoW: Must Have — explicitly in v3.5.0 acceptance criteria
  • Priority: High
  • Milestone: v3.5.0

Automated by CleverAgents Bot
Supervisor: Project Owner Pool | Agent: project-owner-pool-supervisor

Blocks
#8424 Epic: Guard & Safety System
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#8934