UAT: Autonomy guardrail cost budgets are never enforced during plan execution #7972

Open
opened 2026-04-12 16:16:05 +00:00 by HAL9000 · 0 comments
Owner

Summary

  • Autonomy guardrails expose check_tool_budget, check_budget_hierarchy, and record_plan_cost_to_session, and the service wires in CostBudgetService for session/org limits.
  • PlanExecutor and RuntimeExecuteActor never call these APIs, so tool execution never consults plan/session/org budgets.
  • As a result, plans blow past configured cost limits with no warning or block, violating the specification’s “Cost Controls” requirements.

Steps to reproduce

  1. Configure guardrails for a plan (e.g. AutonomyGuardrails(tool_budget=1.0) or by wiring CostBudgetService + associate_plan_with_session).
  2. Execute the plan via PlanExecutor/RuntimeExecuteActor while issuing tool calls that should exceed the budget.
  3. Observe that execution proceeds without any denial or budget events.
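
The symptom in the steps above can be modeled in isolation. This is only an illustrative sketch: `StubGuardrails` and `StubExecutor` are hypothetical stand-ins, not the real `AutonomyGuardrails` or `PlanExecutor`, and the per-call costs are made-up values chosen to exceed a budget of 1.0.

```python
from dataclasses import dataclass, field


@dataclass
class StubGuardrails:
    """Hypothetical stand-in for AutonomyGuardrails; only tool_budget is modeled."""
    tool_budget: float


@dataclass
class StubExecutor:
    """Hypothetical stand-in that mirrors the reported behavior: no cost checks."""
    guardrails: StubGuardrails
    spent: float = 0.0
    events: list = field(default_factory=list)

    def run(self, tool_costs):
        for cost in tool_costs:
            # The budget is never consulted here, so nothing is denied
            # and no budget event is emitted.
            self.spent += cost


executor = StubExecutor(StubGuardrails(tool_budget=1.0))
executor.run([0.6, 0.6, 0.6])  # three calls against a 1.0 budget

overspent = executor.spent > executor.guardrails.tool_budget
print(overspent, executor.events)
```

In this model the run finishes over budget with an empty event list, which is the observation described in step 3.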

Expected result

  • Before or during each tool invocation, the executor should call into AutonomyGuardrailService.check_tool_budget / .check_budget_hierarchy so that plans exceeding their budget are blocked and audit entries are recorded.
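
A minimal sketch of the expected pre-invocation check, under stated assumptions: `SketchGuardrailService`, `BudgetDecision`, `record_cost`, and `execute_plan` are hypothetical, and the real `check_tool_budget` signature and return type may differ. The point is only the ordering: check first, run the tool only if allowed, and record an audit entry on denial.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetDecision:
    """Hypothetical result type for a budget check."""
    allowed: bool
    reason: str = ""


@dataclass
class SketchGuardrailService:
    """Hypothetical stand-in; not the real AutonomyGuardrailService."""
    tool_budget: float
    spent: float = 0.0
    audit_log: list = field(default_factory=list)

    def check_tool_budget(self, plan_id: str, estimated_cost: float) -> BudgetDecision:
        projected = self.spent + estimated_cost
        if projected > self.tool_budget:
            reason = f"{plan_id}: {projected:.2f} would exceed {self.tool_budget:.2f}"
            self.audit_log.append(("budget_denied", plan_id, reason))
            return BudgetDecision(False, reason)
        return BudgetDecision(True)

    def record_cost(self, cost: float) -> None:
        self.spent += cost


def execute_plan(service, plan_id, tool_calls):
    """Run (name, cost) tool calls in order, stopping at the first denial."""
    completed = []
    for name, cost in tool_calls:
        decision = service.check_tool_budget(plan_id, cost)
        if not decision.allowed:
            break  # blocked before the tool runs
        completed.append(name)  # placeholder for the actual tool invocation
        service.record_cost(cost)
    return completed


service = SketchGuardrailService(tool_budget=1.0)
done = execute_plan(
    service, "plan-1", [("search", 0.5), ("summarize", 0.75), ("publish", 0.1)]
)
print(done, len(service.audit_log))
```

Here the second call is denied because it would bring spend to 1.25, so only the first tool runs and one audit entry is written.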

Actual result

  • _enforce_guardrails (src/cleveragents/application/services/plan_executor.py, lines ~800-818) only checks wall-clock time and step counts. No cost checks are performed, and RuntimeExecuteActor.execute simply loops over decisions without touching the guardrail service. The budget-specific methods in AutonomyGuardrailService are unused.

Technical notes

  • AutonomyGuardrailService.check_tool_budget and .check_budget_hierarchy are defined around lines 154-210 and 529-586, respectively, but are never referenced outside tests/benchmarks.
  • PlanExecutor is the natural integration point: after resolving each decision/tool, we should consult the guardrail service and update budgets via record_plan_cost_to_session.
  • Until that integration exists, cost budgets configured through guardrails or the cost budget service have zero effect in production.
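
One way to picture the plan/session/org integration is a chain of budget levels that is checked before a tool call and charged after it. This is a sketch under assumptions, not the service's actual design: `BudgetLevel`, `first_exceeded_level`, and `charge` are hypothetical names, and the limits are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetLevel:
    """One level (plan, session, or org) in a hypothetical budget chain."""
    name: str
    limit: float
    spent: float = 0.0
    parent: Optional["BudgetLevel"] = None


def first_exceeded_level(level: BudgetLevel, cost: float) -> Optional[str]:
    """Return the name of the first level the cost would exceed, else None."""
    node = level
    while node is not None:
        if node.spent + cost > node.limit:
            return node.name
        node = node.parent
    return None


def charge(level: BudgetLevel, cost: float) -> None:
    """Record the cost on the level and on every ancestor."""
    node = level
    while node is not None:
        node.spent += cost
        node = node.parent


org = BudgetLevel("org", limit=10.0)
session = BudgetLevel("session", limit=2.0, parent=org)
plan = BudgetLevel("plan", limit=5.0, parent=session)

blocked_by = None
for cost in [1.0, 1.0, 1.0]:
    blocked_by = first_exceeded_level(plan, cost)
    if blocked_by is not None:
        break  # deny the tool call before it runs
    charge(plan, cost)

print(blocked_by, plan.spent, org.spent)
```

In this example the plan's own limit still has headroom, but the third call is blocked by the tighter session limit, which is the case a plan-only check would miss.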

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

Reference
cleveragents/cleveragents-core#7972