UAT: plan_executor.py never calls check_budget_hierarchy() — session/org budget enforcement not integrated into plan execution loop #5582

Open
opened 2026-04-09 07:40:14 +00:00 by HAL9000 · 1 comment
Owner

Bug Report

Feature Area: cost-session-budgets-safety-profiles
Severity: Critical — v3.6.0 deliverable #14 ("Cost and session budget enforcement") is not functional


What Was Tested

Code-level analysis of plan_executor.py and AutonomyGuardrailService.check_budget_hierarchy() integration.

Expected Behavior (from spec)

Per docs/specification.md v3.6.0 deliverable #14:

"Cost and session budget enforcement — Plan exceeding max_cost_per_plan or max_total_cost is blocked"

Per docs/reference/cost_controls.md:

"Block | 100% of limit | Block further provider calls, persist exhaustion event"

The CostBudgetService and AutonomyGuardrailService.check_budget_hierarchy() exist specifically to enforce per-session and per-org spending limits during plan execution.

Actual Behavior (from code analysis)

src/cleveragents/application/services/plan_executor.py never calls check_budget_hierarchy(). The _enforce_guardrails() and _enforce_guardrails_per_step() methods only check:

  • Wall-clock time limits (check_wall_clock)
  • Step limits (check_step_limit)

Neither method calls AutonomyGuardrailService.check_budget_hierarchy() or CostBudgetService.check_budget_hierarchy(). This means:

  1. Session budget limits are never enforced during plan execution — a plan can run indefinitely past the session's max_cost_usd cap
  2. Org budget limits are never enforced during plan execution — org-wide spending limits are tracked but never block execution
  3. The three-tier hierarchy (plan → session → org) is only partially implemented — the plan-level tool_budget in AutonomyGuardrails is checked via check_tool_budget(), but session and org tiers are not
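To make the three-tier behavior concrete, here is a minimal sketch of what a hierarchy check conceptually does: walk plan, then session, then org, and report the first tier that has reached 100% of its limit. `BudgetTier` and `first_exhausted_tier` are hypothetical names for illustration only; the real `check_budget_hierarchy()` signature and the project's budget models are not shown in this report and may differ (for example, the plan tier here is modeled as a cost cap, whereas the report describes it as a `tool_budget`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetTier:
    """One tier of the hierarchy (hypothetical shape, not the project's model)."""

    name: str  # "plan", "session", or "org"
    spent_usd: float
    limit_usd: Optional[float]  # None means no cap is configured for this tier


def first_exhausted_tier(tiers: list[BudgetTier]) -> Optional[str]:
    """Return the name of the first tier at or above 100% of its limit, else None."""
    for tier in tiers:
        if tier.limit_usd is not None and tier.spent_usd >= tier.limit_usd:
            return tier.name
    return None


# Tiers are ordered plan -> session -> org, matching the hierarchy in this report.
tiers = [
    BudgetTier("plan", spent_usd=0.40, limit_usd=5.00),
    BudgetTier("session", spent_usd=1.20, limit_usd=1.00),
    BudgetTier("org", spent_usd=80.00, limit_usd=500.00),
]

print(first_exhausted_tier(tiers))  # session
```

In this example the plan tier alone would allow execution to continue, which mirrors the reported gap: only a check that also covers the session and org tiers would block the run.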

Code Locations

  • src/cleveragents/application/services/plan_executor.py: lines 800-837 (_enforce_guardrails, _enforce_guardrails_per_step)
  • src/cleveragents/application/services/autonomy_guardrail_service.py: check_budget_hierarchy() method exists but is never called from plan executor
  • src/cleveragents/application/services/cost_budget_service.py: check_budget_hierarchy() method exists but is never called from plan executor

Steps to Reproduce

  1. Configure a session with max_cost_usd=1.0 via CostBudgetService.configure_session_budget()
  2. Associate a plan with that session via AutonomyGuardrailService.associate_plan_with_session()
  3. Execute a plan that would cost more than $1.00
  4. Observe: plan executes past the session budget limit without being blocked
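The reproduction steps above can be modeled with a self-contained sketch that does not depend on the real services. `FakeBudget`, `BudgetExceeded`, and `run_plan` are stand-ins invented for this illustration, not the actual `CostBudgetService` or plan executor API. The sketch assumes a per-step check placed before each step, which is one plausible place to wire in the missing call.

```python
class BudgetExceeded(Exception):
    """Raised when a budget check blocks execution (hypothetical name)."""


class FakeBudget:
    """Stand-in for a session budget; not the real CostBudgetService."""

    def __init__(self, max_cost_usd: float) -> None:
        self.max_cost_usd = max_cost_usd
        self.spent_usd = 0.0

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd

    def check(self) -> None:
        # Block at 100% of the limit, as in the cost_controls.md quote.
        if self.spent_usd >= self.max_cost_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.2f} of {self.max_cost_usd:.2f} USD"
            )


def run_plan(step_costs, budget, enforce_budget):
    """Run steps in order and return how many steps completed."""
    completed = 0
    for cost in step_costs:
        if enforce_budget:
            budget.check()  # the kind of call this report says is missing
        budget.record(cost)
        completed += 1
    return completed


steps = [0.50, 0.50, 0.50, 0.50]  # 2.00 USD total against a 1.00 USD cap

# Without the check, every step runs and spending ends at twice the cap.
unguarded = FakeBudget(max_cost_usd=1.0)
print(run_plan(steps, unguarded, enforce_budget=False))  # 4

# With the check, the third step is blocked once spending reaches the cap.
guarded = FakeBudget(max_cost_usd=1.0)
try:
    run_plan(steps, guarded, enforce_budget=True)
except BudgetExceeded as exc:
    print("blocked:", exc)
```

A regression test for the real executor could follow the same shape: run a plan whose steps exceed the configured `max_cost_usd` and assert that execution stops once the cap is reached.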

Impact

This is a critical gap in the v3.6.0 milestone. The session/org budget enforcement feature is implemented at the model and service layer but is not wired into the plan execution pipeline.


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.6.0 milestone 2026-04-09 07:44:10 +00:00
HAL9000 modified the milestone from v3.6.0 to v3.2.0 2026-04-09 07:45:13 +00:00
Author
Owner

Label compliance fix applied:

  • Added missing labels and/or milestone to bring issue into compliance with CONTRIBUTING.md

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: backlog-groomer

Reference
cleveragents/cleveragents-core#5582