feat(guardrails): implement Per-Session and Per-Org Cost Budgets #675

Merged
freemo merged 1 commit from feature/m6plus-per-session-per-org-cost-budgets into master 2026-03-10 18:24:32 +00:00
Owner

Summary

Implements Forgejo issue #584: Per-Session and Per-Org Cost Budgets.

Adds a three-tier budget hierarchy (per-plan → per-session → per-org) with the tightest limit winning.

Changes

Domain Models (cost_budget.py)

  • BudgetLevel enum: PLAN, SESSION, ORG
  • BudgetCheckResult: frozen Pydantic model with exceeded_level and warning fields
  • SessionCostBudget: tracks per-session accumulated cost with utilization(), remaining(), is_exceeded(), would_exceed() methods
  • OrgCostAccumulator: tracks per-org accumulated cost, requires org_id (min_length=1)
  • ThreadSafeOrgCostAccumulator: thread-safe wrapper with snapshot() method
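
To make the domain-model bullets concrete, here is a minimal sketch of what the session budget and the thread-safe org accumulator might look like. It is not the code from `cost_budget.py`: it uses stdlib dataclasses instead of Pydantic so it runs without dependencies, and the field names (`max_cost_usd`, `accumulated_usd`), the `record()` and `add()` helpers, the strict `>` comparison, and the return type of `snapshot()` are all assumptions made for illustration.

```python
import threading
from dataclasses import dataclass


@dataclass
class SessionCostBudget:
    """Illustrative stand-in; field names are assumptions."""

    max_cost_usd: float
    accumulated_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        # Hypothetical helper: add a cost to the running total.
        self.accumulated_usd += cost_usd

    def utilization(self) -> float:
        # Fraction of the cap already spent; a non-positive cap counts as full.
        if self.max_cost_usd <= 0:
            return 1.0
        return self.accumulated_usd / self.max_cost_usd

    def remaining(self) -> float:
        # Never report negative headroom.
        return max(self.max_cost_usd - self.accumulated_usd, 0.0)

    def is_exceeded(self) -> bool:
        # Assumption: spending exactly the cap is still allowed.
        return self.accumulated_usd > self.max_cost_usd

    def would_exceed(self, additional_usd: float) -> bool:
        return self.accumulated_usd + additional_usd > self.max_cost_usd


class ThreadSafeOrgCostAccumulator:
    """Illustrative lock-guarded accumulator keyed by a non-empty org_id."""

    def __init__(self, org_id: str) -> None:
        if not org_id:
            raise ValueError("org_id must be a non-empty string")
        self.org_id = org_id
        self._lock = threading.Lock()
        self._total_usd = 0.0

    def add(self, cost_usd: float) -> None:
        # Hypothetical helper: the lock makes the read-modify-write atomic.
        with self._lock:
            self._total_usd += cost_usd

    def snapshot(self) -> float:
        # Assumption: snapshot() returns the total as seen under the lock.
        with self._lock:
            return self._total_usd
```

Calling `would_exceed()` before committing a cost lets a caller reject a step up front rather than discovering the overrun afterwards.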

Application Services (cost_budget_service.py)

  • CostBudgetService: manages budget state for sessions and orgs, enforces the three-tier hierarchy, emits BUDGET_WARNING (once per session) and BUDGET_EXCEEDED events via the event bus
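
The "tightest limit wins" rule and the once-per-session warning can be sketched as follows. This is an assumption-laden illustration rather than the actual `CostBudgetService`: the `check_tiers()` function, the `BudgetEventPublisher` class, the dict-based inputs, and the `emit(event_name, session_id)` callback signature are hypothetical; only the enum members, the result fields, and the two event names come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional, Set


class BudgetLevel(Enum):
    PLAN = "plan"
    SESSION = "session"
    ORG = "org"


@dataclass(frozen=True)
class BudgetCheckResult:
    exceeded_level: Optional[BudgetLevel]  # None when every tier has headroom
    warning: bool


def check_tiers(
    spent: Dict[BudgetLevel, float],
    limits: Dict[BudgetLevel, Optional[float]],
    proposed_usd: float,
    warning_threshold: float = 0.8,
) -> BudgetCheckResult:
    """Find the tier with the least headroom and test the proposed cost against it."""
    binding: Optional[BudgetLevel] = None
    headroom = float("inf")
    warning = False
    for level in BudgetLevel:  # definition order: PLAN, SESSION, ORG
        limit = limits.get(level)
        if limit is None:  # assumption: a missing limit means "no cap"
            continue
        used = spent.get(level, 0.0)
        if limit - used < headroom:
            binding, headroom = level, limit - used
        if limit <= 0 or (used + proposed_usd) / limit >= warning_threshold:
            warning = True
    exceeded = binding if proposed_usd > headroom else None
    return BudgetCheckResult(exceeded_level=exceeded, warning=warning)


class BudgetEventPublisher:
    """Emits BUDGET_WARNING at most once per session; BUDGET_EXCEEDED every time."""

    def __init__(self, emit: Callable[[str, str], None]) -> None:
        self._emit = emit
        self._warned: Set[str] = set()

    def publish(self, session_id: str, result: BudgetCheckResult) -> None:
        if result.exceeded_level is not None:
            self._emit("BUDGET_EXCEEDED", session_id)
        elif result.warning and session_id not in self._warned:
            self._warned.add(session_id)
            self._emit("BUDGET_WARNING", session_id)
```

Tracking warned session IDs in a set keeps the warning idempotent even if many plans in the same session cross the threshold.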

AutonomyGuardrailService Integration

  • associate_plan_with_session(): links a plan to a session for budget tracking
  • check_budget_hierarchy(): checks plan → session → org hierarchy
  • record_plan_cost_to_session(): records cost against session (and org if linked)
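
As a rough illustration of how plan costs might roll up to the session and org tiers, the sketch below mirrors two of the method names listed above on a hypothetical `BudgetLedger` class. The argument lists, the optional `org_id` parameter, and the in-memory dictionaries are assumptions; the real `AutonomyGuardrailService` signatures are not shown in this pull request description.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Optional


class BudgetLedger:
    """Hypothetical stand-in for the guardrail service's budget bookkeeping."""

    def __init__(self) -> None:
        self._plan_to_session: Dict[str, str] = {}
        self._session_to_org: Dict[str, str] = {}
        self.session_cost: DefaultDict[str, float] = defaultdict(float)
        self.org_cost: DefaultDict[str, float] = defaultdict(float)

    def associate_plan_with_session(
        self, plan_id: str, session_id: str, org_id: Optional[str] = None
    ) -> None:
        # Link the plan to its session; optionally link the session to an org.
        self._plan_to_session[plan_id] = session_id
        if org_id is not None:
            self._session_to_org[session_id] = org_id

    def record_plan_cost_to_session(self, plan_id: str, cost_usd: float) -> None:
        session_id = self._plan_to_session.get(plan_id)
        if session_id is None:
            return  # assumption: an unlinked plan has nothing to roll up to
        self.session_cost[session_id] += cost_usd
        org_id = self._session_to_org.get(session_id)
        if org_id is not None:
            # The same cost also counts against the org when one is linked.
            self.org_cost[org_id] += cost_usd
```

With two plans attached to the same session, both costs accumulate in one session total, which is what lets a session-level cap bound the combined spend of its plans.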

Configuration (settings.py)

  • session_max_cost_usd: default session budget cap
  • org_max_cost_usd: default org budget cap
  • budget_warning_threshold: utilisation ratio for warning events (default 0.8)
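
The three settings could be modelled along these lines. This is only a sketch: the `BudgetSettings` class name, the use of `None` to mean "no cap", and the validation rules are assumptions; the description only confirms the three field names and the 0.8 default for the warning threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BudgetSettings:
    """Hypothetical settings container; only the field names come from the PR."""

    session_max_cost_usd: Optional[float] = None  # assumption: None means no cap
    org_max_cost_usd: Optional[float] = None      # assumption: None means no cap
    budget_warning_threshold: float = 0.8

    def __post_init__(self) -> None:
        # The threshold is a utilisation ratio, so keep it within (0, 1].
        if not 0.0 < self.budget_warning_threshold <= 1.0:
            raise ValueError("budget_warning_threshold must be in (0, 1]")
        for name in ("session_max_cost_usd", "org_max_cost_usd"):
            value = getattr(self, name)
            if value is not None and value < 0:
                raise ValueError(f"{name} must be non-negative")
```

With the default threshold, a session capped at 10.00 USD would trigger its warning once accumulated cost reaches 8.00 USD.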

Integration

  • Session model: cost_budget field, as_cli_dict() includes budget data
  • DI container: CostBudgetService registered as Singleton
  • CLI session show: cost budget panel display

Tests

  • 54 Behave scenarios (features/cost_budgets.feature)
  • 11 Robot Framework integration tests (robot/cost_budgets.robot)
  • ASV benchmarks (benchmarks/bench_budget_check.py)

Verification

  • nox -s lint ✓
  • nox -s typecheck ✓
  • nox -s unit_tests ✓ (9809 scenarios pass)
  • nox -s coverage_report ✓ (98% >= 97% threshold)
  • nox -s integration_tests - all new cost budget tests pass; pre-existing failures unchanged

Closes #584

freemo force-pushed feature/m6plus-per-session-per-org-cost-budgets from 882dde7930
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / build (pull_request) Successful in 17s
CI / quality (pull_request) Successful in 18s
CI / security (pull_request) Successful in 34s
CI / typecheck (pull_request) Successful in 38s
CI / unit_tests (pull_request) Failing after 2m40s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Successful in 3m9s
CI / coverage (pull_request) Successful in 5m5s
CI / benchmark-regression (pull_request) Successful in 31m16s
to 4e435cbc80
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 19s
CI / quality (pull_request) Successful in 19s
CI / security (pull_request) Successful in 35s
CI / typecheck (pull_request) Successful in 40s
CI / unit_tests (pull_request) Failing after 2m24s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Successful in 3m25s
CI / coverage (pull_request) Successful in 5m50s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-10 17:12:55 +00:00
freemo force-pushed feature/m6plus-per-session-per-org-cost-budgets from 4e435cbc80
to 5f938f50ea
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 17s
CI / quality (pull_request) Successful in 18s
CI / security (pull_request) Successful in 35s
CI / typecheck (pull_request) Successful in 37s
CI / unit_tests (pull_request) Successful in 2m31s
CI / integration_tests (pull_request) Successful in 3m11s
CI / docker (pull_request) Successful in 40s
CI / coverage (pull_request) Successful in 5m3s
CI / benchmark-regression (pull_request) Successful in 33m36s
2026-03-10 17:32:37 +00:00
freemo force-pushed feature/m6plus-per-session-per-org-cost-budgets from 5f938f50ea
to 876217d0ca
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 24s
CI / typecheck (pull_request) Successful in 37s
CI / security (pull_request) Successful in 44s
CI / unit_tests (pull_request) Successful in 2m40s
CI / integration_tests (pull_request) Successful in 3m19s
CI / docker (pull_request) Successful in 42s
CI / coverage (pull_request) Successful in 5m10s
CI / lint (push) Successful in 12s
CI / build (push) Successful in 16s
CI / quality (push) Successful in 21s
CI / security (push) Successful in 32s
CI / typecheck (push) Successful in 37s
CI / benchmark-regression (push) Has been skipped
CI / unit_tests (push) Failing after 2m36s
CI / docker (push) Has been skipped
CI / integration_tests (push) Successful in 3m16s
CI / coverage (push) Successful in 5m3s
CI / benchmark-publish (push) Successful in 17m54s
CI / benchmark-regression (pull_request) Successful in 31m55s
2026-03-10 18:18:40 +00:00
freemo added this to the v3.6.0 milestone 2026-03-10 18:18:54 +00:00
freemo self-assigned this 2026-03-10 18:19:20 +00:00
freemo scheduled this pull request to auto merge when all checks succeed 2026-03-10 18:19:37 +00:00
freemo merged commit 876217d0ca into master 2026-03-10 18:24:32 +00:00
freemo deleted branch feature/m6plus-per-session-per-org-cost-budgets 2026-03-10 18:24:32 +00:00
Reference
cleveragents/cleveragents-core!675