Bug: Potential race condition in CostTracker.record_usage between cost update and budget check #10568

Open
opened 2026-04-18 17:21:01 +00:00 by HAL9000 · 0 comments
Owner

Metadata

  • Commit: Latest commit in main branch
  • Branch: main

Background and Context

In src/cleveragents/providers/cost_tracker.py, the record_usage() method updates the daily costs inside a lock and then checks the budget after the lock has been released. If another thread updates _daily_costs between the cost update and the budget check, the check runs against a total that is no longer consistent with this call's own update.

The problematic code flow is:

  1. Cost is calculated
  2. _daily_costs is updated (inside lock)
  3. Lock is released
  4. Budget check is performed (outside lock) - RACE CONDITION WINDOW
  5. Another thread could update _daily_costs during step 4

The impact is minor, because a later budget check will still catch the overage. When the budget is very tight, however, several concurrent requests could be approved where only one should be.
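The flow described above can be sketched as follows. This is a minimal illustration, not the code in cost_tracker.py: the class name RacyCostTracker, the per-day dictionary keys, the boolean return value, and the `<=` comparison are all assumptions made for the example. Only record_usage, _daily_costs, and budget_per_day come from the issue.

```python
import threading
from datetime import date


class RacyCostTracker:
    """Hypothetical stand-in for CostTracker; not the project's actual code."""

    def __init__(self, budget_per_day: float) -> None:
        self.budget_per_day = budget_per_day
        self._daily_costs: dict[str, float] = {}
        self._lock = threading.Lock()

    def record_usage(self, cost: float) -> bool:
        today = date.today().isoformat()
        # Steps 2 and 3: the update runs under the lock, then the lock is released.
        with self._lock:
            self._daily_costs[today] = self._daily_costs.get(today, 0.0) + cost
        # Step 4: the check re-reads shared state without the lock, so the
        # total may already include costs recorded by other threads.
        return self._daily_costs.get(today, 0.0) <= self.budget_per_day  # assumed comparison
```

In a single thread this behaves as expected. The problem only appears when another thread's update lands between the `with` block and the final read.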

Expected Behavior

The budget check should be atomic with the cost update: no other thread should be able to modify the costs between the update and the check, so the check always runs on consistent data.
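One possible shape for this, assuming the simplest option of extending the lock scope, is sketched below. The class name AtomicCostTracker, the `(total, under_budget)` return value, and the `<=` comparison are hypothetical; the real method may return something different.

```python
import threading
from datetime import date


class AtomicCostTracker:
    """Hypothetical stand-in for CostTracker; not the project's actual code."""

    def __init__(self, budget_per_day: float) -> None:
        self.budget_per_day = budget_per_day
        self._daily_costs: dict[str, float] = {}
        self._lock = threading.Lock()

    def record_usage(self, cost: float) -> tuple[float, bool]:
        today = date.today().isoformat()
        with self._lock:
            # Update and check share one critical section, so the total used
            # for the check is exactly the one this call just wrote.
            total = self._daily_costs.get(today, 0.0) + cost
            self._daily_costs[today] = total
            under_budget = total <= self.budget_per_day  # assumed comparison
        return total, under_budget
```

The extra work under the lock is one comparison, so contention should change very little. One caveat: threading.Lock is not reentrant, so if the real budget check calls another method that acquires the same lock, the extended scope would deadlock unless that path is restructured or a threading.RLock is used.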

Actual Behavior

The cost update runs inside the lock, but the budget check runs after the lock has been released. Another thread can update _daily_costs between these two operations, so the budget check may be based on data that does not match the state this call produced.

Steps to Reproduce

  1. Create a CostTracker with budget_per_day=10.0
  2. Create two threads that both call record_usage() with a cost of 5.0 simultaneously
  3. Both threads may see the status as under_budget, because the check runs after the lock is released
  4. Expected: Only one request should be approved; Actual: Both might be approved
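Because the window is narrow, running two plain threads will rarely hit it. A way to make it observable without relying on timing is to force the interleaving with a threading.Barrier, as in the sketch below. PausableTracker and observed_totals are hypothetical names, and the single running total stands in for today's entry in _daily_costs. The issue does not say how the budget comparison treats a total equal to the limit, so the sketch reports the total each unlocked check would see rather than an approval flag.

```python
import threading


class PausableTracker:
    """Hypothetical stand-in that mimics the described flow; not the real class."""

    def __init__(self, budget_per_day: float, window: threading.Barrier) -> None:
        self.budget_per_day = budget_per_day
        self._daily_total = 0.0  # stands in for today's entry in _daily_costs
        self._lock = threading.Lock()
        self._window = window

    def record_usage(self, cost: float) -> float:
        with self._lock:
            self._daily_total += cost
        # Force the interleaving: every thread finishes its update before
        # any thread performs its check.
        self._window.wait()
        return self._daily_total  # the value an unlocked budget check would read


def observed_totals(n_threads: int = 2, cost: float = 5.0) -> list[float]:
    """Run record_usage from n_threads threads and collect what each check saw."""
    tracker = PausableTracker(10.0, threading.Barrier(n_threads))
    seen: list[float] = []
    seen_lock = threading.Lock()

    def worker() -> None:
        total = tracker.record_usage(cost)
        with seen_lock:
            seen.append(total)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

With the defaults, both threads observe a total of 10.0, whereas an atomic update and check would let one call observe 5.0 and the other 10.0.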

Acceptance Criteria

  • The budget check is performed atomically with the cost update
  • No race condition exists between cost update and budget check
  • Unit tests verify that concurrent calls to record_usage() correctly enforce budget limits
  • The fix does not introduce deadlocks or performance regressions
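A concurrency test for these criteria might look like the sketch below. It uses a hypothetical _LockedTracker so that it runs on its own; in the repository the real CostTracker would take its place, and the expected number of approvals depends on the actual comparison (assumed here to be `<=`). The join timeout is a cheap guard against a fix that introduces a deadlock.

```python
import threading
import unittest


class _LockedTracker:
    """Hypothetical stand-in; replace with the real CostTracker in the repo."""

    def __init__(self, budget_per_day):
        self.budget_per_day = budget_per_day
        self._total = 0.0
        self._lock = threading.Lock()

    def record_usage(self, cost):
        with self._lock:
            self._total += cost
            return self._total <= self.budget_per_day  # assumed comparison


class ConcurrentBudgetTest(unittest.TestCase):
    def test_only_calls_within_budget_are_approved(self):
        tracker = _LockedTracker(budget_per_day=10.0)
        n_threads, cost = 16, 5.0
        start = threading.Barrier(n_threads)  # release all workers together
        approvals = []
        approvals_lock = threading.Lock()

        def worker():
            start.wait()
            ok = tracker.record_usage(cost)
            with approvals_lock:
                approvals.append(ok)

        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join(timeout=5)
            self.assertFalse(t.is_alive(), "worker hung; possible deadlock")

        self.assertEqual(len(approvals), n_threads)
        # Totals of 5.0 and 10.0 fit within the budget; every later call does not.
        self.assertEqual(sum(approvals), 2)
```

Using multiples of 5.0 keeps the float arithmetic exact, so the approval count is deterministic regardless of thread scheduling.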

Subtasks

  • Analyze the current locking strategy in CostTracker.record_usage()
  • Determine the best approach: extend the lock scope or use a more sophisticated locking strategy
  • Implement the fix to ensure atomic cost update and budget check
  • Add unit tests for concurrent budget enforcement scenarios
  • Update documentation if the locking behavior changes

Definition of Done

  • The race condition is fixed and verified through unit tests
  • All existing tests pass
  • New tests demonstrate that the race condition no longer exists
  • Code review approved
  • Changes are merged to main branch

Automated by CleverAgents Bot
Agent: new-issue-creator

Reference
cleveragents/cleveragents-core#10568