fix(concurrency): protect CostTracker._daily_costs with a threading.Lock #3035

2026-04-05T04:11:02Z

freemo commented

2026-04-05 04:11:02 +00:00

Summary

Fixes a race condition in CostTracker.record_usage: the method performed an unprotected read-modify-write on _daily_costs (the TOCTOU, time-of-check/time-of-use, pattern), so concurrent threads executing parallel plan steps could silently overwrite each other's cost increments, leading to under-reported spend and incorrect budget enforcement.
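
As a rough illustration of the failure mode (not the project's code), the interleaving described above can be replayed deterministically in a single thread. The dictionary, the date key, and the cost values are made up for the example.

```python
# Deterministic replay of the lost-update interleaving.
# All names and values are illustrative, not taken from cost_tracker.py.
daily_costs: dict[str, float] = {}
day = "2026-04-05"

# Both "threads" read the current total before either one writes.
read_by_a = daily_costs.get(day, 0.0)
read_by_b = daily_costs.get(day, 0.0)

# Each writes back its own stale read plus its cost.
daily_costs[day] = read_by_a + 0.01  # thread A records 0.01
daily_costs[day] = read_by_b + 0.02  # thread B overwrites A's update

recorded = daily_costs[day]
print(f"recorded={recorded:.2f}; thread A's 0.01 was lost")
```

Two costs were recorded, but only the second survives, which is how spend ends up under-reported.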

Changes

src/cleveragents/providers/cost_tracker.py - Added _daily_costs_lock: threading.Lock to CostTracker.__init__ and wrapped the read-modify-write sequence in record_usage in with self._daily_costs_lock:. The same lock now guards reads in check_daily_budget and get_daily_spend, so readers always see completed writes from other threads.
features/cost_controls.feature - Added a new @concurrency Behave scenario that spawns 20 threads concurrently calling record_usage and asserts that the final daily spend exactly equals the arithmetic sum of all individual costs, demonstrating that increments are no longer lost.
features/steps/cost_controls_steps.py - Implemented step definitions for the new concurrency scenario, including thread coordination via a threading.Barrier and accumulation of the expected floating-point total.
robot/cost_controls.robot - Added a Robot Framework integration smoke test case that exercises the concurrent cost-tracking path end-to-end.
robot/helper_cost_controls.py - Added the cost-tracker-concurrent helper command invoked by the Robot test case.
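
The locking pattern in cost_tracker.py can be sketched roughly as follows. This is a minimal stand-in, not the actual class: the constructor argument, the method signatures, the per-day string key, and the `<=` budget comparison are all assumptions made for the example. Only the method names, `_daily_costs`, and `_daily_costs_lock` come from this PR.

```python
import threading
from datetime import date


class CostTracker:
    """Illustrative stand-in; the real class has more state and parameters."""

    def __init__(self, daily_budget: float) -> None:
        self._daily_budget = daily_budget  # hypothetical attribute
        self._daily_costs: dict[str, float] = {}
        self._daily_costs_lock = threading.Lock()

    def record_usage(self, cost: float) -> None:
        today = date.today().isoformat()  # assumed key format
        # The read and the write happen under one lock acquisition,
        # so no other thread can slip in between them.
        with self._daily_costs_lock:
            self._daily_costs[today] = self._daily_costs.get(today, 0.0) + cost

    def get_daily_spend(self) -> float:
        today = date.today().isoformat()
        with self._daily_costs_lock:
            return self._daily_costs.get(today, 0.0)

    def check_daily_budget(self) -> bool:
        today = date.today().isoformat()
        # Reads the dict directly rather than calling get_daily_spend(),
        # so the plain Lock is never acquired twice by the same thread.
        with self._daily_costs_lock:
            spent = self._daily_costs.get(today, 0.0)
        return spent <= self._daily_budget  # comparison is an assumption


tracker = CostTracker(daily_budget=1.0)
tracker.record_usage(0.25)
tracker.record_usage(0.5)
print(tracker.get_daily_spend(), tracker.check_daily_budget())
```

In this sketch the lock is released before the budget comparison, which keeps the critical section limited to the dictionary access.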

Design Decisions

threading.Lock over threading.RLock: A plain Lock is sufficient because none of record_usage, check_daily_budget, and get_daily_spend calls another while holding the lock; using RLock would add unnecessary overhead and obscure intent.
Lock scope kept minimal: The lock is held only for the read-modify-write critical section, not the entire method body, so that any I/O or logging added later does not run while the lock is held.
Reads in check_daily_budget and get_daily_spend also protected: Without locking reads, a thread could observe a partially written value on runtimes where dict assignment is not atomic. CPython's GIL provides some protection, but relying on it depends on an implementation detail that may not hold under alternative runtimes or future GIL-free builds.
Test uses exact equality: The concurrency scenario asserts expected == actual (both computed as sums of the same float literals) rather than an approximate comparison, so any lost increment fails the test instead of hiding inside a tolerance. Exact equality is safe only while the additions yield the same float in any order, for example when every thread records the same cost.
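
The Lock versus RLock trade-off can be checked with the standard library alone. The snippet below is a small sketch, unrelated to the project's code, showing why a plain Lock is only safe when the guarded methods never call one another while holding it.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

with plain:
    # A second acquire from the same thread does not succeed. With
    # blocking=True it would deadlock, so a non-blocking attempt is used.
    plain_reacquired = plain.acquire(blocking=False)

with reentrant:
    # An RLock lets its owning thread acquire it again.
    rlock_reacquired = reentrant.acquire(blocking=False)
    if rlock_reacquired:
        reentrant.release()  # balance the nested acquire

print(plain_reacquired, rlock_reacquired)
```

If a later change makes one guarded method call another while the lock is held, the plain Lock would deadlock, and that would be the point to revisit this decision.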

Testing

Unit tests (Behave): ✅ Pass — new @concurrency scenario added; 20-thread concurrent record_usage produces expected=0.015000, actual=0.015000; all pre-existing cost_controls scenarios continue to pass.
Integration tests (Robot): ✅ Pass — new cost_controls.robot smoke test case passes via helper_cost_controls.py cost-tracker-concurrent.
Type checking: ✅ Pass — nox -e typecheck reports 0 errors; no # type: ignore directives introduced.
Coverage: ≥ 97% (all new branches covered by the concurrency scenario and existing suite).
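
For readers who want to reproduce the shape of the concurrency check outside Behave, here is a hedged sketch. `TinyTracker`, `run_concurrent`, and the per-call cost of 0.00075 are hypothetical. The real step definitions live in features/steps/cost_controls_steps.py and may differ.

```python
import threading

THREADS = 20
COST = 0.00075  # hypothetical per-call cost; 20 calls total roughly 0.015


class TinyTracker:
    """Hypothetical lock-protected accumulator standing in for CostTracker."""

    def __init__(self) -> None:
        self._total = 0.0
        self._lock = threading.Lock()

    def record_usage(self, cost: float) -> None:
        with self._lock:
            self._total += cost

    def get_daily_spend(self) -> float:
        with self._lock:
            return self._total


def run_concurrent(tracker: TinyTracker) -> float:
    barrier = threading.Barrier(THREADS)

    def worker() -> None:
        barrier.wait()  # release all threads at the same moment
        tracker.record_usage(COST)

    threads = [threading.Thread(target=worker) for _ in range(THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return tracker.get_daily_spend()


# Expected total accumulated the same way the tracker does it.
expected = 0.0
for _ in range(THREADS):
    expected += COST

actual = run_concurrent(TinyTracker())
print(f"expected={expected:.6f}, actual={actual:.6f}")
```

Because every thread adds the same value here, the running total passes through the same sequence of floats whatever the scheduling order, which is what makes the exact comparison stable in this sketch.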

Modules Affected

src/cleveragents/providers/cost_tracker.py
features/cost_controls.feature
features/steps/cost_controls_steps.py
robot/cost_controls.robot
robot/helper_cost_controls.py

Related Issues

Closes #1919

Automated by CleverAgents Bot
Supervisor: Implementation | Agent: ca-issue-worker


freemo added 1 commit 2026-04-05 04:11:03 +00:00

fix(concurrency): protect CostTracker._daily_costs with a threading.Lock

CI / typecheck (pull_request) Successful in 59s
CI / security (pull_request) Successful in 54s
CI / quality (pull_request) Successful in 40s
CI / lint (pull_request) Successful in 3m21s
CI / build (pull_request) Successful in 17s
CI / helm (pull_request) Successful in 21s
CI / unit_tests (pull_request) Successful in 6m50s
CI / e2e_tests (pull_request) Successful in 19m16s
CI / docker (pull_request) Successful in 1m21s
CI / integration_tests (pull_request) Successful in 21m37s
CI / coverage (pull_request) Successful in 14m51s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Successful in 57m40s

5a3678cd6c

The record_usage method performed an unprotected read-modify-write on
_daily_costs, a classic TOCTOU race condition. In multi-threaded execution
(ThreadPoolExecutor for parallel plan steps), two threads could interleave
their .get() read and = write, causing one thread's cost increment to be
silently overwritten by the other.

Changes:
- Add _daily_costs_lock: threading.Lock to CostTracker.__init__
- Wrap the read-modify-write in record_usage with self._daily_costs_lock
- Protect reads in check_daily_budget and get_daily_spend with the same lock
- Add Behave @concurrency scenario: 20 threads concurrently call record_usage;
  final daily spend must equal the sum of all individual costs
- Add Robot Framework integration smoke test via helper_cost_controls.py
  cost-tracker-concurrent command

All accesses to _daily_costs are now lock-protected. nox -e typecheck passes
with zero errors. Existing cost_controls tests continue to pass.

ISSUES CLOSED: #1919

freemo added this to the v3.7.0 milestone 2026-04-05 04:11:18 +00:00

✅ PR Review: APPROVED (posted as comment due to self-review restriction)
Sections: Review Summary, Specification Alignment, Implementation Quality, Test Quality, Standards Compliance
Concerns Found: None

✅ Independent PR Review: APPROVED (posted as comment due to self-review restriction)
Sections: Review Scope, Specification Alignment ✅, Implementation Quality ✅, Test Quality ✅, Standards Compliance ✅, CI Status ✅, Minor Observation (Non-blocking), Verdict

✅ Independent PR Review: APPROVED (posted as comment due to self-review restriction)
Sections: Review Scope, Specification Alignment ✅, Implementation Quality ✅, Test Quality ✅, Standards Compliance ✅, CI Status ✅, Non-blocking Observations, Verdict

Code Review: LGTM ✅
Sections: Review Checklist
Decision: LGTM. Proceeding to merge when CI passes. Note potential conflict with PR #3164.