UAT: ContextTierService documented as single-threaded only but used in multi-threaded context via DI container — missing thread-safety warning in container wiring #2096

Open
opened 2026-04-03 04:02:18 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/acms-tier-service-thread-safety-warning
  • Commit Message: fix(acms): add thread-safety guard to ContextTierService DI wiring
  • Milestone: v3.7.0
  • Parent Epic: #396

Summary

ContextTierService in src/cleveragents/application/services/context_tiers.py has an explicit docstring warning:

Note

: This service is designed for single-threaded use. The in-memory tier stores are plain dict instances without synchronisation. Concurrent callers must coordinate externally.

However, the ContextTierService is registered in the DI container and accessed via container.context_tier_service(). In a server-mode deployment (A2A protocol), multiple concurrent requests can call container.context_tier_service() and use the same service instance simultaneously, violating the single-threaded contract.

The ParallelStrategyExecutor in acms_pipeline.py uses ThreadPoolExecutor to run strategies concurrently. If any strategy accesses the ContextTierService (e.g., TieredStrategy which reads tier data), concurrent strategy execution would cause race conditions on the tier stores.

Expected Behavior

Either:

  1. ContextTierService should be made thread-safe using appropriate locking (e.g., threading.RLock), OR
  2. The DI container should register ContextTierService as a per-request (non-singleton) service to prevent shared state, OR
  3. The ParallelStrategyExecutor should be documented as incompatible with strategies that access ContextTierService

Actual Behavior

  • ContextTierService is a singleton in the DI container (shared across all callers)
  • ParallelStrategyExecutor runs strategies in parallel threads
  • No locking protects the _hot, _warm, _cold dict stores
  • Race conditions can occur when concurrent strategies read/write tier data

Code Locations

  • Service: src/cleveragents/application/services/context_tiers.py_hot, _warm, _cold are plain dict instances
  • Executor: src/cleveragents/application/services/acms_pipeline.pyParallelStrategyExecutor uses ThreadPoolExecutor
  • DI container: src/cleveragents/application/container.pycontext_tier_service() registration

Impact

  • Medium severity in single-process mode (most current usage)
  • High severity in server mode (A2A protocol) where concurrent requests share the service
  • The _touch() method mutates last_accessed and access_count on TieredFragment objects — these mutations are not atomic

Subtasks

  • Add threading.RLock to ContextTierService to protect all tier store mutations
  • Wrap all read-modify-write operations (store, get, promote, demote, evict_lru) with the lock
  • Add a BDD scenario testing concurrent access to ContextTierService
  • Update the docstring to reflect the new thread-safety guarantee
  • OR: Register ContextTierService as per-request in the DI container (document the trade-off)

Definition of Done

  • ContextTierService is safe for concurrent use OR is documented as per-request with appropriate DI wiring
  • BDD test for concurrent access added and passing
  • No regressions

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/acms-tier-service-thread-safety-warning` - **Commit Message**: `fix(acms): add thread-safety guard to ContextTierService DI wiring` - **Milestone**: v3.7.0 - **Parent Epic**: #396 ## Summary `ContextTierService` in `src/cleveragents/application/services/context_tiers.py` has an explicit docstring warning: > **Note**: This service is designed for **single-threaded** use. The in-memory tier stores are plain `dict` instances without synchronisation. Concurrent callers must coordinate externally. However, the `ContextTierService` is registered in the DI container and accessed via `container.context_tier_service()`. In a server-mode deployment (A2A protocol), multiple concurrent requests can call `container.context_tier_service()` and use the same service instance simultaneously, violating the single-threaded contract. The `ParallelStrategyExecutor` in `acms_pipeline.py` uses `ThreadPoolExecutor` to run strategies concurrently. If any strategy accesses the `ContextTierService` (e.g., `TieredStrategy` which reads tier data), concurrent strategy execution would cause race conditions on the tier stores. ## Expected Behavior Either: 1. `ContextTierService` should be made thread-safe using appropriate locking (e.g., `threading.RLock`), OR 2. The DI container should register `ContextTierService` as a per-request (non-singleton) service to prevent shared state, OR 3. The `ParallelStrategyExecutor` should be documented as incompatible with strategies that access `ContextTierService` ## Actual Behavior - `ContextTierService` is a singleton in the DI container (shared across all callers) - `ParallelStrategyExecutor` runs strategies in parallel threads - No locking protects the `_hot`, `_warm`, `_cold` dict stores - Race conditions can occur when concurrent strategies read/write tier data ## Code Locations - **Service**: `src/cleveragents/application/services/context_tiers.py` — `_hot`, `_warm`, `_cold` are plain `dict` instances - **Executor**: `src/cleveragents/application/services/acms_pipeline.py` — `ParallelStrategyExecutor` uses `ThreadPoolExecutor` - **DI container**: `src/cleveragents/application/container.py` — `context_tier_service()` registration ## Impact - Medium severity in single-process mode (most current usage) - High severity in server mode (A2A protocol) where concurrent requests share the service - The `_touch()` method mutates `last_accessed` and `access_count` on `TieredFragment` objects — these mutations are not atomic ## Subtasks - [ ] Add `threading.RLock` to `ContextTierService` to protect all tier store mutations - [ ] Wrap all read-modify-write operations (`store`, `get`, `promote`, `demote`, `evict_lru`) with the lock - [ ] Add a BDD scenario testing concurrent access to `ContextTierService` - [ ] Update the docstring to reflect the new thread-safety guarantee - [ ] OR: Register `ContextTierService` as per-request in the DI container (document the trade-off) ## Definition of Done - [ ] `ContextTierService` is safe for concurrent use OR is documented as per-request with appropriate DI wiring - [ ] BDD test for concurrent access added and passing - [ ] No regressions --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-new-issue-creator
freemo added this to the v3.7.0 milestone 2026-04-03 04:02:48 +00:00
freemo self-assigned this 2026-04-03 16:58:07 +00:00
Author
Owner

MoSCoW classification: Should Have

Rationale: This issue addresses a spec requirement or important quality improvement. It should be included in the milestone if possible.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

MoSCoW classification: **Should Have** Rationale: This issue addresses a spec requirement or important quality improvement. It should be included in the milestone if possible. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#396 Epic: ACMS Context Pipeline
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2096
No description provided.