BUG-HUNT: [concurrency] context_strategies.py can_handle methods mutate instance state — wrong strategy gets contaminated query under concurrent use #7495

Open
opened 2026-04-10 20:49:45 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: Concurrency — can_handle Methods Mutate Instance State During Strategy Selection

Severity Assessment

  • Impact: Under concurrent use, one strategy's assemble uses the query/focus set by a different context request; context assembly produces wrong results
  • Likelihood: High — any multithreaded deployment using context assembly
  • Priority: High

Location

  • File: src/cleveragents/application/services/context_strategies.py
  • Functions: SimpleKeywordStrategy.can_handle, SemanticEmbeddingStrategy.can_handle, BreadthDepthNavigatorStrategy.can_handle
  • Lines: ~113–116, ~194–197, ~269–275
  • Category: concurrency

Description

All three can_handle methods assign to self._query or self._focus as a side effect. The pipeline calls can_handle on each candidate strategy to pick the best one for a request, so these assignments happen during strategy selection, before any strategy has been chosen. This creates two problems:

  1. Wrong-strategy contamination: If the pipeline calls can_handle on strategy A, then on strategy B, and then selects A, A's self._query may by then have been overwritten by a later can_handle call from a different thread.
  2. Thread safety: Two threads using the same strategy instance will overwrite each other's _query/_focus state.

Evidence

# SimpleKeywordStrategy
def can_handle(self, request: dict[str, Any]) -> float:
    query = str(request.get("query", "") or "")
    self._query = query      # ← mutates SHARED instance state during selection
    return 0.3

# Same pattern in SemanticEmbeddingStrategy and BreadthDepthNavigatorStrategy

Thread race scenario:

  • Thread A calls can_handle(request_A) → self._query = "user query A"
  • Thread B calls can_handle(request_B) → self._query = "user query B" (OVERWRITES A)
  • Thread A calls assemble(...) → uses self._query = "user query B" (WRONG)
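
The interleaving above can be reproduced deterministically, without real threads, by making the three calls in the listed order on one shared instance. This is a minimal sketch: BuggyKeywordStrategy is a hypothetical stand-in that copies only the can_handle pattern quoted in the evidence, and its argument-free assemble is an assumption made for illustration, not the real signature.

```python
from typing import Any


class BuggyKeywordStrategy:
    """Hypothetical stand-in mirroring the can_handle pattern quoted above."""

    def __init__(self) -> None:
        self._query = ""

    def can_handle(self, request: dict[str, Any]) -> float:
        query = str(request.get("query", "") or "")
        self._query = query  # side effect on shared instance state
        return 0.3

    def assemble(self) -> str:
        # Assumed for illustration: assemble reads the query stored earlier.
        return self._query


shared = BuggyKeywordStrategy()  # one instance shared by both callers

shared.can_handle({"query": "user query A"})  # caller A: selection
shared.can_handle({"query": "user query B"})  # caller B: selection, overwrites A
seen_by_a = shared.assemble()                 # caller A: assembly

print(seen_by_a)  # prints "user query B"
```

Because the calls are sequenced by hand, the sketch fails the same way on every run, which makes it a reasonable starting point for an expected-fail test.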

Expected Behavior

can_handle should be a pure query function with no side effects on instance state.
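
One way to check this property in a regression test is to snapshot the instance attributes before the call and compare them afterwards. The helper below is a sketch under stated assumptions: mutated_attributes, PureStrategy and ImpureStrategy are hypothetical names, and the approach assumes the strategy stores its state in a regular instance __dict__ whose values support deepcopy and equality comparison.

```python
import copy
from typing import Any


def mutated_attributes(strategy: Any, request: dict[str, Any]) -> set[str]:
    """Return names of instance attributes that can_handle added, changed or removed."""
    before = copy.deepcopy(vars(strategy))
    strategy.can_handle(request)
    after = vars(strategy)
    changed = {
        name for name in after
        if name not in before or after[name] != before[name]
    }
    removed = set(before) - set(after)
    return changed | removed


class PureStrategy:
    """Hypothetical strategy whose can_handle only reads the request."""

    def __init__(self) -> None:
        self._query = ""

    def can_handle(self, request: dict[str, Any]) -> float:
        return 0.3 if request.get("query") else 0.1


class ImpureStrategy(PureStrategy):
    """Hypothetical strategy reproducing the reported side effect."""

    def can_handle(self, request: dict[str, Any]) -> float:
        self._query = str(request.get("query", "") or "")
        return 0.3


pure_changes = mutated_attributes(PureStrategy(), {"query": "x"})
impure_changes = mutated_attributes(ImpureStrategy(), {"query": "x"})
print(pure_changes, impure_changes)
```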

Actual Behavior

can_handle mutates shared instance state, causing incorrect query/focus values in subsequent assemble calls under concurrent use.

Suggested Fix

Do not mutate instance state in can_handle. Extract query/focus inside assemble from the request:

def can_handle(self, request: dict[str, Any]) -> float:
    # Pure: return confidence without side effects
    return 0.3 if request.get("query") else 0.1

def assemble(self, fragments, budget, *, request: dict[str, Any] | None = None) -> ...:
    query = str((request or {}).get("query", "") or "")
    ...
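
To illustrate why passing the request into assemble removes the race, the sketch below shares one instance across a thread pool and checks that every caller gets results for its own query. FixedKeywordStrategy, the substring filter, the list[str] return type and the handle helper are all assumptions for illustration; only the can_handle body and the query extraction line come from the suggestion above.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Any


class FixedKeywordStrategy:
    """Hypothetical stand-in following the suggested fix: no per-request state."""

    def can_handle(self, request: dict[str, Any]) -> float:
        # Pure: reads the request, never writes to self.
        return 0.3 if request.get("query") else 0.1

    def assemble(
        self,
        fragments: list[str],
        budget: int,
        *,
        request: dict[str, Any] | None = None,
    ) -> list[str]:
        query = str((request or {}).get("query", "") or "")
        # Illustrative filter only: keep fragments containing the query,
        # capped at `budget` items.
        matches = [f for f in fragments if query and query in f]
        return matches[:budget]


shared = FixedKeywordStrategy()  # single instance used by every thread
fragments = [f"note about topic-{i:02d}" for i in range(50)]


def handle(i: int) -> bool:
    """Run selection and assembly for one request; report whether the result is its own."""
    request = {"query": f"topic-{i:02d}"}
    shared.can_handle(request)
    result = shared.assemble(fragments, 5, request=request)
    return result == [f"note about topic-{i:02d}"]


with ThreadPoolExecutor(max_workers=8) as pool:
    all_correct = all(pool.map(handle, range(50)))

print(all_correct)
```

Since the query lives only in local variables and the request argument, no interleaving of threads can make one caller see another caller's query.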

Category

concurrency

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor

HAL9000 added this to the v3.5.0 milestone 2026-04-10 21:39:18 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: High — Concurrency/data integrity bug in autonomy hardening components that impacts M6 milestone functionality
  • Milestone: v3.5.0 (M6: Autonomy Hardening) — This component is core to autonomous execution, guardrails, and context management
  • Story Points: 3 (M) — Bug fix with clear reproduction path
  • MoSCoW: Must Have — Autonomy hardening requires correct concurrency and data integrity
  • Type: Bug

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7495