BUG-HUNT: [concurrency] PluginManager.get_plugin() returns a live mutable PluginDescriptor reference — callers can mutate state outside the lock, racing with activate_plugin() / deactivate_plugin() #6717

Open
opened 2026-04-09 23:50:06 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: Concurrency — Mutable Descriptor Reference Escapes Lock

Severity Assessment

  • Impact: External callers holding a PluginDescriptor reference returned by get_plugin() can read stale state or directly mutate descriptor.state without holding the internal RLock. This bypasses the thread-safety guarantee that all state transitions are guarded. A caller that caches the descriptor could observe an inconsistent plugin state (e.g., seeing DISCOVERED after another thread has called activate_plugin() which already set it to ACTIVATED).
  • Likelihood: Medium — any code that caches the descriptor from get_plugin() and checks descriptor.state later is subject to this race.
  • Priority: Medium (Backlog)

Location

  • File: src/cleveragents/infrastructure/plugins/manager.py
  • Methods: PluginManager.get_plugin, PluginManager.list_plugins
  • Lines: 128–145 (get_plugin), 150–153 (list_plugins)
  • Related file: src/cleveragents/infrastructure/plugins/types.py, line 145 (model_config = ConfigDict(frozen=False))

Description

PluginDescriptor is documented as an immutable descriptor ("Immutable descriptor for a discovered plugin"), but its Pydantic model configuration is frozen=False, making it fully mutable at runtime:

# src/cleveragents/infrastructure/plugins/types.py  line 145
model_config = ConfigDict(frozen=False)   # ← contradicts "immutable" docstring

PluginManager.get_plugin() returns the live object stored in self._plugins, not a copy:

# src/cleveragents/infrastructure/plugins/manager.py  lines 128–145
def get_plugin(self, name: str) -> PluginDescriptor:
    with self._lock:
        if name not in self._plugins:
            raise PluginNotFoundError(...)
        return self._plugins[name]     # ← returns the LIVE mutable object

The same applies to list_plugins():

# manager.py  lines 150–153
def list_plugins(self) -> list[PluginDescriptor]:
    with self._lock:
        return list(self._plugins.values())   # ← shallow copy of dict values — still the SAME mutable objects

Evidence of Race Condition

Since PluginDescriptor.state is mutable and the returned object is the same object stored in self._plugins, the following races are possible:

Race 1 — External mutation bypasses lock:

desc = manager.get_plugin("my-plugin")
# Another thread: manager.activate_plugin("my-plugin")  # sets desc.state = ACTIVATED
# This thread:   assert desc.state == PluginState.DISCOVERED  # WRONG — state was changed!

# Or, external code can directly corrupt state:
desc.state = PluginState.ACTIVATED   # no lock held — bypasses RLock guard

Race 2 — Stale read after activate_plugin:

desc = manager.get_plugin("my-plugin")
# Some time passes ...
manager.activate_plugin("my-plugin")   # internal: sets self._plugins["my-plugin"].state = ACTIVATED
print(desc.state)                       # correctly shows ACTIVATED because it's the same object...
# ... but only by accident, not by design. If model_config were frozen=True and copies returned,
# this would be the correct pattern and the stale read would be safe.

Expected Behavior

PluginDescriptor should either:

  1. Be made truly immutable (model_config = ConfigDict(frozen=True)) and get_plugin()/list_plugins() should return copies of the stored descriptors (via descriptor.model_copy()), OR
  2. The docstring should be corrected to say "mutable" and callers should be warned not to cache the returned descriptor.

Option 1 is strongly preferred — it matches the documented intent, prevents external mutation, and makes the thread-safety model correct.

Actual Behavior

get_plugin() and list_plugins() return references to the live internal PluginDescriptor objects that activate_plugin() and deactivate_plugin() mutate directly. External code can mutate the state without holding any lock, silently corrupting the plugin manager's internal state tracking.

Suggested Fix

In types.py:

# Change:
model_config = ConfigDict(frozen=False)
# To:
model_config = ConfigDict(frozen=True)

In manager.py, use descriptor.state via direct updates before the freeze (during registration/activation within the lock), and return copies to callers:

def get_plugin(self, name: str) -> PluginDescriptor:
    with self._lock:
        if name not in self._plugins:
            raise PluginNotFoundError(...)
        return self._plugins[name].model_copy()   # return a snapshot

def list_plugins(self) -> list[PluginDescriptor]:
    with self._lock:
        return [d.model_copy() for d in self._plugins.values()]  # snapshots

Internal state updates would need to replace the descriptor in self._plugins rather than mutating in place (or keep frozen=False internally with a private field convention).

Category

concurrency

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

## Bug Report: Concurrency — Mutable Descriptor Reference Escapes Lock ### Severity Assessment - **Impact**: External callers holding a `PluginDescriptor` reference returned by `get_plugin()` can read stale state or directly mutate `descriptor.state` without holding the internal `RLock`. This bypasses the thread-safety guarantee that all state transitions are guarded. A caller that caches the descriptor could observe an inconsistent plugin state (e.g., seeing `DISCOVERED` after another thread has called `activate_plugin()` which already set it to `ACTIVATED`). - **Likelihood**: Medium — any code that caches the descriptor from `get_plugin()` and checks `descriptor.state` later is subject to this race. - **Priority**: Medium (Backlog) ### Location - **File**: `src/cleveragents/infrastructure/plugins/manager.py` - **Methods**: `PluginManager.get_plugin`, `PluginManager.list_plugins` - **Lines**: 128–145 (`get_plugin`), 150–153 (`list_plugins`) - **Related file**: `src/cleveragents/infrastructure/plugins/types.py`, line 145 (`model_config = ConfigDict(frozen=False)`) ### Description `PluginDescriptor` is documented as an **immutable** descriptor ("Immutable descriptor for a discovered plugin"), but its Pydantic model configuration is `frozen=False`, making it fully mutable at runtime: ```python # src/cleveragents/infrastructure/plugins/types.py line 145 model_config = ConfigDict(frozen=False) # ← contradicts "immutable" docstring ``` `PluginManager.get_plugin()` returns the **live** object stored in `self._plugins`, not a copy: ```python # src/cleveragents/infrastructure/plugins/manager.py lines 128–145 def get_plugin(self, name: str) -> PluginDescriptor: with self._lock: if name not in self._plugins: raise PluginNotFoundError(...) return self._plugins[name] # ← returns the LIVE mutable object ``` The same applies to `list_plugins()`: ```python # manager.py lines 150–153 def list_plugins(self) -> list[PluginDescriptor]: with self._lock: return list(self._plugins.values()) # ← shallow copy of dict values — still the SAME mutable objects ``` ### Evidence of Race Condition Since `PluginDescriptor.state` is mutable and the returned object is the same object stored in `self._plugins`, the following races are possible: **Race 1 — External mutation bypasses lock:** ```python desc = manager.get_plugin("my-plugin") # Another thread: manager.activate_plugin("my-plugin") # sets desc.state = ACTIVATED # This thread: assert desc.state == PluginState.DISCOVERED # WRONG — state was changed! # Or, external code can directly corrupt state: desc.state = PluginState.ACTIVATED # no lock held — bypasses RLock guard ``` **Race 2 — Stale read after `activate_plugin`:** ```python desc = manager.get_plugin("my-plugin") # Some time passes ... manager.activate_plugin("my-plugin") # internal: sets self._plugins["my-plugin"].state = ACTIVATED print(desc.state) # correctly shows ACTIVATED because it's the same object... # ... but only by accident, not by design. If model_config were frozen=True and copies returned, # this would be the correct pattern and the stale read would be safe. ``` ### Expected Behavior `PluginDescriptor` should either: 1. Be made truly immutable (`model_config = ConfigDict(frozen=True)`) and `get_plugin()`/`list_plugins()` should return **copies** of the stored descriptors (via `descriptor.model_copy()`), OR 2. The docstring should be corrected to say "mutable" and callers should be warned not to cache the returned descriptor. Option 1 is strongly preferred — it matches the documented intent, prevents external mutation, and makes the thread-safety model correct. ### Actual Behavior `get_plugin()` and `list_plugins()` return references to the live internal `PluginDescriptor` objects that `activate_plugin()` and `deactivate_plugin()` mutate directly. External code can mutate the state without holding any lock, silently corrupting the plugin manager's internal state tracking. ### Suggested Fix In `types.py`: ```python # Change: model_config = ConfigDict(frozen=False) # To: model_config = ConfigDict(frozen=True) ``` In `manager.py`, use `descriptor.state` via direct updates before the freeze (during registration/activation within the lock), and return copies to callers: ```python def get_plugin(self, name: str) -> PluginDescriptor: with self._lock: if name not in self._plugins: raise PluginNotFoundError(...) return self._plugins[name].model_copy() # return a snapshot def list_plugins(self) -> list[PluginDescriptor]: with self._lock: return [d.model_copy() for d in self._plugins.values()] # snapshots ``` Internal state updates would need to replace the descriptor in `self._plugins` rather than mutating in place (or keep `frozen=False` internally with a private field convention). ### Category concurrency ### TDD Note After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: `@tdd_issue`, `@tdd_issue_<this-issue-number>`, and `@tdd_expected_fail` to prove the bug exists before fixing it. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: bug-hunter
HAL9000 added this to the v3.2.0 milestone 2026-04-10 00:10:26 +00:00
Author
Owner

Verified — Concurrency bug: PluginManager.get_plugin() returns mutable reference — callers can mutate state outside lock. MoSCoW: Should-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: PluginManager.get_plugin() returns mutable reference — callers can mutate state outside lock. MoSCoW: Should-have. Priority: High. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#6717
No description provided.