BUG-HUNT: [concurrency] Race condition in Settings singleton initialization #2559

Open
opened 2026-04-03 18:54:04 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/settings-singleton-thread-safety
  • Commit Message: fix(config): add threading.Lock to Settings singleton to prevent race condition
  • Milestone: v3.7.0
  • Parent Epic: #362

Background and Context

The Settings class in src/cleveragents/config/settings.py implements a singleton pattern via get_settings() and reset() class methods. Neither method is thread-safe. In a multi-threaded environment — especially during application startup — multiple threads can simultaneously observe _instance is None and each proceed to create a new Settings instance, violating the singleton guarantee and producing inconsistent configuration state across the application.

A similar race condition exists in the reset() method, where one thread may be reading _instance while another sets it to None, leading to torn reads and unpredictable behavior.

Current Behavior (Bug)

# src/cleveragents/config/settings.py  lines 678-695
@classmethod
def get_settings(cls: type[Settings]) -> Settings:
    """Return the cached singleton instance (creating it if needed)."""
    if cls._instance is None:          # ← check
        cls._instance = cls()          # ← act  (not atomic — race window here)
    return cls._instance

@classmethod
def reset(cls: type[Settings]) -> None:
    ...
    cls._instance = None               # ← unsynchronized write

Two or more threads calling get_settings() simultaneously when _instance is None can each pass the None check before any of them writes the new instance, resulting in multiple Settings objects being created. The last writer wins, and any thread that already received an earlier instance now holds a stale, orphaned singleton.

Expected Behavior

get_settings() and reset() must be thread-safe. Only one Settings instance must ever be created, regardless of how many threads call get_settings() concurrently. The reset() method must also acquire the lock so that it cannot race with get_settings().

Suggested Fix

Implement a thread-safe singleton using a threading.Lock class variable:

import threading

class Settings(BaseSettings):
    _instance: ClassVar[Settings | None] = None
    _lock: ClassVar[threading.Lock] = threading.Lock()

    @classmethod
    def get_settings(cls: type[Settings]) -> Settings:
        """Return the cached singleton instance (creating it if needed)."""
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:   # double-checked locking
                    cls._instance = cls()
        return cls._instance

    @classmethod
    def reset(cls: type[Settings]) -> None:
        with cls._lock:
            cls._instance = None

Impact

  • Severity: High — multiple Settings instances in a multi-threaded environment cause inconsistent configuration and unpredictable behavior.
  • Likelihood: High — any multi-threaded startup path (e.g., parallel worker initialization) is affected.
  • Affected file: src/cleveragents/config/settings.py
  • Affected class: Settings
  • Affected methods: get_settings (lines 678–682), reset (lines 685–695)

Subtasks

  • Add _lock: ClassVar[threading.Lock] = threading.Lock() to Settings
  • Refactor get_settings() to use double-checked locking with _lock
  • Refactor reset() to acquire _lock before clearing _instance
  • Update class docstring to document internal thread safety
  • Tests (Behave): Add concurrent get_settings() scenario to prove no race condition
  • Tests (Robot): Add integration test for thread-safe singleton access
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • Settings.get_settings() and Settings.reset() are internally thread-safe using threading.Lock.
  • Concurrent calls to get_settings() from multiple threads never produce more than one Settings instance.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage ≥ 97%.

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/settings-singleton-thread-safety` - **Commit Message**: `fix(config): add threading.Lock to Settings singleton to prevent race condition` - **Milestone**: v3.7.0 - **Parent Epic**: #362 ## Background and Context The `Settings` class in `src/cleveragents/config/settings.py` implements a singleton pattern via `get_settings()` and `reset()` class methods. Neither method is thread-safe. In a multi-threaded environment — especially during application startup — multiple threads can simultaneously observe `_instance is None` and each proceed to create a new `Settings` instance, violating the singleton guarantee and producing inconsistent configuration state across the application. A similar race condition exists in the `reset()` method, where one thread may be reading `_instance` while another sets it to `None`, leading to torn reads and unpredictable behavior. ## Current Behavior (Bug) ```python # src/cleveragents/config/settings.py lines 678-695 @classmethod def get_settings(cls: type[Settings]) -> Settings: """Return the cached singleton instance (creating it if needed).""" if cls._instance is None: # ← check cls._instance = cls() # ← act (not atomic — race window here) return cls._instance @classmethod def reset(cls: type[Settings]) -> None: ... cls._instance = None # ← unsynchronized write ``` Two or more threads calling `get_settings()` simultaneously when `_instance is None` can each pass the `None` check before any of them writes the new instance, resulting in multiple `Settings` objects being created. The last writer wins, and any thread that already received an earlier instance now holds a stale, orphaned singleton. ## Expected Behavior `get_settings()` and `reset()` must be thread-safe. Only one `Settings` instance must ever be created, regardless of how many threads call `get_settings()` concurrently. The `reset()` method must also acquire the lock so that it cannot race with `get_settings()`. ## Suggested Fix Implement a thread-safe singleton using a `threading.Lock` class variable: ```python import threading class Settings(BaseSettings): _instance: ClassVar[Settings | None] = None _lock: ClassVar[threading.Lock] = threading.Lock() @classmethod def get_settings(cls: type[Settings]) -> Settings: """Return the cached singleton instance (creating it if needed).""" if cls._instance is None: with cls._lock: if cls._instance is None: # double-checked locking cls._instance = cls() return cls._instance @classmethod def reset(cls: type[Settings]) -> None: with cls._lock: cls._instance = None ``` ## Impact - **Severity**: High — multiple `Settings` instances in a multi-threaded environment cause inconsistent configuration and unpredictable behavior. - **Likelihood**: High — any multi-threaded startup path (e.g., parallel worker initialization) is affected. - **Affected file**: `src/cleveragents/config/settings.py` - **Affected class**: `Settings` - **Affected methods**: `get_settings` (lines 678–682), `reset` (lines 685–695) ## Subtasks - [ ] Add `_lock: ClassVar[threading.Lock] = threading.Lock()` to `Settings` - [ ] Refactor `get_settings()` to use double-checked locking with `_lock` - [ ] Refactor `reset()` to acquire `_lock` before clearing `_instance` - [ ] Update class docstring to document internal thread safety - [ ] Tests (Behave): Add concurrent `get_settings()` scenario to prove no race condition - [ ] Tests (Robot): Add integration test for thread-safe singleton access - [ ] Verify coverage ≥ 97% via `nox -s coverage_report` - [ ] Run `nox` (all default sessions), fix any errors ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - `Settings.get_settings()` and `Settings.reset()` are internally thread-safe using `threading.Lock`. - Concurrent calls to `get_settings()` from multiple threads never produce more than one `Settings` instance. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done. - All nox stages pass. - Coverage ≥ 97%. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: ca-new-issue-creator
freemo added this to the v3.7.0 milestone 2026-04-03 18:54:14 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • MoSCoW: Should Have

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **MoSCoW**: Should Have --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#362 Epic: Security & Safety Hardening
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2559
No description provided.