BUG-HUNT: [concurrency] Race condition in A2aLocalFacade singleton initialization #3450

Open
opened 2026-04-05 17:19:01 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/v3.4.0/a2a-facade-singleton-thread-safety
  • Commit Message: fix(a2a): protect A2aLocalFacade singleton init with double-checked lock
  • Milestone: v3.4.0
  • Parent Epic: #933

Bug Report: [concurrency] — Race condition in A2aLocalFacade singleton initialization

Severity Assessment

  • Impact: In a multi-threaded environment, two or more threads can each construct their own A2aLocalFacade, so the "singleton" is no longer unique. Callers may end up holding different instances, and any state kept on the facade can become inconsistent between them.
  • Likelihood: High in any multi-threaded use of the CLI or local-mode facade.
  • Priority: Critical

Location

  • File: src/cleveragents/a2a/cli_bootstrap.py
  • Function/Class: get_facade
  • Lines: 54-63

Description

The get_facade function in src/cleveragents/a2a/cli_bootstrap.py uses a module-level global, _facade_instance, to implement a lazily initialized singleton. The check for _facade_instance is None and the assignment that follows are separate steps and are not protected by a lock. A thread can be suspended after the check but before the assignment, so several threads can pass the None check at the same time, each call _build_facade(), and each create its own instance of the facade.

Evidence

_facade_instance: A2aLocalFacade | None = None

def get_facade() -> A2aLocalFacade:
    """Return the process-wide :class:`A2aLocalFacade` instance.

    The facade is lazily constructed on first call and cached for the
    lifetime of the process.
    """
    global _facade_instance
    if _facade_instance is None:
        _facade_instance = _build_facade()
    return _facade_instance

Expected Behavior

The get_facade function should be thread-safe, ensuring that only one instance of A2aLocalFacade is created per process, even when called concurrently from multiple threads.

Suggested Fix

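A threading.Lock should protect the critical section where the singleton instance is created, using the double-checked locking pattern. The outer check keeps the common, already-initialized path lock-free, and the check repeated inside the lock ensures that only the first thread to acquire it calls _build_facade():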

import threading

_facade_instance: A2aLocalFacade | None = None
_facade_lock = threading.Lock()

def get_facade() -> A2aLocalFacade:
    """Return the process-wide :class:`A2aLocalFacade` instance.

    The facade is lazily constructed on first call and cached for the
    lifetime of the process.
    """
    global _facade_instance
    if _facade_instance is None:
        with _facade_lock:
            if _facade_instance is None:
                _facade_instance = _build_facade()
    return _facade_instance

Category

concurrency

Subtasks

  • Add _facade_lock = threading.Lock() module-level variable to cli_bootstrap.py
  • Refactor get_facade to use double-checked locking pattern
  • Add import threading to cli_bootstrap.py if not already present
  • Write Behave unit test scenario: concurrent calls to get_facade return the same instance
  • Write Behave unit test scenario: _build_facade is called exactly once under concurrent load
  • Run nox -e typecheck and fix any type errors
  • Run nox -e lint and fix any lint errors
  • Run nox -e unit_tests and confirm all scenarios pass
  • Verify coverage >= 97% via nox -e coverage_report
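
The two test scenarios above could share a helper along the following lines. This is only a sketch of the logic a Behave step might call, written with the standard library so it runs on its own. The in-memory module fake_cli_bootstrap, the run_scenario helper, and the slow builder are hypothetical, and a real step would import the actual cli_bootstrap module instead. Two details matter in practice: the cached singleton has to be reset before each scenario, and _build_facade has to be patched on the module that get_facade looks it up in.

```python
import threading
import time
import types
from unittest import mock

# Hypothetical stand-in for cli_bootstrap, built in memory so the sketch is
# self-contained; a real test would import the actual module instead.
_SOURCE = '''
import threading

_facade_instance = None
_facade_lock = threading.Lock()

def _build_facade():
    raise RuntimeError("expected to be patched by the test")

def get_facade():
    global _facade_instance
    if _facade_instance is None:
        with _facade_lock:
            if _facade_instance is None:
                _facade_instance = _build_facade()
    return _facade_instance
'''
bootstrap = types.ModuleType("fake_cli_bootstrap")
exec(_SOURCE, bootstrap.__dict__)


def run_scenario(n_threads=16):
    """Call get_facade from n_threads at once with a patched builder."""
    bootstrap._facade_instance = None  # reset the singleton between scenarios
    sentinel = object()

    def slow_build():
        time.sleep(0.05)  # keep the builder busy while other threads arrive
        return sentinel

    barrier = threading.Barrier(n_threads)
    results = []
    results_lock = threading.Lock()

    def worker():
        barrier.wait()  # start all calls at roughly the same moment
        facade = bootstrap.get_facade()
        with results_lock:
            results.append(facade)

    with mock.patch.object(
        bootstrap, "_build_facade", side_effect=slow_build
    ) as fake_build:
        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    return fake_build.call_count, results, sentinel


call_count, results, sentinel = run_scenario()
```

A "same instance" scenario would then assert that every entry in results is the sentinel, and an "exactly once" scenario would assert that call_count equals 1.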

Definition of Done

  • All subtasks above are completed
  • A commit is created with the exact first line: fix(a2a): protect A2aLocalFacade singleton init with double-checked lock
  • The commit is pushed to branch fix/v3.4.0/a2a-facade-singleton-thread-safety
  • A pull request has been submitted, reviewed, and merged
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

freemo added this to the v3.4.0 milestone 2026-04-05 17:19:14 +00:00
freemo modified the milestone from v3.4.0 to v3.5.0 2026-04-05 17:25:15 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Critical (confirmed) — Race condition in singleton initialization can create multiple facade instances under concurrent access, leading to inconsistent state. The fix is a simple double-checked lock pattern.
  • Milestone: v3.5.0 (Autonomy Hardening — concurrent execution requires thread-safe facade initialization)
  • Story Points: 2 (S) — Add a threading.Lock and double-checked locking pattern. Trivial fix with clear suggested implementation.
  • MoSCoW: Should Have — Thread safety is important for v3.5.0's parallel execution goals, but the CLI is typically single-threaded so this is unlikely to manifest in current usage.
  • Parent Epic: #933 (A2A Protocol Compliance)

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

freemo removed this from the v3.5.0 milestone 2026-04-06 21:05:30 +00:00
Reference
cleveragents/cleveragents-core#3450