BUG-HUNT: [concurrency] cli_bootstrap.get_facade() singleton is not thread-safe — multiple facade instances can be constructed #6387

Open
opened 2026-04-09 20:59:48 +00:00 by HAL9000 · 0 comments
Owner

Bug Report: Concurrency — get_facade() non-thread-safe singleton

Severity Assessment

  • Impact: Two threads calling get_facade() simultaneously can each independently call _build_facade(), constructing two separate A2aLocalFacade instances wired to the DI container. Only one is stored in _facade_instance; the other is silently discarded. If _build_facade() has side effects (e.g., container initialization, service wiring that registers callbacks), those effects may be doubled or partially applied.
  • Likelihood: Low for pure CLI usage (single-threaded), but if get_facade() is ever called from concurrent test workers or a future multi-threaded server context, this race is live.
  • Priority: Medium (backlog)

Location

  • File: src/cleveragents/a2a/cli_bootstrap.py
  • Function: get_facade()
  • Lines: ~49–57

Description

get_facade() uses the classic unguarded singleton pattern:

_facade_instance: A2aLocalFacade | None = None

def get_facade() -> A2aLocalFacade:
    global _facade_instance
    if _facade_instance is None:
        _facade_instance = _build_facade()
    return _facade_instance

There is no threading.Lock guarding the None check and the assignment. Two threads can both observe _facade_instance is None and both call _build_facade(). CPython's GIL does not protect the compound check-then-act sequence across multiple bytecode instructions.

Evidence

get_facade() — no lock on the check-then-set (cli_bootstrap.py ~lines 49–57):

_facade_instance: A2aLocalFacade | None = None

def get_facade() -> A2aLocalFacade:
    global _facade_instance
    if _facade_instance is None:          # ← Thread A and Thread B both see None
        _facade_instance = _build_facade()  # ← both call _build_facade()
    return _facade_instance

_build_facade() — wraps each service resolution in contextlib.suppress(Exception) (cli_bootstrap.py ~lines 28–45):

def _build_facade() -> A2aLocalFacade:
    from cleveragents.application.container import get_container
    container = get_container()
    services: dict[str, Any] = {}
    with contextlib.suppress(Exception):
        services["plan_lifecycle_service"] = container.plan_lifecycle_service()
    ...
    return A2aLocalFacade(services=services)

If get_container() is not itself thread-safe, calling it from two threads simultaneously adds another race on top of this one.

Expected Behavior

get_facade() should guarantee that exactly one A2aLocalFacade instance is created regardless of concurrent callers, and that it is fully wired before being returned to any caller.

Actual Behavior

Two concurrent callers can each receive different A2aLocalFacade instances (briefly), and _build_facade() can be called twice, potentially causing duplicate side effects during container/service initialization.

Suggested Fix

Use a module-level threading.Lock with double-checked locking:

import threading
_facade_lock = threading.Lock()
_facade_instance: A2aLocalFacade | None = None

def get_facade() -> A2aLocalFacade:
    global _facade_instance
    if _facade_instance is None:
        with _facade_lock:
            if _facade_instance is None:
                _facade_instance = _build_facade()
    return _facade_instance

Category

concurrency

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

## Bug Report: Concurrency — `get_facade()` non-thread-safe singleton ### Severity Assessment - **Impact**: Two threads calling `get_facade()` simultaneously can each independently call `_build_facade()`, constructing two separate `A2aLocalFacade` instances wired to the DI container. Only one is stored in `_facade_instance`; the other is silently discarded. If `_build_facade()` has side effects (e.g., container initialization, service wiring that registers callbacks), those effects may be doubled or partially applied. - **Likelihood**: Low for pure CLI usage (single-threaded), but if `get_facade()` is ever called from concurrent test workers or a future multi-threaded server context, this race is live. - **Priority**: Medium (backlog) ### Location - **File**: `src/cleveragents/a2a/cli_bootstrap.py` - **Function**: `get_facade()` - **Lines**: ~49–57 ### Description `get_facade()` uses the classic unguarded singleton pattern: ```python _facade_instance: A2aLocalFacade | None = None def get_facade() -> A2aLocalFacade: global _facade_instance if _facade_instance is None: _facade_instance = _build_facade() return _facade_instance ``` There is no `threading.Lock` guarding the `None` check and the assignment. Two threads can both observe `_facade_instance is None` and both call `_build_facade()`. CPython's GIL does not protect the compound check-then-act sequence across multiple bytecode instructions. ### Evidence **`get_facade()` — no lock on the check-then-set** (cli_bootstrap.py ~lines 49–57): ```python _facade_instance: A2aLocalFacade | None = None def get_facade() -> A2aLocalFacade: global _facade_instance if _facade_instance is None: # ← Thread A and Thread B both see None _facade_instance = _build_facade() # ← both call _build_facade() return _facade_instance ``` **`_build_facade()` — wraps each service resolution in `contextlib.suppress(Exception)`** (cli_bootstrap.py ~lines 28–45): ```python def _build_facade() -> A2aLocalFacade: from cleveragents.application.container import get_container container = get_container() services: dict[str, Any] = {} with contextlib.suppress(Exception): services["plan_lifecycle_service"] = container.plan_lifecycle_service() ... return A2aLocalFacade(services=services) ``` If `get_container()` is not itself thread-safe, calling it from two threads simultaneously adds another race on top of this one. ### Expected Behavior `get_facade()` should guarantee that exactly one `A2aLocalFacade` instance is created regardless of concurrent callers, and that it is fully wired before being returned to any caller. ### Actual Behavior Two concurrent callers can each receive different `A2aLocalFacade` instances (briefly), and `_build_facade()` can be called twice, potentially causing duplicate side effects during container/service initialization. ### Suggested Fix Use a module-level `threading.Lock` with double-checked locking: ```python import threading _facade_lock = threading.Lock() _facade_instance: A2aLocalFacade | None = None def get_facade() -> A2aLocalFacade: global _facade_instance if _facade_instance is None: with _facade_lock: if _facade_instance is None: _facade_instance = _build_facade() return _facade_instance ``` ### Category `concurrency` ### TDD Note After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: `@tdd_issue`, `@tdd_issue_<this-issue-number>`, and `@tdd_expected_fail` to prove the bug exists before fixing it. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: bug-hunter
HAL9000 added this to the v3.2.0 milestone 2026-04-09 21:09:08 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#6387
No description provided.