TDD: Non-thread-safe singleton in get_provider_registry() allows duplicate registry instances under concurrent access #10409

Open
opened 2026-04-18 09:34:57 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: test(providers): add failing BDD scenario for get_provider_registry() thread-safety race condition
  • Branch name: tdd/mN-registry-thread-safety

Background and Context

The function get_provider_registry() in cleveragents.providers.registry (lines 753–765 of src/cleveragents/providers/registry.py) implements a singleton pattern without thread-safety guards. Under concurrent access, two threads can each observe _registry is None before either has finished constructing the ProviderRegistry, causing both to independently instantiate a new registry. This violates the singleton contract and can lead to inconsistent provider state across the application.
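For illustration, the check-then-act gap described above typically looks like the sketch below. This is not the code at lines 753–765, which is not reproduced in this issue. Only the names get_provider_registry, _registry, and ProviderRegistry come from the issue; the class body, the construction counter, and the sleep are stand-ins assumed for the example.

```python
import threading
import time
from typing import Optional


class ProviderRegistry:
    """Hypothetical stand-in for the real registry; it only counts constructions."""

    constructed = 0
    _count_lock = threading.Lock()

    def __init__(self) -> None:
        # Assumed construction cost. While one thread is in here, _registry is
        # still None, so a second caller can pass the check below as well.
        time.sleep(0.01)
        with ProviderRegistry._count_lock:
            ProviderRegistry.constructed += 1


_registry: Optional[ProviderRegistry] = None


def get_provider_registry() -> ProviderRegistry:
    """Lazy singleton with no lock around the check and the assignment."""
    global _registry
    if _registry is None:               # two threads can both observe None here
        _registry = ProviderRegistry()  # and each then builds its own instance
    return _registry
```

Called from a single thread, this behaves as a singleton; the contract only breaks when a second caller reaches the None check before the first caller has assigned _registry.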

This TDD issue captures the failing Behave scenario that proves the race condition exists. It must be merged to master before the corresponding bug fix issue is worked on.

Summary

Write a failing Behave scenario that proves the race condition in cleveragents.providers.registry.get_provider_registry — two concurrent callers can each observe _registry is None and independently construct a ProviderRegistry, violating the singleton contract.

Scenario to write

@tdd_issue @tdd_issue_<BUG_ISSUE_NUMBER> @tdd_expected_fail
Scenario: Concurrent calls to get_provider_registry return the same instance
  Given the global provider registry has been reset
  When two threads call get_provider_registry() simultaneously
  Then both threads should receive the identical registry instance
  And only one ProviderRegistry should have been constructed

Note: Replace <BUG_ISSUE_NUMBER> with the actual bug issue number once it is created. The bug issue will depend on (be blocked by) this TDD issue.

Location

  • Module: cleveragents.providers.registry
  • Function: get_provider_registry (lines 753–765 in src/cleveragents/providers/registry.py)
  • Test file: features/providers/test_registry_thread_safety.feature

Expected Behavior

A failing Behave scenario exists under features/providers/ that:

  1. Resets the global _registry singleton to None
  2. Spawns two threads that simultaneously call get_provider_registry()
  3. Asserts both threads received the identical object (same id())
  4. Asserts only one ProviderRegistry was constructed
  5. Fails (assertion fires) when run against the current unfixed code
  6. Is tagged @tdd_expected_fail so CI inverts the result and passes
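Steps 1 to 4 above can be sketched as a small harness that Behave step definitions might delegate to. This is a minimal sketch under stated assumptions: ProviderRegistry and get_provider_registry here are stand-ins for the real module, and reset_registry, call_concurrently, and check_singleton are hypothetical helper names, not functions from the repository. The only library call relied on is threading.Barrier from the standard library.

```python
import threading
import time
from typing import List, Optional


class ProviderRegistry:
    """Hypothetical stand-in for the real registry; it only counts constructions."""

    constructed = 0
    _count_lock = threading.Lock()

    def __init__(self) -> None:
        time.sleep(0.05)  # assumed construction cost; widens the race window
        with ProviderRegistry._count_lock:
            ProviderRegistry.constructed += 1


_registry: Optional[ProviderRegistry] = None


def get_provider_registry() -> ProviderRegistry:
    """Unguarded lazy singleton, mirroring the pattern described in this issue."""
    global _registry
    if _registry is None:
        _registry = ProviderRegistry()
    return _registry


def reset_registry() -> None:
    """Step 1: reset the singleton and the construction counter."""
    global _registry
    _registry = None
    ProviderRegistry.constructed = 0


def call_concurrently(n_threads: int = 2) -> List[Optional[ProviderRegistry]]:
    """Step 2: release all threads at once and collect what each received."""
    barrier = threading.Barrier(n_threads)
    results: List[Optional[ProviderRegistry]] = [None] * n_threads

    def worker(index: int) -> None:
        barrier.wait()  # every thread blocks here until all have arrived
        results[index] = get_provider_registry()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def check_singleton(results: list, constructed: int) -> None:
    """Steps 3 and 4: plain assert statements, so failures are AssertionError."""
    first = results[0]
    assert all(r is first for r in results), (
        "Race condition detected: two different ProviderRegistry instances were created"
    )
    assert constructed == 1, f"expected 1 construction, got {constructed}"
```

A step definition could call reset_registry(), then call_concurrently(), then check_singleton(results, ProviderRegistry.constructed). Against an unguarded getter the check is expected to raise AssertionError, although a race is never strictly guaranteed to trigger on any single run; the barrier and the construction delay only make it very likely.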

Acceptance Criteria

  • [ ] Behave scenario written under features/providers/
  • [ ] Scenario tagged with @tdd_issue, @tdd_issue_<N>, and @tdd_expected_fail
  • [ ] Assertion uses AssertionError only (not ValueError, RuntimeError, etc.)
  • [ ] Scenario fails (assertion fires) when run against the current unfixed code
  • [ ] CI passes because @tdd_expected_fail inverts the result
  • [ ] TDD PR merged to master before bug fix work begins
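How the project's CI implements the @tdd_expected_fail inversion is not shown in this issue. As one plausible reading of the criteria above, the decision could be reduced to a small function like the one below. The name ci_passes and the rule that only an AssertionError counts as the expected failure are assumptions made for this sketch, chosen to match the "AssertionError only" criterion.

```python
from typing import Iterable, Optional

# Behave exposes tag names without the leading "@".
EXPECTED_FAIL_TAG = "tdd_expected_fail"


def ci_passes(tags: Iterable[str], error: Optional[BaseException]) -> bool:
    """Hypothetical inversion rule; not the project's actual CI hook.

    tags  -- tag names attached to the scenario
    error -- the exception raised by the scenario, or None if it ran clean
    """
    if EXPECTED_FAIL_TAG in set(tags):
        # Expected-fail scenarios pass only when the assertion really fired.
        # Any other exception type, or no failure at all, is reported as failing.
        return isinstance(error, AssertionError)
    # Ordinary scenarios pass only when nothing was raised.
    return error is None
```

Under this reading, a scenario that raised ValueError or RuntimeError would not be accepted as a valid expected failure, which would explain why the step definitions are asked to fail through assertions only.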

Subtasks

  • [x] Identify the exact lines in get_provider_registry() that are not thread-safe
  • [x] Write the Behave feature file at features/providers/test_registry_thread_safety.feature
  • [x] Implement step definitions for the scenario (Given/When/Then)
  • [x] Verify the scenario fails against the current unfixed code
  • [x] Add @tdd_expected_fail tag and confirm CI passes with the inverted result
  • [x] Open a PR targeting master on branch tdd/mN-registry-thread-safety
  • [ ] Update <BUG_ISSUE_NUMBER> tag once the bug issue is created

Definition of Done

  • [ ] Failing test merged to master from the tdd/mN-registry-thread-safety branch
  • [ ] This issue closed after the TDD PR is merged
  • [ ] Bug issue updated to depend on this TDD issue

Automated by CleverAgents Bot
Agent: new-issue-creator
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor [AUTO-BUG-8]

Author
Owner

Implementation Attempt — Tier 1: Haiku — Success

Implemented the TDD failing scenario for the get_provider_registry() thread-safety race condition.

What was done:

  • Created features/providers/test_registry_thread_safety.feature with a Behave scenario tagged @tdd_issue, @tdd_issue_10409, and @tdd_expected_fail
  • Created features/steps/registry_thread_safety_steps.py with step definitions that use threading.Barrier to reliably trigger the race condition
  • The scenario spawns two threads simultaneously and asserts both receive the same singleton instance
  • The scenario fails (assertion fires) against the current unfixed code — "Race condition detected: two different ProviderRegistry instances were created"
  • The @tdd_expected_fail tag inverts the result so CI reports it as passed

Quality gates:

  • lint ✓
  • typecheck ✓
  • unit_tests (targeted) ✓ — 1 scenario passed via @tdd_expected_fail inversion

PR: #10754 (https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/10754)


Automated by CleverAgents Bot
Supervisor: Implementation Pool | Agent: implementation-worker

Reference
cleveragents/cleveragents-core#10409