BUG-HUNT: [concurrency] VocabularyRegistry is not thread-safe #2549

Open
opened 2026-04-03 18:51:46 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/vocabulary-registry-thread-safety
  • Commit Message: fix(acms): add threading.Lock to VocabularyRegistry to ensure thread safety
  • Milestone: v3.7.0
  • Parent Epic: #396

Background and Context

The VocabularyRegistry class in src/cleveragents/acms/uko/vocabulary_registry.py is not thread-safe. The class docstring explicitly acknowledges that external synchronization is required if instances are shared across threads. This is an insufficient safeguard — the registry is a shared, mutable resource and must protect its own internal state.

Without internal locking, two or more threads that concurrently call register() or unregister() can produce race conditions that corrupt the registry's internal dictionaries (_by_prefix, _by_iri) and sorted prefix list (_sorted_prefixes). For example:

  1. Thread A calls register(vocab_A) and passes the duplicate-prefix check.
  2. Thread B calls register(vocab_B) with the same prefix and also passes the duplicate-prefix check (before Thread A has written to _by_prefix).
  3. Both threads write to _by_prefix, silently overwriting each other — no DuplicateVocabularyError is raised, and the registry is now in an inconsistent state.

This violates the project's concurrency safety requirements and can lead to silent data corruption in any multi-threaded context where a shared VocabularyRegistry instance is used.

Current Behavior (Bug)

# src/cleveragents/acms/uko/vocabulary_registry.py  lines 49-95
class VocabularyRegistry:
    """Registry of Layer 2 paradigm vocabularies.
    ...
    Mutation Safety
    ~~~~~~~~~~~~~~~
    `register()` and `unregister()` mutate internal state.  The
    registry does **not** make defensive copies of vocabularies on
    retrieval; callers receive the original frozen
    `ParadigmVocabulary` instances.
    ...
    """

    def __init__(
        self,
        vocabularies: tuple[ParadigmVocabulary, ...] = (),
    ) -> None:
        self._by_prefix: dict[str, ParadigmVocabulary] = {}
        self._by_iri: dict[str, ParadigmVocabulary] = {}
        self._sorted_prefixes: list[str] = []
        for vocab in vocabularies:
            self.register(vocab)

    def register(self, vocabulary: ParadigmVocabulary) -> None:
        # No lock — check-then-act is not atomic
        if vocabulary.prefix in self._by_prefix:
            raise DuplicateVocabularyError(...)
        if vocabulary.iri in self._by_iri:
            raise DuplicateVocabularyError(...)
        self._by_prefix[vocabulary.prefix] = vocabulary
        self._by_iri[vocabulary.iri] = vocabulary
        insort(self._sorted_prefixes, vocabulary.prefix)

Two threads calling register() simultaneously can both pass the duplicate checks before either has written, resulting in silent overwrites and an inconsistent registry state.

Expected Behavior

VocabularyRegistry must be internally thread-safe. All mutating methods (register(), unregister()) must acquire an internal threading.Lock before performing any check-then-act operations, ensuring atomicity. Read-only methods (lookup_by_prefix(), lookup_by_iri(), list_prefixes(), list_all()) should also acquire the lock to prevent torn reads during concurrent mutations.

Suggested Fix

Add a threading.Lock to __init__ and wrap all mutating (and read) methods:

import threading

def __init__(self, vocabularies: tuple[ParadigmVocabulary, ...] = ()) -> None:
    self._lock = threading.Lock()
    self._by_prefix: dict[str, ParadigmVocabulary] = {}
    self._by_iri: dict[str, ParadigmVocabulary] = {}
    self._sorted_prefixes: list[str] = []
    for vocab in vocabularies:
        self.register(vocab)

def register(self, vocabulary: ParadigmVocabulary) -> None:
    with self._lock:
        if vocabulary.prefix in self._by_prefix:
            raise DuplicateVocabularyError(...)
        if vocabulary.iri in self._by_iri:
            raise DuplicateVocabularyError(...)
        self._by_prefix[vocabulary.prefix] = vocabulary
        self._by_iri[vocabulary.iri] = vocabulary
        insort(self._sorted_prefixes, vocabulary.prefix)

Impact

  • Severity: High — silent data corruption in multi-threaded environments.
  • Likelihood: High — any multi-threaded agent runtime sharing a VocabularyRegistry instance is affected.
  • Affected file: src/cleveragents/acms/uko/vocabulary_registry.py
  • Affected class: VocabularyRegistry (lines 49–95)

Subtasks

  • Add self._lock = threading.Lock() to VocabularyRegistry.__init__
  • Wrap register() body in with self._lock:
  • Wrap unregister() body in with self._lock: (if method exists)
  • Wrap all read methods (lookup_by_prefix, lookup_by_iri, list_prefixes, list_all) in with self._lock: to prevent torn reads
  • Update class docstring to reflect internal thread safety (remove external-synchronization caveat)
  • Tests (Behave): Add concurrent register() scenario to prove no race condition
  • Tests (Robot): Add integration test for thread-safe registry usage
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • VocabularyRegistry acquires an internal threading.Lock in all mutating and read methods.
  • Concurrent calls to register() from multiple threads never produce silent overwrites or inconsistent state.
  • The class docstring no longer requires callers to provide external synchronization.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage ≥ 97%.

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/vocabulary-registry-thread-safety` - **Commit Message**: `fix(acms): add threading.Lock to VocabularyRegistry to ensure thread safety` - **Milestone**: v3.7.0 - **Parent Epic**: #396 ## Background and Context The `VocabularyRegistry` class in `src/cleveragents/acms/uko/vocabulary_registry.py` is not thread-safe. The class docstring explicitly acknowledges that external synchronization is required if instances are shared across threads. This is an insufficient safeguard — the registry is a shared, mutable resource and must protect its own internal state. Without internal locking, two or more threads that concurrently call `register()` or `unregister()` can produce race conditions that corrupt the registry's internal dictionaries (`_by_prefix`, `_by_iri`) and sorted prefix list (`_sorted_prefixes`). For example: 1. Thread A calls `register(vocab_A)` and passes the duplicate-prefix check. 2. Thread B calls `register(vocab_B)` with the same prefix and also passes the duplicate-prefix check (before Thread A has written to `_by_prefix`). 3. Both threads write to `_by_prefix`, silently overwriting each other — no `DuplicateVocabularyError` is raised, and the registry is now in an inconsistent state. This violates the project's concurrency safety requirements and can lead to silent data corruption in any multi-threaded context where a shared `VocabularyRegistry` instance is used. ## Current Behavior (Bug) ```python # src/cleveragents/acms/uko/vocabulary_registry.py lines 49-95 class VocabularyRegistry: """Registry of Layer 2 paradigm vocabularies. ... Mutation Safety ~~~~~~~~~~~~~~~ `register()` and `unregister()` mutate internal state. The registry does **not** make defensive copies of vocabularies on retrieval; callers receive the original frozen `ParadigmVocabulary` instances. ... """ def __init__( self, vocabularies: tuple[ParadigmVocabulary, ...] = (), ) -> None: self._by_prefix: dict[str, ParadigmVocabulary] = {} self._by_iri: dict[str, ParadigmVocabulary] = {} self._sorted_prefixes: list[str] = [] for vocab in vocabularies: self.register(vocab) def register(self, vocabulary: ParadigmVocabulary) -> None: # No lock — check-then-act is not atomic if vocabulary.prefix in self._by_prefix: raise DuplicateVocabularyError(...) if vocabulary.iri in self._by_iri: raise DuplicateVocabularyError(...) self._by_prefix[vocabulary.prefix] = vocabulary self._by_iri[vocabulary.iri] = vocabulary insort(self._sorted_prefixes, vocabulary.prefix) ``` Two threads calling `register()` simultaneously can both pass the duplicate checks before either has written, resulting in silent overwrites and an inconsistent registry state. ## Expected Behavior `VocabularyRegistry` must be internally thread-safe. All mutating methods (`register()`, `unregister()`) must acquire an internal `threading.Lock` before performing any check-then-act operations, ensuring atomicity. Read-only methods (`lookup_by_prefix()`, `lookup_by_iri()`, `list_prefixes()`, `list_all()`) should also acquire the lock to prevent torn reads during concurrent mutations. ## Suggested Fix Add a `threading.Lock` to `__init__` and wrap all mutating (and read) methods: ```python import threading def __init__(self, vocabularies: tuple[ParadigmVocabulary, ...] = ()) -> None: self._lock = threading.Lock() self._by_prefix: dict[str, ParadigmVocabulary] = {} self._by_iri: dict[str, ParadigmVocabulary] = {} self._sorted_prefixes: list[str] = [] for vocab in vocabularies: self.register(vocab) def register(self, vocabulary: ParadigmVocabulary) -> None: with self._lock: if vocabulary.prefix in self._by_prefix: raise DuplicateVocabularyError(...) if vocabulary.iri in self._by_iri: raise DuplicateVocabularyError(...) self._by_prefix[vocabulary.prefix] = vocabulary self._by_iri[vocabulary.iri] = vocabulary insort(self._sorted_prefixes, vocabulary.prefix) ``` ## Impact - **Severity**: High — silent data corruption in multi-threaded environments. - **Likelihood**: High — any multi-threaded agent runtime sharing a `VocabularyRegistry` instance is affected. - **Affected file**: `src/cleveragents/acms/uko/vocabulary_registry.py` - **Affected class**: `VocabularyRegistry` (lines 49–95) ## Subtasks - [ ] Add `self._lock = threading.Lock()` to `VocabularyRegistry.__init__` - [ ] Wrap `register()` body in `with self._lock:` - [ ] Wrap `unregister()` body in `with self._lock:` (if method exists) - [ ] Wrap all read methods (`lookup_by_prefix`, `lookup_by_iri`, `list_prefixes`, `list_all`) in `with self._lock:` to prevent torn reads - [ ] Update class docstring to reflect internal thread safety (remove external-synchronization caveat) - [ ] Tests (Behave): Add concurrent `register()` scenario to prove no race condition - [ ] Tests (Robot): Add integration test for thread-safe registry usage - [ ] Verify coverage ≥ 97% via `nox -s coverage_report` - [ ] Run `nox` (all default sessions), fix any errors ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - `VocabularyRegistry` acquires an internal `threading.Lock` in all mutating and read methods. - Concurrent calls to `register()` from multiple threads never produce silent overwrites or inconsistent state. - The class docstring no longer requires callers to provide external synchronization. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done. - All nox stages pass. - Coverage ≥ 97%. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: ca-new-issue-creator
freemo added this to the v3.7.0 milestone 2026-04-03 18:51:59 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • MoSCoW: Should Have — Spec compliance or quality improvement.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **MoSCoW**: Should Have — Spec compliance or quality improvement. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#396 Epic: ACMS Context Pipeline
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2549
No description provided.