BUG-HUNT: [concurrency] reference_parser._catalog_cache is a global dict with unsynchronized write — race condition in TUI concurrent input #7590

Open
opened 2026-04-10 22:55:58 +00:00 by HAL9000 · 4 comments
Owner

Bug Report: [concurrency] — _catalog_cache Global Dict Is Not Thread-Safe

Severity Assessment

  • Impact: _catalog_cache is a module-level global dict that is written to by _catalog() without any locking. If two TUI input threads call _catalog() concurrently while the cache is stale/invalid, both will walk the filesystem and then write to the global dict simultaneously. This can cause partial dict updates visible to other threads, or a RuntimeError if the dict is iterated while being modified.
  • Likelihood: Low — occurs when multiple input events trigger reference parsing concurrently, e.g. in a multi-tab TUI session.
  • Priority: Low

Location

  • File: src/cleveragents/tui/input/reference_parser.py
  • Function/Class: _catalog
  • Lines: 16, 40-42, 89-91

Description

The catalog cache is a plain module-level dict with no locking:

# reference_parser.py line 16
_catalog_cache: dict[str, object] = {"cwd": None, "created_at": 0.0, "catalog": None}  # NO LOCK

The _catalog() function reads and writes to this dict:

def _catalog() -> dict[str, list[str]]:
    # Read cache (no lock)
    cached_cwd = _catalog_cache.get("cwd")
    cached_time = _catalog_cache.get("created_at")
    ...
    # Stale or miss: walk filesystem, then write cache (no lock)
    _catalog_cache["cwd"] = cwd
    _catalog_cache["created_at"] = now
    _catalog_cache["catalog"] = catalog

Two threads that both detect a cache miss will both walk the filesystem and then write their results independently. The final state of _catalog_cache is a mix of the two writes.

Expected Behavior

The cache update should be protected by a threading.Lock to prevent concurrent writes.

Actual Behavior

Concurrent cache updates are unsynchronized, allowing partial writes and potential data corruption in the shared cache.

Suggested Fix

_catalog_lock = threading.Lock()

def _catalog() -> dict[str, list[str]]:
    with _catalog_lock:
        # Read and write cache atomically
        ...

Category

concurrency

TDD Note

After this bug is verified, a Type/Testing issue will be created with @tdd_expected_fail tags.


Automated by CleverAgents Bot
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor

## Bug Report: [concurrency] — _catalog_cache Global Dict Is Not Thread-Safe ### Severity Assessment - **Impact**: `_catalog_cache` is a module-level global `dict` that is written to by `_catalog()` without any locking. If two TUI input threads call `_catalog()` concurrently while the cache is stale/invalid, both will walk the filesystem and then write to the global dict simultaneously. This can cause partial dict updates visible to other threads, or a `RuntimeError` if the dict is iterated while being modified. - **Likelihood**: Low — occurs when multiple input events trigger reference parsing concurrently, e.g. in a multi-tab TUI session. - **Priority**: Low ### Location - **File**: src/cleveragents/tui/input/reference_parser.py - **Function/Class**: _catalog - **Lines**: 16, 40-42, 89-91 ### Description The catalog cache is a plain module-level dict with no locking: ```python # reference_parser.py line 16 _catalog_cache: dict[str, object] = {"cwd": None, "created_at": 0.0, "catalog": None} # NO LOCK ``` The `_catalog()` function reads and writes to this dict: ```python def _catalog() -> dict[str, list[str]]: # Read cache (no lock) cached_cwd = _catalog_cache.get("cwd") cached_time = _catalog_cache.get("created_at") ... # Stale or miss: walk filesystem, then write cache (no lock) _catalog_cache["cwd"] = cwd _catalog_cache["created_at"] = now _catalog_cache["catalog"] = catalog ``` Two threads that both detect a cache miss will both walk the filesystem and then write their results independently. The final state of `_catalog_cache` is a mix of the two writes. ### Expected Behavior The cache update should be protected by a threading.Lock to prevent concurrent writes. ### Actual Behavior Concurrent cache updates are unsynchronized, allowing partial writes and potential data corruption in the shared cache. ### Suggested Fix ```python _catalog_lock = threading.Lock() def _catalog() -> dict[str, list[str]]: with _catalog_lock: # Read and write cache atomically ... ``` ### Category concurrency ### TDD Note After this bug is verified, a Type/Testing issue will be created with @tdd_expected_fail tags. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor
HAL9000 added this to the v3.5.0 milestone 2026-04-10 23:48:25 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Backlog — reference_parser._catalog_cache global dict has unprotected concurrent access
  • Milestone: v3.5.0 (M6: Autonomy Hardening) — Reference parsing is part of context management
  • Story Points: 2 (S) — Thread safety fix
  • MoSCoW: Could Have — Minor concurrency issue, not blocking

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Issue triaged by project owner: - **State**: Verified - **Priority**: Backlog — reference_parser._catalog_cache global dict has unprotected concurrent access - **Milestone**: v3.5.0 (M6: Autonomy Hardening) — Reference parsing is part of context management - **Story Points**: 2 (S) — Thread safety fix - **MoSCoW**: Could Have — Minor concurrency issue, not blocking --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Implementation Attempt Starting — Tier 1: haiku — [AUTO-IMP-ISSUE-7590]

Beginning implementation of the thread-safety fix for _catalog_cache in reference_parser.py.

Plan:

  1. Add threading.Lock to protect _catalog_cache concurrent access
  2. Wrap all read/write operations in _catalog() with the lock
  3. Add BDD tests to verify thread-safe behavior
  4. Run all quality gates (lint, typecheck, unit tests, integration tests)
  5. Create PR closing this issue

Escalation Tier: Tier 1 (haiku)


Automated by CleverAgents Bot
Supervisor: Implementation Pool | Agent: implementation-pool-supervisor

**Implementation Attempt Starting** — Tier 1: haiku — [AUTO-IMP-ISSUE-7590] Beginning implementation of the thread-safety fix for `_catalog_cache` in `reference_parser.py`. **Plan:** 1. Add `threading.Lock` to protect `_catalog_cache` concurrent access 2. Wrap all read/write operations in `_catalog()` with the lock 3. Add BDD tests to verify thread-safe behavior 4. Run all quality gates (lint, typecheck, unit tests, integration tests) 5. Create PR closing this issue **Escalation Tier**: Tier 1 (haiku) --- **Automated by CleverAgents Bot** Supervisor: Implementation Pool | Agent: implementation-pool-supervisor
Author
Owner

This issue is blocked by (and will be closed by) PR #8269: fix(tui): fix thread-safety race in reference_parser catalog cache

#8269


Automated by CleverAgents Bot
Agent: pr-creator

This issue is blocked by (and will be closed by) PR #8269: fix(tui): fix thread-safety race in reference_parser catalog cache https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/8269 --- **Automated by CleverAgents Bot** Agent: pr-creator
Author
Owner

Implementation Attempt — Tier 1: haiku — Success [AUTO-IMP-ISSUE-7590]

What Was Done

Fixed the thread-safety race condition in src/cleveragents/tui/input/reference_parser.py by adding a threading.Lock to protect all access to _catalog_cache.

Changes Made

  1. src/cleveragents/tui/input/reference_parser.py:

    • Added import threading
    • Added _catalog_lock = threading.Lock() at module level
    • Wrapped the entire body of _catalog() in with _catalog_lock: so all cache reads and writes are atomic
  2. features/tdd_reference_parser_catalog_lock.feature: New BDD feature with 4 scenarios:

    • _catalog_lock is a threading.Lock instance
    • Concurrent calls to _catalog() produce consistent results (10 threads)
    • _catalog_cache is fully populated after concurrent access (5 threads)
    • Lock prevents interleaved writes under simulated contention (4 threads with slow walk)
  3. features/steps/tdd_reference_parser_catalog_lock_steps.py: Full step definitions

  4. CHANGELOG.md: Added fix entry under [Unreleased] > Fixed

Quality Gates

All quality gates passed:

  • nox -e lint — no issues
  • nox -e typecheck — no errors (3 pre-existing warnings unrelated to this change)
  • nox -e unit_tests631 features passed, 0 failed (4 new scenarios added)

PR

PR #8269 has been created: #8269


Automated by CleverAgents Bot
Supervisor: Implementation Pool | Agent: implementation-pool-supervisor

**Implementation Attempt** — Tier 1: haiku — Success [AUTO-IMP-ISSUE-7590] ## What Was Done Fixed the thread-safety race condition in `src/cleveragents/tui/input/reference_parser.py` by adding a `threading.Lock` to protect all access to `_catalog_cache`. ### Changes Made 1. **`src/cleveragents/tui/input/reference_parser.py`**: - Added `import threading` - Added `_catalog_lock = threading.Lock()` at module level - Wrapped the entire body of `_catalog()` in `with _catalog_lock:` so all cache reads and writes are atomic 2. **`features/tdd_reference_parser_catalog_lock.feature`**: New BDD feature with 4 scenarios: - `_catalog_lock is a threading.Lock instance` - `Concurrent calls to _catalog() produce consistent results` (10 threads) - `_catalog_cache is fully populated after concurrent access` (5 threads) - `Lock prevents interleaved writes under simulated contention` (4 threads with slow walk) 3. **`features/steps/tdd_reference_parser_catalog_lock_steps.py`**: Full step definitions 4. **`CHANGELOG.md`**: Added fix entry under `[Unreleased] > Fixed` ### Quality Gates All quality gates passed: - ✅ `nox -e lint` — no issues - ✅ `nox -e typecheck` — no errors (3 pre-existing warnings unrelated to this change) - ✅ `nox -e unit_tests` — **631 features passed, 0 failed** (4 new scenarios added) ### PR PR #8269 has been created: https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/8269 --- **Automated by CleverAgents Bot** Supervisor: Implementation Pool | Agent: implementation-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core#7590
No description provided.