BUG-HUNT: [concurrency] LockService.acquire() has TOCTOU race allowing duplicate lock acquisition #7330

Open
opened 2026-04-10 17:07:59 +00:00 by HAL9000 · 3 comments
Owner

Bug Report: Concurrency — TOCTOU Race in LockService.acquire()

Severity Assessment

  • Impact: Two concurrent callers acquiring the same resource lock can both observe no existing lock (SELECT returns None), both insert a new lock row, leading to duplicate lock entries or a unique constraint violation crash. Alternatively, if the DB constraint doesn't prevent it, two owners could both believe they hold an exclusive lock simultaneously.
  • Likelihood: Low-Medium — requires concurrent plan execution on the same resource
  • Priority: Medium

Location

  • File: src/cleveragents/application/services/lock_service.py
  • Function/Class: LockService.acquire()
  • Lines: 125–230

Description

acquire() implements an advisory lock with a SELECT → check → INSERT pattern. The SELECT (line 165–169) and INSERT (lines 209–216) are separate operations within the same transaction, but there is no database-level locking (no SELECT ... FOR UPDATE) between them.

Two concurrent callers requesting the same (resource_type, resource_id) can:

  1. Both run the SELECT and both get existing = None
  2. Both decide to insert a new lock row
  3. Both do session.add(lock) and session.commit()

If the database has a unique constraint on (resource_type, resource_id), one of the commits will fail with an IntegrityError. If no unique constraint exists, both inserts succeed and the resource is "locked" by two owners simultaneously.

Evidence

# lock_service.py lines 163-224
session = self._session_factory()
try:
    stmt = select(LockModel).where(...)
    existing: LockModel | None = session.execute(stmt).scalar_one_or_none()
    # ← Thread A and Thread B both return existing=None here

    if existing is not None:
        ...  # both skip this

    # No existing lock — create new
    lock = LockModel(owner_id=owner_id, ...)
    session.add(lock)      # ← Thread A inserts
    session.commit()       # ← Thread A commits
    # ← Thread B also inserts and tries to commit — race!

Expected Behavior

The check-and-insert should be atomic. Use SELECT ... FOR UPDATE (for PostgreSQL) or a SQLite-level serialization strategy to prevent the TOCTOU window.

Actual Behavior

Two concurrent acquires on the same resource can both succeed, potentially yielding a constraint error or dual ownership.

Suggested Fix

Use a database-level upsert with a unique constraint on (resource_type, resource_id), or use INSERT OR IGNORE + UPDATE in a single atomic SQL statement for SQLite. Alternatively, use SERIALIZABLE isolation level for the lock acquisition transaction.

Category

concurrency

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor

## Bug Report: Concurrency — TOCTOU Race in LockService.acquire() ### Severity Assessment - **Impact**: Two concurrent callers acquiring the same resource lock can both observe no existing lock (SELECT returns None), both insert a new lock row, leading to duplicate lock entries or a unique constraint violation crash. Alternatively, if the DB constraint doesn't prevent it, two owners could both believe they hold an exclusive lock simultaneously. - **Likelihood**: Low-Medium — requires concurrent plan execution on the same resource - **Priority**: Medium ### Location - **File**: `src/cleveragents/application/services/lock_service.py` - **Function/Class**: `LockService.acquire()` - **Lines**: 125–230 ### Description `acquire()` implements an advisory lock with a SELECT → check → INSERT pattern. The SELECT (line 165–169) and INSERT (lines 209–216) are separate operations within the same transaction, but there is no database-level locking (no `SELECT ... FOR UPDATE`) between them. Two concurrent callers requesting the same `(resource_type, resource_id)` can: 1. Both run the SELECT and both get `existing = None` 2. Both decide to insert a new lock row 3. Both do `session.add(lock)` and `session.commit()` If the database has a unique constraint on `(resource_type, resource_id)`, one of the commits will fail with an `IntegrityError`. If no unique constraint exists, both inserts succeed and the resource is "locked" by two owners simultaneously. ### Evidence ```python # lock_service.py lines 163-224 session = self._session_factory() try: stmt = select(LockModel).where(...) existing: LockModel | None = session.execute(stmt).scalar_one_or_none() # ← Thread A and Thread B both return existing=None here if existing is not None: ... # both skip this # No existing lock — create new lock = LockModel(owner_id=owner_id, ...) session.add(lock) # ← Thread A inserts session.commit() # ← Thread A commits # ← Thread B also inserts and tries to commit — race! ``` ### Expected Behavior The check-and-insert should be atomic. Use `SELECT ... FOR UPDATE` (for PostgreSQL) or a SQLite-level serialization strategy to prevent the TOCTOU window. ### Actual Behavior Two concurrent acquires on the same resource can both succeed, potentially yielding a constraint error or dual ownership. ### Suggested Fix Use a database-level upsert with a unique constraint on `(resource_type, resource_id)`, or use `INSERT OR IGNORE` + `UPDATE` in a single atomic SQL statement for SQLite. Alternatively, use `SERIALIZABLE` isolation level for the lock acquisition transaction. ### Category concurrency ### TDD Note After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: `@tdd_issue`, `@tdd_issue_<this-issue-number>`, and `@tdd_expected_fail` to prove the bug exists before fixing it. --- **Automated by CleverAgents Bot** Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor
Author
Owner

Verified — Concurrency bug: LockService.acquire() TOCTOU race allows duplicate lock acquisition. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: LockService.acquire() TOCTOU race allows duplicate lock acquisition. MoSCoW: Must-have. Priority: High. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Concurrency bug: LockService.acquire() TOCTOU race allows duplicate lock acquisition. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: LockService.acquire() TOCTOU race allows duplicate lock acquisition. MoSCoW: Must-have. Priority: High. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Author
Owner

Verified — Concurrency bug: LockService.acquire() TOCTOU race allows duplicate lock acquisition. MoSCoW: Must-have. Priority: High.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Concurrency bug: LockService.acquire() TOCTOU race allows duplicate lock acquisition. MoSCoW: Must-have. Priority: High. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#7330
No description provided.