Concurrency Bug: Unsafe concurrent access to _sqlite_checkpoints in DatabaseResourceHandler #8110

Open
opened 2026-04-13 03:34:21 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: fix(database): add threading.Lock to protect _sqlite_checkpoints in DatabaseResourceHandler
  • Branch name: fix/database-resource-handler-sqlite-checkpoints-thread-safety
  • Module: src/cleveragents/resource/handlers/database.py
  • Class: DatabaseResourceHandler
  • Field: self._sqlite_checkpoints
  • Lines: 482, 889, 944, 958

Background and Context

The DatabaseResourceHandler uses an instance-level dictionary, self._sqlite_checkpoints, to manage open SQLite connections for its SAVEPOINT-based checkpointing feature. The resolve_handler function in resolver.py caches handler instances for performance. Because the application is multi-threaded, the same cached DatabaseResourceHandler instance can be used concurrently by multiple threads processing different plans.

The _sqlite_checkpoints dictionary is read from and written to in the create_checkpoint and rollback_to methods without any locking. If two threads call create_checkpoint on the same handler instance at the same time, a race condition can occur, potentially causing one of the checkpoints to be lost. A similar race can occur between a read in rollback_to and a write in create_checkpoint.
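
The issue does not show the body of create_checkpoint, so the following is only a minimal sketch of how a checkpoint could be lost, assuming the method follows a check-then-write pattern on the shared dictionary. The names checkpoints, opened, and create_checkpoint_unlocked are hypothetical stand-ins, and a threading.Barrier is used purely to force the unlucky interleaving so the outcome is reproducible. Single dictionary operations are generally atomic in CPython; the risk illustrated here lies in the compound sequence.

```python
import threading

checkpoints = {}   # hypothetical stand-in for self._sqlite_checkpoints
opened = []        # every "connection" that was opened, for inspection
both_checked = threading.Barrier(2, timeout=5)


def create_checkpoint_unlocked(name):
    """Assumed check-then-write shape with no lock around it."""
    if name not in checkpoints:      # step 1: check
        both_checked.wait()          # force both threads past the check
        conn = object()              # stand-in for an open SQLite connection
        opened.append(conn)
        checkpoints[name] = conn     # step 2: write (last writer wins)


threads = [
    threading.Thread(target=create_checkpoint_unlocked, args=("cp",))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Two connections were opened, but only one is still reachable via the dict.
lost = [conn for conn in opened if conn is not checkpoints["cp"]]
```

In this sketch the overwritten connection is no longer tracked, which is the kind of lost checkpoint the description refers to.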

Expected Behavior

All access to the shared self._sqlite_checkpoints dictionary should be protected by a thread lock to ensure atomic updates and prevent race conditions. Concurrent calls to create_checkpoint and rollback_to must not interfere with each other.

Acceptance Criteria

  • A threading.Lock instance attribute is created in DatabaseResourceHandler.__init__.
  • The lock is acquired before any read or write access to self._sqlite_checkpoints in create_checkpoint_sqlite and _rollback_sqlite.
  • The lock is released after the access is complete (use a with statement so it is released even if an exception is raised).
  • Concurrent calls to create_checkpoint and rollback_to do not cause race conditions.
  • All existing tests continue to pass.
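
The criteria above could be met along the following lines. This is a sketch under stated assumptions, not the actual handler: HandlerSketch, the attribute name _sqlite_checkpoints_lock, the fixed savepoint name cp, and the choice to remove the entry on rollback are all hypothetical. It also assumes that opening the connection outside the lock is acceptable, so that slow I/O does not block other threads.

```python
import sqlite3
import threading


class HandlerSketch:
    """Hypothetical stand-in for DatabaseResourceHandler, not the real class."""

    def __init__(self):
        self._sqlite_checkpoints = {}
        # Assumed attribute name; the issue only asks for a threading.Lock.
        self._sqlite_checkpoints_lock = threading.Lock()

    def create_checkpoint(self, checkpoint_id, db_path=":memory:"):
        # Open the connection before taking the lock to keep the critical
        # section short. isolation_level=None disables implicit transactions.
        conn = sqlite3.connect(
            db_path, isolation_level=None, check_same_thread=False
        )
        conn.execute("SAVEPOINT cp")
        with self._sqlite_checkpoints_lock:
            duplicate = checkpoint_id in self._sqlite_checkpoints
            if not duplicate:
                self._sqlite_checkpoints[checkpoint_id] = conn
        if duplicate:
            conn.close()
            raise ValueError(f"checkpoint {checkpoint_id!r} already exists")
        return conn

    def rollback_to(self, checkpoint_id):
        # Remove the entry atomically so only one thread can roll it back.
        with self._sqlite_checkpoints_lock:
            conn = self._sqlite_checkpoints.pop(checkpoint_id, None)
        if conn is None:
            return False
        conn.execute("ROLLBACK TO cp")
        conn.execute("RELEASE cp")
        conn.close()
        return True


handler = HandlerSketch()
connection = handler.create_checkpoint("plan-1")
connection.execute("CREATE TABLE t (x INTEGER)")
rolled_back = handler.rollback_to("plan-1")
```

Only the dictionary access sits inside the with block here; whether the SQL statements should also run under the lock depends on how the real handler shares connections between threads, which the issue does not specify.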

Subtasks

  • Add a threading.Lock instance variable to DatabaseResourceHandler.__init__ (line ~482).
  • Wrap all accesses to self._sqlite_checkpoints in create_checkpoint_sqlite (line ~889) with the lock using a with statement.
  • Wrap all accesses to self._sqlite_checkpoints in _rollback_sqlite (lines ~944, ~958) with the lock using a with statement.
  • Add a unit test to verify thread safety of concurrent create_checkpoint and rollback_to calls.
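
A unit test for the last subtask might look roughly like the sketch below. It runs against LockedCheckpoints, a hypothetical stand-in, because the real handler's constructor and fixtures are not shown in this issue; run_concurrent_checkpoints and the worker counts are likewise illustrative. A test like this can exercise contention but cannot prove the absence of races.

```python
import threading


class LockedCheckpoints:
    """Hypothetical stand-in exposing the two methods under test."""

    def __init__(self):
        self._sqlite_checkpoints = {}
        self._lock = threading.Lock()

    def create_checkpoint(self, checkpoint_id):
        with self._lock:
            if checkpoint_id in self._sqlite_checkpoints:
                return False
            self._sqlite_checkpoints[checkpoint_id] = object()
            return True

    def rollback_to(self, checkpoint_id):
        with self._lock:
            return self._sqlite_checkpoints.pop(checkpoint_id, None) is not None


def run_concurrent_checkpoints(workers=16, per_worker=200):
    handler = LockedCheckpoints()
    start = threading.Barrier(workers)   # release all workers together
    ok_counts = [0] * workers
    shared_wins = [False] * workers

    def worker(idx):
        start.wait(timeout=10)
        # Every worker competes for the same id; exactly one should win.
        shared_wins[idx] = handler.create_checkpoint("shared")
        for i in range(per_worker):
            cp = f"{idx}-{i}"
            if handler.create_checkpoint(cp) and handler.rollback_to(cp):
                ok_counts[idx] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ok_counts, sum(shared_wins), dict(handler._sqlite_checkpoints)


def test_concurrent_create_and_rollback():
    ok_counts, shared_winners, leftover = run_concurrent_checkpoints()
    assert ok_counts == [200] * 16      # no create/rollback pair was lost
    assert shared_winners == 1          # the contested id was created once
    assert list(leftover) == ["shared"]
```

The shared id checks that a contested create succeeds exactly once, while the per-worker ids check that no create/rollback pair goes missing while other threads mutate the same dictionary.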

Definition of Done

  • All subtasks are complete and reviewed.
  • No race conditions exist on self._sqlite_checkpoints under concurrent access.
  • The fix is merged into the master branch.
  • Test coverage remains at or above the project threshold (≥ 97%).

Automated by CleverAgents Bot
Agent: new-issue-creator

HAL9000 added this to the v3.4.0 milestone 2026-04-13 03:34:32 +00:00
Author
Owner

Verified: unsafe concurrent access to SQLite checkpoints is a data integrity risk, and it is especially critical for v3.5.0's parallel execution requirements (10+ concurrent subplans). This is a Must Have fix for v3.4.0.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#8110