UAT: LockService compares lock expiry using ISO string comparison instead of datetime objects — incorrect ordering for timezone-aware timestamps #3951

Open
opened 2026-04-06 07:48:13 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/concurrency/lock-service-datetime-comparison
  • Commit Message: fix(concurrency): use datetime comparison instead of ISO string comparison in LockService
  • Milestone: None — backlog
  • Parent Epic: TBD — needs manual linking

Backlog note: This issue was discovered during autonomous UAT operation
on an active milestone. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.


Background and Context

The LockService provides advisory locking for plan and project resources as part of the concurrency controls subsystem. Lock expiry must be correctly determined: a lock is expired when now >= expires_at. This comparison must be semantically correct for all valid timestamp representations.

Currently, LockService stores timestamps as ISO 8601 strings and compares them using Python string comparison, which is fragile and can produce incorrect results when timezone-aware and naive timestamps are mixed.


Current Behavior

LockService stores timestamps as ISO 8601 strings in the database and compares them using Python string comparison:

# src/cleveragents/application/services/lock_service.py

# In acquire():
now_iso = now.isoformat()
expires_iso = expires.isoformat()
...
existing_expires = str(existing.expires_at)
if existing_expires >= now_iso:  # STRING COMPARISON
    raise LockConflictError(...)

# In renew():
now_iso = now.isoformat()
if str(lock.expires_at) < now_iso:  # STRING COMPARISON
    raise LockExpiredError(...)

# In cleanup_expired():
now_iso = datetime.now(tz=UTC).isoformat()
stmt = delete(LockModel).where(LockModel.expires_at < now_iso)  # SQL string comparison

# In count_stale_locks():
now_iso = datetime.now(tz=UTC).isoformat()
stmt = select(...).where(LockModel.expires_at < now_iso)  # SQL string comparison

# In is_locked():
now_iso = datetime.now(tz=UTC).isoformat()
stmt = select(...).where(LockModel.expires_at >= now_iso)  # SQL string comparison

The problem with ISO string comparison:

ISO 8601 strings are only lexicographically sortable when they have the same format and timezone representation. The datetime.isoformat() method in Python produces different formats depending on the timezone:

  • UTC-aware datetime: "2026-04-06T12:00:00+00:00"
  • Naive datetime: "2026-04-06T12:00:00"

Comparing "2026-04-06T12:00:00+00:00" with "2026-04-06T12:00:00" using string comparison gives incorrect results because "+" (ASCII 43) sorts before digits (ASCII 48–57), so "2026-04-06T12:00:00+00:00" < "2026-04-06T12:00:00" evaluates to True — meaning a UTC-aware timestamp would be considered earlier than a naive timestamp of the same moment.

Additionally, the SQL-level string comparison (LockModel.expires_at < now_iso) is database-dependent. SQLite stores everything as text and performs lexicographic comparison, which has the same issues.

Concrete failure scenario:

  1. Lock acquired with expires_at = datetime.now(UTC).isoformat()"2026-04-06T12:05:00+00:00"
  2. Later, now_iso = datetime.now(UTC).isoformat()"2026-04-06T12:03:00+00:00" (2 minutes earlier)
  3. String comparison: "2026-04-06T12:05:00+00:00" >= "2026-04-06T12:03:00+00:00"True ✓ (correct here)
  4. But if the database stored the timestamp without timezone info: "2026-04-06T12:05:00" vs "2026-04-06T12:03:00+00:00" → fragile and format-dependent

The real risk is when the format is inconsistent across different code paths or database migrations, breaking mutual exclusion guarantees.


Expected Behavior

Lock expiry comparisons use proper datetime objects (or SQLAlchemy DateTime columns with timezone support) rather than ISO string comparison. The comparison now >= expires_at is always semantically correct regardless of timezone representation.


Acceptance Criteria

  • All in-Python timestamp comparisons in LockService use datetime objects, not ISO strings
  • All SQL-level timestamp comparisons use SQLAlchemy DateTime typed columns, not string literals
  • LockModel.expires_at column is declared as DateTime(timezone=True) in the SQLAlchemy model
  • Existing tests pass and new regression tests cover timezone-aware vs naive timestamp edge cases
  • No regressions in acquire(), renew(), cleanup_expired(), count_stale_locks(), or is_locked()

Subtasks

  • Audit src/cleveragents/application/services/lock_service.py — identify all ISO string comparison sites
  • Update LockModel.expires_at column to DateTime(timezone=True) if not already typed
  • Refactor acquire() to compare datetime objects: datetime.fromisoformat(str(existing.expires_at)) >= now
  • Refactor renew() to compare datetime objects: datetime.fromisoformat(str(lock.expires_at)) < now
  • Refactor cleanup_expired() SQL filter to use datetime parameter binding instead of ISO string
  • Refactor count_stale_locks() SQL filter to use datetime parameter binding instead of ISO string
  • Refactor is_locked() SQL filter to use datetime parameter binding instead of ISO string
  • Tests (pytest/Behave): Add regression scenarios for timezone-aware vs naive timestamp comparisons
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions) and fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional details about the implementation.
  • The commit is pushed to the remote on the branch fix/concurrency/lock-service-datetime-comparison.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage ≥ 97%.

Supporting Information

  • Feature Area: Async and Concurrency Patterns — Concurrency Controls / Lock Service
  • Severity: Medium — incorrect lock expiry comparison can cause locks to be treated as expired when they are not (or vice versa), breaking mutual exclusion guarantees
  • Code Location: src/cleveragents/application/services/lock_service.pyacquire(), renew(), cleanup_expired(), count_stale_locks(), is_locked() methods
  • Discovered by: UAT code-level analysis

Recommended Fix:

from datetime import datetime, UTC

# Instead of string comparison:
existing_expires_dt = datetime.fromisoformat(str(existing.expires_at))
if existing_expires_dt >= now:  # datetime comparison
    raise LockConflictError(...)

For SQL-level comparisons, use SQLAlchemy's DateTime type which handles proper comparison at the database level.


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/concurrency/lock-service-datetime-comparison` - **Commit Message**: `fix(concurrency): use datetime comparison instead of ISO string comparison in LockService` - **Milestone**: None — backlog - **Parent Epic**: TBD — needs manual linking > **Backlog note:** This issue was discovered during autonomous UAT operation > on an active milestone. It does not block milestone completion and has been > placed in the backlog for human review and future milestone assignment. --- ## Background and Context The `LockService` provides advisory locking for plan and project resources as part of the concurrency controls subsystem. Lock expiry must be correctly determined: a lock is expired when `now >= expires_at`. This comparison must be semantically correct for all valid timestamp representations. Currently, `LockService` stores timestamps as ISO 8601 strings and compares them using Python string comparison, which is fragile and can produce incorrect results when timezone-aware and naive timestamps are mixed. --- ## Current Behavior `LockService` stores timestamps as ISO 8601 strings in the database and compares them using Python string comparison: ```python # src/cleveragents/application/services/lock_service.py # In acquire(): now_iso = now.isoformat() expires_iso = expires.isoformat() ... existing_expires = str(existing.expires_at) if existing_expires >= now_iso: # STRING COMPARISON raise LockConflictError(...) # In renew(): now_iso = now.isoformat() if str(lock.expires_at) < now_iso: # STRING COMPARISON raise LockExpiredError(...) # In cleanup_expired(): now_iso = datetime.now(tz=UTC).isoformat() stmt = delete(LockModel).where(LockModel.expires_at < now_iso) # SQL string comparison # In count_stale_locks(): now_iso = datetime.now(tz=UTC).isoformat() stmt = select(...).where(LockModel.expires_at < now_iso) # SQL string comparison # In is_locked(): now_iso = datetime.now(tz=UTC).isoformat() stmt = select(...).where(LockModel.expires_at >= now_iso) # SQL string comparison ``` **The problem with ISO string comparison:** ISO 8601 strings are only lexicographically sortable when they have the same format and timezone representation. The `datetime.isoformat()` method in Python produces different formats depending on the timezone: - UTC-aware datetime: `"2026-04-06T12:00:00+00:00"` - Naive datetime: `"2026-04-06T12:00:00"` Comparing `"2026-04-06T12:00:00+00:00"` with `"2026-04-06T12:00:00"` using string comparison gives incorrect results because `"+"` (ASCII 43) sorts before digits (ASCII 48–57), so `"2026-04-06T12:00:00+00:00" < "2026-04-06T12:00:00"` evaluates to `True` — meaning a UTC-aware timestamp would be considered **earlier** than a naive timestamp of the same moment. Additionally, the SQL-level string comparison (`LockModel.expires_at < now_iso`) is database-dependent. SQLite stores everything as text and performs lexicographic comparison, which has the same issues. **Concrete failure scenario:** 1. Lock acquired with `expires_at = datetime.now(UTC).isoformat()` → `"2026-04-06T12:05:00+00:00"` 2. Later, `now_iso = datetime.now(UTC).isoformat()` → `"2026-04-06T12:03:00+00:00"` (2 minutes earlier) 3. String comparison: `"2026-04-06T12:05:00+00:00" >= "2026-04-06T12:03:00+00:00"` → `True` ✓ (correct here) 4. But if the database stored the timestamp without timezone info: `"2026-04-06T12:05:00"` vs `"2026-04-06T12:03:00+00:00"` → fragile and format-dependent The real risk is when the format is inconsistent across different code paths or database migrations, breaking mutual exclusion guarantees. --- ## Expected Behavior Lock expiry comparisons use proper `datetime` objects (or SQLAlchemy `DateTime` columns with timezone support) rather than ISO string comparison. The comparison `now >= expires_at` is always semantically correct regardless of timezone representation. --- ## Acceptance Criteria - [ ] All in-Python timestamp comparisons in `LockService` use `datetime` objects, not ISO strings - [ ] All SQL-level timestamp comparisons use SQLAlchemy `DateTime` typed columns, not string literals - [ ] `LockModel.expires_at` column is declared as `DateTime(timezone=True)` in the SQLAlchemy model - [ ] Existing tests pass and new regression tests cover timezone-aware vs naive timestamp edge cases - [ ] No regressions in `acquire()`, `renew()`, `cleanup_expired()`, `count_stale_locks()`, or `is_locked()` --- ## Subtasks - [ ] Audit `src/cleveragents/application/services/lock_service.py` — identify all ISO string comparison sites - [ ] Update `LockModel.expires_at` column to `DateTime(timezone=True)` if not already typed - [ ] Refactor `acquire()` to compare `datetime` objects: `datetime.fromisoformat(str(existing.expires_at)) >= now` - [ ] Refactor `renew()` to compare `datetime` objects: `datetime.fromisoformat(str(lock.expires_at)) < now` - [ ] Refactor `cleanup_expired()` SQL filter to use `datetime` parameter binding instead of ISO string - [ ] Refactor `count_stale_locks()` SQL filter to use `datetime` parameter binding instead of ISO string - [ ] Refactor `is_locked()` SQL filter to use `datetime` parameter binding instead of ISO string - [ ] Tests (pytest/Behave): Add regression scenarios for timezone-aware vs naive timestamp comparisons - [ ] Verify coverage ≥ 97% via `nox -s coverage_report` - [ ] Run `nox` (all default sessions) and fix any errors --- ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional details about the implementation. - The commit is pushed to the remote on the branch `fix/concurrency/lock-service-datetime-comparison`. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done. - All nox stages pass. - Coverage ≥ 97%. --- ## Supporting Information - **Feature Area:** Async and Concurrency Patterns — Concurrency Controls / Lock Service - **Severity:** Medium — incorrect lock expiry comparison can cause locks to be treated as expired when they are not (or vice versa), breaking mutual exclusion guarantees - **Code Location:** `src/cleveragents/application/services/lock_service.py` — `acquire()`, `renew()`, `cleanup_expired()`, `count_stale_locks()`, `is_locked()` methods - **Discovered by:** UAT code-level analysis **Recommended Fix:** ```python from datetime import datetime, UTC # Instead of string comparison: existing_expires_dt = datetime.fromisoformat(str(existing.expires_at)) if existing_expires_dt >= now: # datetime comparison raise LockConflictError(...) ``` For SQL-level comparisons, use SQLAlchemy's `DateTime` type which handles proper comparison at the database level. --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-new-issue-creator
Author
Owner

⚠️ Orphan Issue — Needs Manual Epic Linking

This issue was created during autonomous UAT operation and does not yet have a parent Epic assigned. Per project conventions, every issue must be linked to a parent Epic using Forgejo's dependency system (child blocks parent).

Suggested parent Epics for human review:

  • #368: Epic: Subplans & Parallelism — most relevant, as parallel execution requires correct concurrency/locking semantics
  • #397: Epic: Server & Autonomy Infrastructure — if the lock service is considered core server infrastructure

Action required: A project maintainer should:

  1. Identify the correct parent Epic for this LockService datetime comparison bug
  2. Create the dependency link so that this issue blocks the parent Epic:
    POST /api/v1/repos/cleveragents/cleveragents-core/issues/3951/blocks
    Body: {"owner": "cleveragents", "repo": "cleveragents-core", "index": <PARENT_EPIC_NUMBER>}
    

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

⚠️ **Orphan Issue — Needs Manual Epic Linking** This issue was created during autonomous UAT operation and does not yet have a parent Epic assigned. Per project conventions, every issue must be linked to a parent Epic using Forgejo's dependency system (child **blocks** parent). **Suggested parent Epics for human review:** - `#368: Epic: Subplans & Parallelism` — most relevant, as parallel execution requires correct concurrency/locking semantics - `#397: Epic: Server & Autonomy Infrastructure` — if the lock service is considered core server infrastructure **Action required:** A project maintainer should: 1. Identify the correct parent Epic for this `LockService` datetime comparison bug 2. Create the dependency link so that this issue **blocks** the parent Epic: ``` POST /api/v1/repos/cleveragents/cleveragents-core/issues/3951/blocks Body: {"owner": "cleveragents", "repo": "cleveragents-core", "index": <PARENT_EPIC_NUMBER>} ``` --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-new-issue-creator
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#3951
No description provided.