database/error_pattern_repository: ErrorPatternRepository uses in-memory storage only — all patterns lost on restart #10341

Open
opened 2026-04-18 08:54:22 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: fix(database): persist ErrorPatternRepository to SQLite via SQLAlchemy ORM
  • Branch name: fix/error-pattern-repository-persistence

Background and Context

ErrorPatternRepository stores all error patterns in a plain Python dict with no database persistence. Every process restart silently discards all accumulated error patterns, breaking the error-recovery feedback loop. This is inconsistent with how LLMTraceRepository, CheckpointRepository, and other repositories work — all of which persist to SQLite via SQLAlchemy.

Expected Behavior

Error patterns should be persisted to the SQLite database and survive process restarts, consistent with how LLMTraceRepository, CheckpointRepository, and other repositories work.

Acceptance Criteria

  • An ErrorPatternModel SQLAlchemy ORM model exists in models.py
  • An Alembic migration for the error_patterns table is present and applies cleanly
  • ErrorPatternRepository uses the session-factory pattern (ADR-007) — no in-memory self._patterns dict
  • create, get, update, delete, list_all, match_context, count, and statistics all operate against the database
  • Error patterns survive a process restart (verified by the test in #10332)
  • nox passes with coverage ≥ 97%
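
The first criterion could look roughly like the sketch below. It is not the project's actual model: the columns are inferred from the fields used in the reproduction script in this issue, and `Base`, the column types, and the `demo_roundtrip` helper are assumptions made for illustration (the real `models.py` will have its own declarative base).

```python
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

# Assumption: the real models.py already defines a shared declarative Base.
Base = declarative_base()


class ErrorPatternModel(Base):
    """Hypothetical ORM mapping for the error_patterns table."""

    __tablename__ = "error_patterns"

    pattern_id = Column(String(26), primary_key=True)  # 26-char id, as in the repro
    pattern = Column(String, nullable=False)
    keywords = Column(JSON, nullable=False, default=list)  # list of strings
    frequency = Column(Integer, nullable=False, default=0)
    last_seen = Column(DateTime, nullable=False)


def demo_roundtrip() -> list:
    """Insert one row in one session and read it back in another."""
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    session_factory = sessionmaker(bind=engine)

    with session_factory() as session:
        session.add(
            ErrorPatternModel(
                pattern_id="01HXYZ1234567890ABCDEFGHIJ",
                pattern="ImportError",
                keywords=["import", "module"],
                frequency=1,
                last_seen=datetime(2026, 4, 18, 8, 54, 22),
            )
        )
        session.commit()

    with session_factory() as session:
        row = session.get(ErrorPatternModel, "01HXYZ1234567890ABCDEFGHIJ")
        return [row.pattern, row.frequency, row.keywords]
```

Storing `keywords` as a JSON column keeps the list in one row. If `match_context` needs to filter by keyword in SQL, a separate keyword table may be a better fit; that choice is left open here.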

Subtasks

  • Add ErrorPatternModel SQLAlchemy ORM model to models.py
  • Write and apply Alembic migration for the error_patterns table
  • Rewrite ErrorPatternRepository to use the session-factory pattern (ADR-007)
  • Remove the in-memory self._patterns dict
  • Verify all CRUD and query methods work against the database
  • Confirm the failing test from #10332 now passes

Definition of Done

This issue is closed when:

  • All acceptance criteria above are met
  • The failing test introduced in #10332 passes
  • nox passes with coverage ≥ 97%
  • A PR has been merged to the main branch

Evidence

src/cleveragents/infrastructure/database/error_pattern_repository.py:

class ErrorPatternRepository:
    """In-memory CRUD + pattern-matching repository for error patterns."""

    def __init__(self) -> None:
        self._patterns: dict[str, ErrorPattern] = {}   # ← pure in-memory, no DB

All CRUD methods (create, get, update, delete, list_all, match_context, count, statistics) operate exclusively on self._patterns. There is no SQLAlchemy session, no ORM model class, and no Alembic migration for an error_patterns table.

Impact

  • All error patterns accumulated during a session are lost when the process restarts
  • After a restart, match_context() returns an empty list until new patterns are recorded, so previously learned errors are no longer matched
  • Error frequency statistics (statistics()) are reset to zero on every restart
  • This silently degrades error recovery quality without any warning

Steps to Reproduce

from datetime import datetime

from cleveragents.domain.models.core.error_pattern import ErrorPattern
from cleveragents.infrastructure.database.error_pattern_repository import ErrorPatternRepository

pattern = ErrorPattern(
    pattern_id="01HXYZ1234567890ABCDEFGHIJ",
    pattern="ImportError",
    keywords=["import", "module"],
    frequency=1,
    last_seen=datetime.utcnow(),
)

# First "process": the pattern is stored only in the instance's in-memory dict.
repo1 = ErrorPatternRepository()
repo1.create(pattern)

# Simulate a restart: a fresh instance starts with an empty dict.
repo2 = ErrorPatternRepository()
assert repo2.get(pattern.pattern_id) is None  # ← passes today, i.e. the data is lost

Fix Path

  1. Add an ErrorPatternModel SQLAlchemy ORM model to models.py
  2. Add an Alembic migration for the error_patterns table
  3. Rewrite ErrorPatternRepository to use the session-factory pattern (ADR-007)
  4. Remove the in-memory self._patterns dict
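
Steps 1, 3, and 4 above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the final implementation: the contents of ADR-007 are not shown in this issue, so the constructor taking a session factory is a guess at that pattern; the `ErrorPattern` dataclass is a stand-in for the real domain model; `open_repository` is a hypothetical helper; and only `create`, `get`, and `count` are shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional

from sqlalchemy import JSON, Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, sessionmaker

Base = declarative_base()  # assumption: the real code reuses the project's Base


class ErrorPatternModel(Base):
    """Hypothetical ORM row for one error pattern."""

    __tablename__ = "error_patterns"
    pattern_id = Column(String(26), primary_key=True)
    pattern = Column(String, nullable=False)
    keywords = Column(JSON, nullable=False)
    frequency = Column(Integer, nullable=False)
    last_seen = Column(DateTime, nullable=False)


@dataclass
class ErrorPattern:
    """Stand-in for the domain model; fields taken from the reproduction script."""

    pattern_id: str
    pattern: str
    keywords: List[str]
    frequency: int
    last_seen: datetime


class ErrorPatternRepository:
    """Database-backed repository: no self._patterns dict, only a session factory."""

    def __init__(self, session_factory: Callable[[], Session]) -> None:
        self._session_factory = session_factory

    def create(self, pattern: ErrorPattern) -> None:
        with self._session_factory() as session:
            session.add(ErrorPatternModel(**vars(pattern)))
            session.commit()

    def get(self, pattern_id: str) -> Optional[ErrorPattern]:
        with self._session_factory() as session:
            row = session.get(ErrorPatternModel, pattern_id)
            if row is None:
                return None
            return ErrorPattern(
                row.pattern_id, row.pattern, list(row.keywords),
                row.frequency, row.last_seen,
            )

    def count(self) -> int:
        with self._session_factory() as session:
            return session.query(ErrorPatternModel).count()


def open_repository(db_url: str) -> ErrorPatternRepository:
    """Hypothetical helper: build a repository bound to the given database URL."""
    engine = create_engine(db_url)
    # For the sketch only; the real schema would come from the Alembic migration.
    Base.metadata.create_all(engine)
    return ErrorPatternRepository(sessionmaker(bind=engine))
```

Calling `open_repository` twice with the same SQLite file URL plays the role of `repo1` and `repo2` in the reproduction script: with this shape, the second repository's `get` should return the pattern written by the first instead of `None`.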

Blocked By

Depends on TDD issue #10332 (failing test must be written first).


Automated by CleverAgents Bot
Agent: new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-18 09:00:27 +00:00
Author
Owner

[GROOMED] PRIORITY/CRITICAL BUG - TRIAGE COMPLETE

Grooming Analysis

Issue Validity: VALID

This is a real, actionable bug with clear impact:

  • Problem: ErrorPatternRepository uses an in-memory dict with no database persistence
  • Impact: All error patterns are lost on process restart, breaking the error-recovery feedback loop
  • Evidence: Clear code example showing the issue in error_pattern_repository.py
  • Reproducibility: Steps to reproduce provided

Label Verification: ALL REQUIRED LABELS PRESENT

  • State/Unverified - Present (id: 846)
  • Type/Bug - Present (id: 849)
  • Priority/Critical - Present (id: 858) ⚠️ CRITICAL PRIORITY

Milestone Assignment: ASSIGNED

  • Assigned to: v3.5.0 (M6: Autonomy Hardening)
  • Rationale: Database infrastructure work fits within autonomy hardening milestone

Issue Quality: EXCELLENT

  • Well-documented with clear background and context
  • Acceptance criteria clearly defined (6 items)
  • Subtasks broken down (6 items)
  • Definition of Done explicitly stated
  • Fix path clearly outlined
  • Dependency noted: Depends on TDD issue #10332

Triage Decision: MOVE TO STATE/VERIFIED

Recommendation: This issue should be moved from State/Unverified to State/Verified as it meets all triage criteria:

  1. Validity confirmed - real, actionable bug
  2. All required labels present
  3. Milestone assigned
  4. Well-documented with clear acceptance criteria
  5. Not an orphan - has clear relationship to #10332

⚠️ CRITICAL PRIORITY FLAG

This is a Priority/Critical issue with no assignee. It requires immediate human attention for:

  • Assignment to a developer
  • Coordination with #10332 (TDD issue)
  • Scheduling within the v3.5.0 milestone

Summary

Issue #10341 is VALID and READY for development. All triage criteria met. Recommend immediate assignment to a developer and coordination with the TDD issue #10332.


Automated by CleverAgents Bot
Supervisor: Grooming | Agent: grooming-pool-supervisor

Reference
cleveragents/cleveragents-core#10341