BUG-HUNT: [resource] Changeset ID accumulation causes memory leak in PlanExecutionContext #7163

Open
opened 2026-04-10 08:22:00 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Branch: bugfix/plan-execution-context-changeset-id-memory-leak
  • Commit Message: fix(application): remove completed changesets from active list in PlanExecutionContext
  • Milestone: backlog
  • Parent Epic: #7023

Backlog note: This issue was discovered during autonomous operation
on milestone v3.2.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Bug Report: Resource Management — Changeset ID Accumulation Memory Leak

Severity Assessment

  • Impact: Memory consumption grows indefinitely in long-running plan execution contexts
  • Likelihood: Certain in any long-running plan that calls start_changeset() repeatedly
  • Priority: Medium (resource leak, not a crash or data corruption)

Location

  • File: src/cleveragents/application/services/plan_execution_context.py
  • Method: start_changeset()
  • Lines: ~160–165

Background and Context

PlanExecutionContext maintains _active_changeset_ids — a list[str] that is appended to by start_changeset() on every call. In long-running plans (e.g., multi-step agentic workflows, batch processing, or plans with many correction cycles), start_changeset() is called repeatedly. Because there is no mechanism to remove completed changeset IDs from the list, the list grows without bound for the lifetime of the context object.

This is a classic unbounded-accumulation resource leak: the data structure is append-only with no eviction, pruning, or cleanup path.
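The growth pattern described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `FakeChangesetStore`, `LeakyContext`, and `simulate` are hypothetical names introduced here, and the store is assumed to do nothing more than hand out unique IDs.

```python
import itertools


class FakeChangesetStore:
    """Hypothetical stand-in for the real changeset store (assumption)."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def start(self, plan_id: str) -> str:
        # Hand out a unique ID per call, as the real store presumably does.
        return f"{plan_id}-cs-{next(self._counter)}"


class LeakyContext:
    """Simplified model of the append-only behavior described in this issue."""

    def __init__(self, plan_id: str, store: FakeChangesetStore) -> None:
        self._plan_id = plan_id
        self._changeset_store = store
        self._active_changeset_ids: list[str] = []

    def start_changeset(self) -> str:
        changeset_id = self._changeset_store.start(self._plan_id)
        self._active_changeset_ids.append(changeset_id)  # never removed
        return changeset_id


def simulate(cycles: int) -> int:
    """Call start_changeset() `cycles` times and return the tracked-ID count."""
    ctx = LeakyContext("plan-1", FakeChangesetStore())
    for _ in range(cycles):
        ctx.start_changeset()
    return len(ctx._active_changeset_ids)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, simulate(n))
```

In this model the number of tracked IDs always equals the number of `start_changeset()` calls, which is the linear, unbounded growth the report describes.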

Current Behavior (Bug)

_active_changeset_ids is append-only with no removal mechanism:

# src/cleveragents/application/services/plan_execution_context.py, lines ~160-165

def start_changeset(self) -> str:
    """Start a new changeset for this plan."""
    changeset_id = self._changeset_store.start(self._plan_id)
    self._active_changeset_ids.append(changeset_id)   # LEAK: IDs accumulate, never removed
    self._logger.info("Changeset started", changeset_id=changeset_id)
    return changeset_id

There is no corresponding complete_changeset(), remove_changeset(), or automatic cleanup that removes a changeset ID from _active_changeset_ids once the changeset is committed or closed.

Expected Behavior

Completed changesets should be removed from _active_changeset_ids to prevent unbounded memory growth. Specifically:

  1. A method (e.g., complete_changeset(changeset_id: str)) should remove the given ID from _active_changeset_ids once the changeset is committed.
  2. Alternatively, _active_changeset_ids could be bounded to only track the current active changeset (a single str | None rather than a list), if the design intent is that only one changeset is active at a time.
  3. In either case, the tracked changeset state must not grow indefinitely.

Evidence

# plan_execution_context.py — append-only, no removal:
self._active_changeset_ids.append(changeset_id)   # start_changeset(), line ~163
# No corresponding removal anywhere in the class

A grep for _active_changeset_ids in the file reveals only append() calls and reads — no remove(), pop(), clear(), or reassignment that would shrink the list.
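The grep can be approximated with a small script when the check needs to be repeated, for example after the fix lands. This is a rough heuristic sketch: `find_shrinking_lines` and `SHRINK_PATTERN` are hypothetical names, the comment stripping is naive (a `#` inside a string literal would confuse it), and a plain `self._active_changeset_ids = []` initializer would be reported as a reassignment and has to be judged by hand.

```python
import re

# Calls and statements that could shrink or replace the list:
# .remove(...), .pop(...), .clear(), plain or slice assignment, and del.
SHRINK_PATTERN = re.compile(
    r"_active_changeset_ids\s*"
    r"(?:\.(?:remove|pop|clear)\s*\(|=(?!=)|\[[^\]]*\]\s*=(?!=))"
    r"|\bdel\s+\S*_active_changeset_ids"
)


def find_shrinking_lines(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line that may shrink the list."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive comment removal
        if SHRINK_PATTERN.search(code):
            hits.append((lineno, line.strip()))
    return hits
```

An empty result for `plan_execution_context.py` would match the finding above; after the fix, at least one line should be reported.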

Suggested Fix

Option A — Add explicit complete_changeset() method:

def complete_changeset(self, changeset_id: str) -> None:
    """Mark a changeset as complete and remove it from the active list."""
    try:
        self._active_changeset_ids.remove(changeset_id)
    except ValueError:
        self._logger.warning("complete_changeset_id_not_found", changeset_id=changeset_id)

Option B — Replace list with a single active changeset slot (if only one is active at a time):

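# In __init__: a single slot replaces the _active_changeset_ids list.
self._active_changeset_id: str | None = None

def start_changeset(self) -> str:
    """Start a new changeset, replacing any previously tracked one."""
    changeset_id = self._changeset_store.start(self._plan_id)
    self._active_changeset_id = changeset_id
    self._logger.info("Changeset started", changeset_id=changeset_id)
    return changeset_id

def complete_changeset(self) -> None:
    """Clear the active changeset slot."""
    self._active_changeset_id = None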

The correct option depends on whether multiple changesets can be active simultaneously (Option A) or only one at a time (Option B). Review the specification and callers to determine the intended semantics.
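Whichever option is chosen, callers have to remember to invoke the cleanup. One way to make that harder to forget is to pair start and completion in a context manager. The sketch below assumes Option A semantics; `SketchContext` and `changeset_scope` are hypothetical names, and the ID generation stands in for the real changeset store.

```python
from contextlib import contextmanager
from typing import Iterator


class SketchContext:
    """Simplified stand-in for PlanExecutionContext using Option A."""

    def __init__(self) -> None:
        self._next_id = 0
        self._active_changeset_ids: list[str] = []

    def start_changeset(self) -> str:
        self._next_id += 1
        changeset_id = f"cs-{self._next_id}"  # real code asks the changeset store
        self._active_changeset_ids.append(changeset_id)
        return changeset_id

    def complete_changeset(self, changeset_id: str) -> None:
        try:
            self._active_changeset_ids.remove(changeset_id)
        except ValueError:
            pass  # real code would log a warning here

    @contextmanager
    def changeset_scope(self) -> Iterator[str]:
        """Hypothetical helper: start a changeset and always complete it."""
        changeset_id = self.start_changeset()
        try:
            yield changeset_id
        finally:
            # Runs on normal exit and on exceptions, so the ID is never left behind.
            self.complete_changeset(changeset_id)
```

A caller would then write `with ctx.changeset_scope() as changeset_id: ...`, and the ID is removed even if the body raises. Whether this fits the existing call sites depends on the audit listed in the subtasks.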

Impact

  • Memory consumption grows indefinitely in long-running PlanExecutionContext instances (e.g., plans with many correction cycles or large batch operations).
  • No immediate crash — the leak is gradual and may only manifest in production under sustained load or long-lived plan sessions.
  • Severity: Medium — resource leak with no immediate functional impact but degrades system health over time.

Subtasks

  • Audit all call sites of start_changeset() to determine whether multiple changesets can be active simultaneously or only one at a time
  • Choose the appropriate fix strategy (Option A or Option B per Suggested Fix above) based on the specification and call-site analysis
  • Implement complete_changeset() (or equivalent cleanup) in PlanExecutionContext
  • Update all callers of start_changeset() to invoke the cleanup method when the changeset is done
  • Tests (Behave): Add scenario that calls start_changeset() N times, completes each changeset, and asserts that _active_changeset_ids (or its replacement) holds no completed IDs afterwards
  • Tests (Behave): Add scenario verifying complete_changeset() removes the correct ID
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
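The logic behind the two test subtasks above can be sketched in plain Python before it is wired into Behave steps. Everything here is illustrative: `StubContext` is a hypothetical stand-in that assumes the Option A API, and the two helper functions would become the bodies of step functions in the real suite.

```python
class StubContext:
    """Hypothetical stand-in exposing the API the scenarios exercise."""

    def __init__(self) -> None:
        self._count = 0
        self._active_changeset_ids: list[str] = []

    def start_changeset(self) -> str:
        self._count += 1
        changeset_id = f"cs-{self._count}"
        self._active_changeset_ids.append(changeset_id)
        return changeset_id

    def complete_changeset(self, changeset_id: str) -> None:
        if changeset_id in self._active_changeset_ids:
            self._active_changeset_ids.remove(changeset_id)


def run_start_complete_cycles(ctx: StubContext, cycles: int) -> int:
    """Start and complete `cycles` changesets; return the peak list size."""
    peak = 0
    for _ in range(cycles):
        changeset_id = ctx.start_changeset()
        peak = max(peak, len(ctx._active_changeset_ids))
        ctx.complete_changeset(changeset_id)
    return peak


def remaining_after_completing_middle(ctx: StubContext) -> list[str]:
    """Start three changesets, complete the second, return what is left."""
    first, middle, last = (ctx.start_changeset() for _ in range(3))
    ctx.complete_changeset(middle)
    return list(ctx._active_changeset_ids)
```

The first helper covers the bounded-growth scenario (the peak should stay at 1 for sequential cycles and the list should be empty at the end); the second covers removal of the correct ID.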

Definition of Done

  • _active_changeset_ids (or its replacement) does not grow unboundedly across repeated start_changeset() / complete cycles
  • A cleanup method or mechanism exists and is called by all relevant callers
  • BDD scenarios cover the memory-leak scenario and the cleanup path
  • No # type: ignore annotations introduced
  • All nox stages pass
  • Coverage >= 97%
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done

Automated by CleverAgents Bot
Supervisor: Acting on behalf of: Bug Hunter | Agent: new-issue-creator

Author
Owner

Verified — Critical resource bug: changeset ID accumulation memory leak in PlanExecutionContext. MoSCoW: Must-have. Priority: Critical.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7163