A2aEventQueue._events list grows unboundedly — no size cap causes memory leak in long-running processes #8412

Open
opened 2026-04-13 18:42:35 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit: Build: Reinforced label enforcement, and ensure implementation workers dont continue work on a mergable PR.
  • Branch: main
  • SHA: 5a9aaa79edaefb1a257114f054ea87facb8efe69
  • File: src/cleveragents/a2a/events.py

Background and Context

The A2aEventQueue maintains an in-memory _events list that accumulates all published events. There is no maximum size limit on this list. In long-running processes (e.g., a session executing a large autonomous plan with many plan phase changes), events are continuously appended but never trimmed. The only way events are cleared is via close(), which terminates the queue entirely. This constitutes an unbounded memory growth pattern.

Current Behavior

In events.py:

def __init__(self) -> None:
    self._events: list[A2aEvent] = []  # ← no size limit
    ...

def publish(self, event: A2aEvent) -> None:
    ...
    self._events.append(event)  # ← always appends, never trims
    ...

def get_events(self, limit: int = 100) -> list[A2aEvent]:
    ...
    return list(self._events[-limit:])  # ← reads tail but does NOT trim the list

The get_events() method returns only the last N events but does not remove them from _events. Over time, _events grows without bound. In a large autonomous execution with thousands of plan state changes, this can consume significant memory.
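The growth pattern can be reproduced in isolation. The sketch below is a minimal illustration, not the project's code: `Event` and `UnboundedQueue` are hypothetical stand-ins that copy only the list handling quoted above, and the 5000-event count is an arbitrary figure chosen for the demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for A2aEvent."""
    seq: int


class UnboundedQueue:
    """Stand-in that mirrors only the list handling quoted in this issue."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def publish(self, event: Event) -> None:
        self._events.append(event)  # appended, never trimmed

    def get_events(self, limit: int = 100) -> list[Event]:
        return list(self._events[-limit:])  # copies the tail, keeps everything


queue = UnboundedQueue()
for seq in range(5000):
    queue.publish(Event(seq))

returned = len(queue.get_events())  # callers only ever see the tail
retained = len(queue._events)       # every published event is still referenced
```

Here `returned` is 100 while `retained` is 5000, so the gap between what callers can read and what the process holds widens with every publish.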

Expected Behavior

The event queue should enforce a configurable maximum size. When the limit is reached, oldest events should be evicted (ring-buffer / deque semantics). A reasonable default maximum is 1000 events. The get_events() API remains unchanged.

from collections import deque

def __init__(self, max_size: int = 1000) -> None:
    self._events: deque[A2aEvent] = deque(maxlen=max_size)
    ...

Using collections.deque(maxlen=N) provides O(1) append with automatic eviction of oldest entries.
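A slightly fuller sketch of the proposal, under stated assumptions: `Event` and `BoundedQueue` are hypothetical stand-ins rather than the real `A2aEvent` and `A2aEventQueue`, and returning an empty list for a non-positive `limit` is a choice made for this example only. One detail worth noting is that `collections.deque` does not support slice notation, so the tail is taken from a list copy.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for A2aEvent."""
    seq: int


class BoundedQueue:
    """Sketch of the proposed storage; not the real A2aEventQueue."""

    def __init__(self, max_size: int = 1000) -> None:
        # deque(maxlen=N) drops the oldest entry once N items are stored.
        self._events: deque[Event] = deque(maxlen=max_size)

    def publish(self, event: Event) -> None:
        self._events.append(event)  # O(1); evicts silently when full

    def get_events(self, limit: int = 100) -> list[Event]:
        # deque has no slice support, so copy to a list before taking the tail.
        # Returning [] for a non-positive limit is an assumption of this sketch.
        if limit <= 0:
            return []
        return list(self._events)[-limit:]


queue = BoundedQueue(max_size=1000)
for seq in range(5000):
    queue.publish(Event(seq))

retained = len(queue._events)                      # capped at max_size
tail = [e.seq for e in queue.get_events(limit=3)]  # newest three, oldest first
```

The list copy in `get_events()` costs O(max_size) per call, which is bounded by design; `itertools.islice` over the deque would avoid the full copy if that ever matters.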

Acceptance Criteria

  • A2aEventQueue accepts a max_size parameter (default: 1000)
  • _events storage is bounded: once it holds max_size events, each further publish() evicts the oldest event
  • get_events(limit) continues to return the most recent N events correctly
  • publish() does not raise when queue is at capacity (eviction is silent)
  • BDD test: after publishing more than max_size events, the number of retained events never exceeds max_size
  • BDD test: get_events() returns correct events after eviction

Subtasks

  • Replace self._events: list[A2aEvent] = [] with self._events: deque[A2aEvent] = deque(maxlen=max_size)
  • Add max_size: int = 1000 parameter to __init__
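  • Update get_events() to work with deque: a deque does not support slice notation, so self._events[-limit:] raises TypeError. Convert to a list first, e.g. list(self._events)[-limit:], or iterate with itertools.islice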
  • Update close() to call self._events.clear()
  • Add BDD scenario: Given a queue with max_size=10, When 20 events are published, Then only the last 10 events are retained
  • Document max_size in class docstring
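The eviction scenario and the close() subtask can be expressed as plain Python checks before they are wired into a BDD framework. This is a sketch only: `BoundedQueue` is a hypothetical stand-in, integers play the role of `A2aEvent` objects, and the function names are invented for illustration.

```python
from collections import deque


class BoundedQueue:
    """Hypothetical stand-in; integers play the role of A2aEvent objects."""

    def __init__(self, max_size: int = 1000) -> None:
        self._events: deque[int] = deque(maxlen=max_size)

    def publish(self, event: int) -> None:
        self._events.append(event)

    def get_events(self, limit: int = 100) -> list[int]:
        return list(self._events)[-limit:]

    def close(self) -> None:
        self._events.clear()


def check_eviction_scenario() -> list[int]:
    # Given a queue with max_size=10
    queue = BoundedQueue(max_size=10)
    # When 20 events are published (publish must not raise at capacity)
    for seq in range(20):
        queue.publish(seq)
    # Then only the last 10 events are retained
    assert len(queue._events) == 10
    retained = queue.get_events(limit=10)
    assert retained == list(range(10, 20))
    return retained


def check_close_clears() -> int:
    # After close(), no events should remain referenced by the queue.
    queue = BoundedQueue(max_size=10)
    queue.publish(1)
    queue.close()
    return len(queue._events)
```

Asserting on the retained count rather than on process memory keeps the check deterministic.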

Definition of Done

This issue is closed when A2aEventQueue enforces a configurable maximum event count with automatic eviction, and BDD tests verify bounded memory behaviour.


Automated by CleverAgents Bot
Supervisor: Bug Hunt Pool | Agent: bug-hunt-pool-supervisor

HAL9000 added this to the v3.5.0 milestone 2026-04-13 18:51:05 +00:00
Author
Owner

Verified — Unbounded event queue growth is a memory leak that will cause OOM in long-running autonomous processes. MoSCoW: Must Have for v3.5.0 — the event queue must be bounded for production use. [AUTO-OWNR-1]


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#8412