a2a/events: EventBusBridge._on_domain_event uses overly broad contextlib.suppress(RuntimeError) — silently swallows non-queue-closed errors #10511

Open
opened 2026-04-18 10:25:05 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: fix(a2a/events): replace broad contextlib.suppress(RuntimeError) with targeted exception handling in EventBusBridge._on_domain_event
  • Branch name: fix/a2a-events-eventbusbridge-suppress-runtimeerror

Background and Context

EventBusBridge._on_domain_event() wraps its call to publish() in contextlib.suppress(RuntimeError) so that publishing to a closed event queue is ignored. The suppression is not limited to that case: every RuntimeError raised during publish() is discarded, not just the intended "closed queue" error. Any programming error that surfaces as a RuntimeError during event publishing is therefore swallowed without a log entry, which makes the bug invisible.

Code Evidence

File: src/cleveragents/a2a/events.py, lines 308–311:

import contextlib

with contextlib.suppress(RuntimeError):
    self._event_queue.publish(a2a_event)

The only RuntimeError that publish() intentionally raises is:

# events.py line 62-63
def publish(self, event: A2aEvent) -> None:
    if self._is_closed:
        raise RuntimeError("Cannot publish to a closed event queue")

But contextlib.suppress(RuntimeError) also suppresses any other RuntimeError that might arise during the publish operation, including:

  • Errors from subscriber callbacks (which are called inside publish())
  • Any future RuntimeError added to publish() for other reasons
  • Programming errors in the event queue implementation
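
The first case in the list above can be reproduced in isolation. The following is a minimal sketch, not the cleveragents implementation: ToyQueue and buggy_callback are hypothetical stand-ins, and the sketch assumes, as stated above, that subscriber callbacks run inside publish().

```python
import contextlib


class ToyQueue:
    """Hypothetical stand-in for the event queue (not the cleveragents class)."""

    def __init__(self):
        self._is_closed = False
        self._callbacks = []
        self.delivered = []

    def subscribe_local(self, callback):
        self._callbacks.append(callback)

    def publish(self, event):
        if self._is_closed:
            raise RuntimeError("Cannot publish to a closed event queue")
        # Callbacks run inside publish(), so their errors propagate from here.
        for callback in self._callbacks:
            callback(event)
        self.delivered.append(event)


def buggy_callback(event):
    raise RuntimeError("Simulated programming error in callback")


queue = ToyQueue()
queue.subscribe_local(buggy_callback)

# The callback's RuntimeError is discarded exactly like a closed-queue error:
# no exception reaches the caller and nothing is logged.
with contextlib.suppress(RuntimeError):
    queue.publish("PLAN_CREATED")

# The queue is still open, yet the event was never recorded as delivered.
swallowed = (not queue._is_closed) and queue.delivered == []
```

In this sketch the caller cannot distinguish a closed queue from a failing callback, because both arrive as a bare RuntimeError.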

Impact

Silently swallowed errors are very hard to debug. If a subscriber callback raises RuntimeError (e.g., due to a bug), the bridge silently drops the event and continues, leaving no log entry and no indication that anything went wrong. This violates the fail-fast principle and makes the system appear to work correctly when it is not.

Steps to Reproduce

from cleveragents.a2a.events import A2aEventQueue, EventBusBridge
from cleveragents.a2a.models import A2aEvent

queue = A2aEventQueue()

# Register a callback that raises RuntimeError
def bad_callback(event):
    raise RuntimeError("Simulated programming error in callback")

queue.subscribe_local(bad_callback)

# Create a mock event bus
class MockEventBus:
    def subscribe(self, cb):
        self._cb = cb
        return self
    def dispose(self):
        pass

bus = MockEventBus()
bridge = EventBusBridge(bus, queue)
bridge.start()

# Simulate a domain event
class MockDomainEvent:
    event_type = "PLAN_CREATED"
    plan_id = "test-plan"
    details = {}

# This silently swallows the RuntimeError from bad_callback
bus._cb(MockDomainEvent())
# No exception raised, no log entry — the error is invisible

Expected Behavior

Only the specific "closed queue" RuntimeError should be suppressed silently. Any other RuntimeError should be logged as a warning: it is not re-raised, so the event bus thread does not crash, but it is no longer discarded without a trace.

Fix

Replace the overly broad contextlib.suppress(RuntimeError) with a targeted check:

try:
    self._event_queue.publish(a2a_event)
except RuntimeError as exc:
    if self._event_queue.is_closed:
        # Expected: queue was closed, silently ignore
        return
    # Unexpected RuntimeError — log it but don't crash
    logger.warning(
        "a2a.event_bridge.publish_error",
        error=str(exc),
        exc_info=True,
    )
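
The same pattern can be exercised outside the codebase. The sketch below is an illustration under stated assumptions, not the project's code: ToyQueue and publish_guarded are hypothetical names, the is_closed property is assumed to mirror the one used in the fix above, and the standard library logging module stands in for the project's logger (whose keyword-argument call style suggests a structured logger).

```python
import logging

logger = logging.getLogger("a2a.event_bridge")


class ToyQueue:
    """Hypothetical stand-in for the event queue (not the cleveragents class)."""

    def __init__(self):
        self._is_closed = False
        self._callbacks = []

    @property
    def is_closed(self):
        return self._is_closed

    def close(self):
        self._is_closed = True

    def subscribe_local(self, callback):
        self._callbacks.append(callback)

    def publish(self, event):
        if self._is_closed:
            raise RuntimeError("Cannot publish to a closed event queue")
        for callback in self._callbacks:
            callback(event)


def publish_guarded(queue, event):
    """Publish and report the outcome: "published", "closed", or "error"."""
    try:
        queue.publish(event)
    except RuntimeError as exc:
        if queue.is_closed:
            # Expected: the queue was closed, so stay silent.
            return "closed"
        # Unexpected: keep the caller alive but leave a traceback in the log.
        logger.warning("a2a.event_bridge.publish_error: %s", exc, exc_info=True)
        return "error"
    return "published"
```

One limitation of this approach is that is_closed is read after the exception has been raised. If the queue is closed between a callback failure and the check, the callback error would still be treated as a closed-queue error. A dedicated exception subclass raised by publish() would avoid that ambiguity, but it is outside the scope of the fix proposed here.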

Acceptance Criteria

  • Only the "closed queue" RuntimeError is suppressed silently
  • All other RuntimeError exceptions are logged as warnings with exc_info=True
  • No unexpected RuntimeError is silently swallowed
  • nox -s unit_tests passes with coverage ≥ 97%
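
The first two criteria above could be checked along the following lines. This is a hedged sketch, not the TDD tests from #10484: publish_guarded is a hypothetical stand-in for the try/except inside _on_domain_event, the queue is a unittest.mock.Mock, and the standard library logging and unittest modules are used so the example is self-contained.

```python
import logging
import unittest
from unittest import mock

logger = logging.getLogger("a2a.event_bridge")


def publish_guarded(queue, event):
    """Hypothetical stand-in for the try/except inside _on_domain_event."""
    try:
        queue.publish(event)
    except RuntimeError as exc:
        if queue.is_closed:
            return  # expected: the queue was closed
        logger.warning("publish_error: %s", exc, exc_info=True)


class PublishGuardTests(unittest.TestCase):
    def test_closed_queue_is_silent(self):
        queue = mock.Mock(is_closed=True)
        queue.publish.side_effect = RuntimeError(
            "Cannot publish to a closed event queue"
        )
        with self.assertLogs("a2a.event_bridge", level="WARNING") as captured:
            # assertLogs fails on zero records, so emit one sentinel record
            # and check that publish_guarded adds nothing to it.
            logger.warning("sentinel")
            publish_guarded(queue, "event")
        self.assertEqual(len(captured.records), 1)

    def test_unexpected_error_is_logged(self):
        queue = mock.Mock(is_closed=False)
        queue.publish.side_effect = RuntimeError("bug in callback")
        with self.assertLogs("a2a.event_bridge", level="WARNING") as captured:
            publish_guarded(queue, "event")
        self.assertEqual(len(captured.records), 1)
        # exc_info=True attaches the active exception to the log record.
        self.assertIsNotNone(captured.records[0].exc_info)
```

In both tests the exception must not escape publish_guarded, which matches the requirement that the event bus thread keeps running.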

Subtasks

  • Replace contextlib.suppress(RuntimeError) with targeted exception handling
  • Log unexpected RuntimeError as warning with exc_info=True
  • Silently ignore only when event_queue.is_closed is True
  • All TDD tests from #10484 pass

Definition of Done

  • Only closed-queue RuntimeError is suppressed silently
  • Other RuntimeError exceptions are logged as warnings
  • TDD tests from #10484 pass
  • nox -s unit_tests passes with coverage ≥ 97%

Blocked By

#10484


Automated by CleverAgents Bot
Agent: new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-18 10:27:25 +00:00
Author
Owner

[GROOMED] Quality Analysis Complete

Issue Validity Assessment

Status: VALID - This is a real, actionable Priority/Critical bug

Evidence:

  • Well-documented issue with clear code evidence (src/cleveragents/a2a/events.py, lines 308-311)
  • Specific problem: contextlib.suppress(RuntimeError) suppresses ALL RuntimeError exceptions, not just the intended "closed queue" error
  • Impact: Silent error swallowing makes debugging extremely difficult and violates fail-fast principle
  • Includes steps to reproduce, expected behavior, and proposed fix with code example
  • Blocked by #10484 (TDD tests)

Label Completeness Check

Current Labels:

  • State/Unverified (id: 846)
  • Type/Bug (id: 849)
  • Priority/Critical (id: 858)

All required labels present: State/, Type/, Priority/ ✓

Milestone Assignment

Status: ASSIGNED to v3.5.0 (M5: Autonomy Hardening)

Rationale: This issue is about the A2A event system (EventBusBridge._on_domain_event), which is core to the v3.5.0 milestone's "Event queue publish/subscribe operational" acceptance criterion.

Triage Recommendation

Action: Move from State/Unverified → State/Verified

Reasoning:

  1. Issue is valid and actionable ✓
  2. All required labels present ✓
  3. Assigned to appropriate milestone ✓
  4. Well-scoped with clear acceptance criteria ✓
  5. Blocked by #10484 (dependency documented) ✓
  6. Priority/Critical requires immediate attention ✓

🚨 Priority/Critical Flag

ATTENTION REQUIRED: This is a Priority/Critical issue that needs immediate human review and assignment to a developer. The silent error swallowing in EventBusBridge._on_domain_event is a serious reliability issue that could mask bugs in production.

Summary

Issue #10511 is a well-documented, valid Priority/Critical bug that is assigned to v3.5.0 and recommended for State/Verified. It is ready for implementation once #10484 (TDD tests) is complete.


Automated by CleverAgents Bot
Supervisor: Grooming | Agent: grooming-pool-supervisor


Reference
cleveragents/cleveragents-core#10511