BUG-HUNT: [error-handling] Broad exception handling in _cleanup_session_devcontainers can mask cleanup failures #3088

Open
opened 2026-04-05 05:47:59 +00:00 by freemo · 2 comments
Owner

Metadata

  • Branch: fix/error-handling-cleanup-session-devcontainers
  • Commit Message: fix(a2a): replace broad exception suppression in _cleanup_session_devcontainers with granular error handling
  • Milestone: (none — see backlog note below)
  • Parent Epic: #362

Backlog note: This issue was discovered during autonomous operation
on milestone v3.3.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Bug Report: [error-handling] — Broad exception handling in _cleanup_session_devcontainers can mask cleanup failures

Severity Assessment

  • Impact: Failures in devcontainer cleanup may go unnoticed, potentially leaving orphaned containers running and consuming resources.
  • Likelihood: Medium. Any unexpected issue during the cleanup process will be caught and suppressed.
  • Priority: Medium

Location

  • File: src/cleveragents/a2a/facade.py
  • Function/Class: _cleanup_session_devcontainers
  • Lines: 361-366

Description

The _cleanup_session_devcontainers function in src/cleveragents/a2a/facade.py uses a broad try...except Exception block to wrap the logic for stopping devcontainers. While the intention is to ensure that session closing does not fail due to cleanup issues, this implementation can hide persistent problems.

If an exception occurs during cleanup, it is caught and logged as a warning, but the exception is not re-raised. This means that if there is a recurring issue with the devcontainer cleanup process (e.g., a problem with the Docker environment, incorrect permissions), it will not be escalated and may go unnoticed.

Evidence

        except Exception:
            logger.warning(
                "a2a.session.close.devcontainer_cleanup_failed",
                session_id=session_id,
                exc_info=True,
            )

Expected Behavior

While it is reasonable to prevent cleanup failures from blocking the session close operation, the exceptions should be handled in a more granular way. Specific, expected exceptions that are safe to ignore could be caught, while unexpected exceptions should be logged with a higher severity (e.g., ERROR) or re-raised to ensure they are not missed.

Actual Behavior

All exceptions during devcontainer cleanup are caught and logged as a warning. This can lead to a situation where cleanup is consistently failing, but the issue is not visible without carefully monitoring the logs for warnings.

Suggested Fix

Refine the exception handling to catch more specific exceptions that are known to be non-critical. Any unexpected exceptions should be logged as errors to increase visibility.

        from some_docker_library import DockerConnectionError

        try:
            # ... cleanup logic ...
        except DockerConnectionError as e:
            logger.warning("Could not connect to Docker for cleanup: %s", e)
        except Exception:
            logger.error(
                "An unexpected error occurred during devcontainer cleanup",
                exc_info=True,
            )

Category

error-handling

Subtasks

  • Identify all specific exception types that can be raised by the devcontainer stop logic
  • Replace the broad except Exception with targeted exception handlers for known, non-critical errors
  • Upgrade logging severity to ERROR (or re-raise) for unexpected exceptions in the cleanup path
  • Add/update Behave unit tests covering the new exception handling branches
  • Verify all nox sessions pass

Definition of Done

  • _cleanup_session_devcontainers no longer uses a bare except Exception to suppress all errors
  • Known, non-critical cleanup exceptions are caught specifically and logged at WARNING
  • Unexpected exceptions are logged at ERROR level (or re-raised) to ensure visibility
  • New Behave unit tests cover each exception-handling branch
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/error-handling-cleanup-session-devcontainers` - **Commit Message**: `fix(a2a): replace broad exception suppression in _cleanup_session_devcontainers with granular error handling` - **Milestone**: *(none — see backlog note below)* - **Parent Epic**: #362 > **Backlog note:** This issue was discovered during autonomous operation > on milestone v3.3.0. It does not block milestone completion and has been > placed in the backlog for human review and future milestone assignment. ## Bug Report: [error-handling] — Broad exception handling in `_cleanup_session_devcontainers` can mask cleanup failures ### Severity Assessment - **Impact**: Failures in devcontainer cleanup may go unnoticed, potentially leaving orphaned containers running and consuming resources. - **Likelihood**: Medium. Any unexpected issue during the cleanup process will be caught and suppressed. - **Priority**: Medium ### Location - **File**: `src/cleveragents/a2a/facade.py` - **Function/Class**: `_cleanup_session_devcontainers` - **Lines**: 361-366 ### Description The `_cleanup_session_devcontainers` function in `src/cleveragents/a2a/facade.py` uses a broad `try...except Exception` block to wrap the logic for stopping devcontainers. While the intention is to ensure that session closing does not fail due to cleanup issues, this implementation can hide persistent problems. If an exception occurs during cleanup, it is caught and logged as a warning, but the exception is not re-raised. This means that if there is a recurring issue with the devcontainer cleanup process (e.g., a problem with the Docker environment, incorrect permissions), it will not be escalated and may go unnoticed. ### Evidence ```python except Exception: logger.warning( "a2a.session.close.devcontainer_cleanup_failed", session_id=session_id, exc_info=True, ) ``` ### Expected Behavior While it is reasonable to prevent cleanup failures from blocking the session close operation, the exceptions should be handled in a more granular way. Specific, expected exceptions that are safe to ignore could be caught, while unexpected exceptions should be logged with a higher severity (e.g., `ERROR`) or re-raised to ensure they are not missed. ### Actual Behavior All exceptions during devcontainer cleanup are caught and logged as a warning. This can lead to a situation where cleanup is consistently failing, but the issue is not visible without carefully monitoring the logs for warnings. ### Suggested Fix Refine the exception handling to catch more specific exceptions that are known to be non-critical. Any unexpected exceptions should be logged as errors to increase visibility. ```python from some_docker_library import DockerConnectionError try: # ... cleanup logic ... except DockerConnectionError as e: logger.warning("Could not connect to Docker for cleanup: %s", e) except Exception: logger.error( "An unexpected error occurred during devcontainer cleanup", exc_info=True, ) ``` ### Category error-handling ## Subtasks - [ ] Identify all specific exception types that can be raised by the devcontainer stop logic - [ ] Replace the broad `except Exception` with targeted exception handlers for known, non-critical errors - [ ] Upgrade logging severity to `ERROR` (or re-raise) for unexpected exceptions in the cleanup path - [ ] Add/update Behave unit tests covering the new exception handling branches - [ ] Verify all nox sessions pass ## Definition of Done - [ ] `_cleanup_session_devcontainers` no longer uses a bare `except Exception` to suppress all errors - [ ] Known, non-critical cleanup exceptions are caught specifically and logged at `WARNING` - [ ] Unexpected exceptions are logged at `ERROR` level (or re-raised) to ensure visibility - [ ] New Behave unit tests cover each exception-handling branch - [ ] All nox stages pass - [ ] Coverage >= 97% --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: ca-new-issue-creator
Author
Owner

Plan: Transition issue #3088 from State/Unverified to State/Verified.

Current labels detected: State/Unverified, Priority/Backlog, Type/Bug.

Preconditions check:

  • From State/Unverified to State/Verified is allowed (Maintainer triage the issue).
  • No Blocked label present, so no unpausing action needed.

Actions to perform:

  1. Remove the old state label: State/Unverified
  2. Add the new state label: State/Verified

Note: The Forgejo API wrapper being used here does not expose a direct labels API in the available methods. I can perform the changes via a proper label-management API call if you grant permission to use a label-edit operation (or if you’d like me to create an explicit request for label updates via UI). If you’d prefer, I can add a comment with the exact change plan and wait for maintainer approval to modify the labels in the UI.

Proposed final state: State/Verified applied to issue #3088. If the state label already reflects this, I will report that no change is necessary.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-issue-state-updater

Plan: Transition issue #3088 from State/Unverified to State/Verified. Current labels detected: State/Unverified, Priority/Backlog, Type/Bug. Preconditions check: - From State/Unverified to State/Verified is allowed (Maintainer triage the issue). - No Blocked label present, so no unpausing action needed. Actions to perform: 1) Remove the old state label: State/Unverified 2) Add the new state label: State/Verified Note: The Forgejo API wrapper being used here does not expose a direct labels API in the available methods. I can perform the changes via a proper label-management API call if you grant permission to use a label-edit operation (or if you’d like me to create an explicit request for label updates via UI). If you’d prefer, I can add a comment with the exact change plan and wait for maintainer approval to modify the labels in the UI. Proposed final state: State/Verified applied to issue #3088. If the state label already reflects this, I will report that no change is necessary. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: ca-issue-state-updater
freemo added this to the v3.6.0 milestone 2026-04-05 06:18:40 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Backlog — error handling improvement, not blocking any milestone
  • Milestone: v3.6.0 (Advanced Concepts & Deferred Features)
  • MoSCoW: Could Have — the broad exception handling is a code quality concern but the current behavior (logging at WARNING level) is functional. Refining to granular exception types is a nice-to-have improvement.
  • Parent Epic: #362 (Security & Safety Hardening)

The fix involves identifying specific Docker-related exception types and catching them individually. This is good practice but not urgent.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **Priority**: Backlog — error handling improvement, not blocking any milestone - **Milestone**: v3.6.0 (Advanced Concepts & Deferred Features) - **MoSCoW**: Could Have — the broad exception handling is a code quality concern but the current behavior (logging at WARNING level) is functional. Refining to granular exception types is a nice-to-have improvement. - **Parent Epic**: #362 (Security & Safety Hardening) The fix involves identifying specific Docker-related exception types and catching them individually. This is good practice but not urgent. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
freemo removed this from the v3.6.0 milestone 2026-04-07 00:19:59 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#362 Epic: Security & Safety Hardening
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#3088
No description provided.