BUG-HUNT: [error-handling] Broad exception handling in agent graphs #3183

Closed
opened 2026-04-05 07:30:30 +00:00 by freemo · 3 comments
Owner

Metadata

  • Branch: fix/error-handling-agent-graphs-broad-exceptions
  • Commit Message: fix(error-handling): replace broad exception handling in agent graph nodes
  • Milestone: None (backlog)
  • Parent Epic: TBD — see orphan note below

Backlog note: This issue was discovered during autonomous operation
on milestone None. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Background and Context

The codebase uses broad except Exception: and except Exception as exc: clauses in several agent graph modules. While this prevents the application from crashing, it violates the project's error-handling guidelines, which mandate that exceptions should propagate unless they can be meaningfully handled locally. Broad exception handlers can hide bugs, complicate debugging, and lead to unexpected silent failures in agent graph execution.

This issue is specifically scoped to the agent graph layer (plan_generation.py, context_analysis.py, auto_debug.py). A related general refactor issue exists at #3155.

Current Behavior

The following locations use broad except Exception: or except Exception as exc: clauses:

src/cleveragents/agents/graphs/plan_generation.py

  • BoundedMemorySaver._prune — catches Exception silently, resets channel_versions = {}
  • PlanGenerationGraph._generate_plan (multiple occurrences) — catches Exception as best-effort fallback, sets original_content = None
  • PlanGenerationGraph._analyze_requirements — broad catch
  • PlanGenerationGraph._validate — broad catch
  • PlanGenerationGraph._analyze_contexts — broad catch

src/cleveragents/agents/graphs/context_analysis.py

  • ContextAnalysisAgent._load_files — catches Exception as exc, appends error string
  • ContextAnalysisAgent._analyze_dependencies — broad catch
  • ContextAnalysisAgent._score_relevance — broad catch
  • ContextAnalysisAgent._summarize_context — broad catch

src/cleveragents/agents/graphs/auto_debug.py

  • AutoDebugAgent._analyze_error — broad catch
  • AutoDebugAgent._generate_fix — broad catch
  • AutoDebugAgent._validate_fix — broad catch

Example evidence:

# plan_generation.py — BoundedMemorySaver._prune
except Exception:  # pragma: no cover - defensive cleanup
    channel_versions = {}

# plan_generation.py — PlanGenerationGraph._generate_plan
except Exception:  # pragma: no cover - best effort
    original_content = None

# context_analysis.py — ContextAnalysisAgent._load_files
except Exception as exc:  # pragma: no cover - defensive
    errors.append(f"Error loading {file_path}: {exc!s}")

Expected Behavior

Each except clause should catch only the specific exception types that are expected and can be meaningfully handled at that point. For example:

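  • File I/O operations should catch OSError, or a narrower subclass such as FileNotFoundError or PermissionError where only that failure can be handled (both are subclasses of OSError, so a tuple listing all three is redundant)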
  • LLM/provider calls should catch the specific exceptions raised by the LLM library
  • Memory/state operations should catch specific state-related exceptions

Exceptions that cannot be handled locally should be allowed to propagate to the top-level handler per the project's error-handling guidelines.
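
As a rough illustration of the file I/O case, here is a minimal sketch of a narrowed loader. The function name `load_files` and its return shape are assumptions made for this example; the real `ContextAnalysisAgent._load_files` may look quite different.

```python
from pathlib import Path


def load_files(paths: list[str]) -> tuple[dict[str, str], list[str]]:
    """Read each path as UTF-8 text, recording only expected I/O failures.

    Hypothetical helper for illustration; not the project's actual method.
    """
    contents: dict[str, str] = {}
    errors: list[str] = []
    for file_path in paths:
        try:
            contents[file_path] = Path(file_path).read_text(encoding="utf-8")
        except FileNotFoundError:
            # Expected and recoverable: report the missing file and keep going.
            errors.append(f"File not found: {file_path}")
        except (OSError, UnicodeDecodeError) as exc:
            # Other I/O or decoding failures (permissions, bad encoding, ...).
            errors.append(f"Error loading {file_path}: {exc!s}")
        # Anything else (TypeError, AttributeError, ...) is a bug and propagates.
    return contents, errors
```

`FileNotFoundError` is listed before `OSError` because Python picks the first matching handler. `UnicodeDecodeError` derives from `ValueError`, not `OSError`, so it has to be named separately.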

Acceptance Criteria

  • All except Exception: and except Exception as exc: clauses in plan_generation.py, context_analysis.py, and auto_debug.py are replaced with specific exception types
  • Each replacement exception type is the narrowest type that correctly handles the expected failure mode
  • No exception is silently swallowed without either re-raising or having a documented, meaningful recovery action
  • All existing tests continue to pass after the change
  • New BDD scenarios are added to cover the specific exception paths introduced
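
One way to meet the "no silent swallowing" criterion for provider calls is to catch a single known error, log it, and re-raise with the cause attached. The sketch below assumes two hypothetical exception classes, `ProviderTimeoutError` and `PlanGenerationError`, and a hypothetical `generate_plan` function; none of these names come from the codebase or from any LLM library.

```python
import logging

logger = logging.getLogger(__name__)


class ProviderTimeoutError(Exception):
    """Hypothetical stand-in for a specific error raised by an LLM client."""


class PlanGenerationError(Exception):
    """Hypothetical domain error surfaced by a graph node."""


def generate_plan(call_provider, prompt: str) -> str:
    """Call the provider, translating only the failure we know how to describe."""
    try:
        return call_provider(prompt)
    except ProviderTimeoutError as exc:
        # Documented recovery: log, then re-raise with the original cause attached.
        logger.warning("Provider timed out while generating a plan: %s", exc)
        raise PlanGenerationError("plan generation timed out") from exc
```

Any other exception raised by `call_provider` (for example a `KeyError` from a malformed response) is not caught here and reaches the top-level handler unchanged.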

Supporting Information

  • Related general issue: #3155 "Refactor: Replace broad exception handling with specific exception types"
  • Related specific issues: #3140 (event callback), #3137 (devcontainer cleanup)
  • Project error-handling guidelines: CONTRIBUTING.md § "Error and Exception Handling"

Subtasks

  • Audit all except Exception clauses in src/cleveragents/agents/graphs/plan_generation.py and identify the correct specific exception types for each
  • Audit all except Exception clauses in src/cleveragents/agents/graphs/context_analysis.py and identify the correct specific exception types for each
  • Audit all except Exception clauses in src/cleveragents/agents/graphs/auto_debug.py and identify the correct specific exception types for each
  • Replace broad exception handlers with specific types in plan_generation.py
  • Replace broad exception handlers with specific types in context_analysis.py
  • Replace broad exception handlers with specific types in auto_debug.py
  • Tests (Behave): Add BDD scenarios for specific exception paths in agent graph nodes
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
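
The audit subtasks could be partly automated with the standard-library `ast` module. The following is a sketch only: `find_broad_handlers` is a hypothetical helper written for this example, and it flags bare `except:`, `except Exception`, and `except BaseException` (including when they appear inside a tuple).

```python
import ast

BROAD_NAMES = {"Exception", "BaseException"}


def find_broad_handlers(source: str) -> list[tuple[int, str]]:
    """Return (line, description) for each bare or broad except handler."""
    findings: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:
            findings.append((node.lineno, "bare except"))
            continue
        # A handler may name one type or a tuple of types.
        candidates = node.type.elts if isinstance(node.type, ast.Tuple) else [node.type]
        for candidate in candidates:
            if isinstance(candidate, ast.Name) and candidate.id in BROAD_NAMES:
                findings.append((node.lineno, f"except {candidate.id}"))
    return sorted(findings)
```

Passing the text of each of the three modules to this function would give a line-numbered checklist to work through. It only finds the handlers; choosing the correct narrow type for each one still needs a human or reviewer decision.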

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly (fix(error-handling): replace broad exception handling in agent graph nodes), followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly (fix/error-handling-agent-graphs-broad-exceptions).
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage >= 97%.

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

Author
Owner

⚠️ Orphan Notice — Parent Epic Needs Manual Linking

No parent Epic was specified when this issue was created. This issue was discovered during autonomous bug hunting and requires a human maintainer to assign it to the appropriate parent Epic.

Suggested candidates (based on the affected code — agent graph nodes):

  • #397 Epic: Server & Autonomy Infrastructure — covers agent graph infrastructure
  • #362 Epic: Security & Safety Hardening — covers robustness and error handling

Please open the appropriate parent Epic and add this issue (#3183) under its "depends on" list, or open this issue and add the parent under "blocks." The correct direction is: this issue (#3183) blocks the parent Epic (the Epic cannot be complete until this child is done).


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

Author
Owner

Closing as duplicate of #1824.

Both issues target the same problem: overly broad except Exception: handlers in agent graph modules (context_analysis.py, plan_generation.py). This issue (#3183) adds auto_debug.py to the scope, but that additional file can be addressed as a follow-up or by expanding #1824's scope.

#1824 was filed first, already has a milestone (v3.6.0), and covers the core problem. The auto_debug.py scope from this issue has been noted on #1824.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Author
Owner

Label compliance fix applied:

  • Removed conflicting label: State/Unverified (groomer had incorrectly added this)
  • Kept: State/Verified (the existing state)
  • Reason: Issue had both State/Unverified and State/Verified — conflicting state labels. Per CONTRIBUTING.md, exactly one State/* label is required. State/Verified is the more advanced state and was the original label.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Reference
cleveragents/cleveragents-core#3183