UAT: plan_lifecycle_service._run_estimation() swallows all estimation actor failures with only a WARNING log — estimation errors never surface to user or plan error details #5739

Open
opened 2026-04-09 08:56:01 +00:00 by HAL9000 · 2 comments
Owner

Summary

src/cleveragents/application/services/plan_lifecycle_service.py at line 380 catches all exceptions from the estimation actor and only logs a WARNING without re-raising or updating the plan's error details. This means estimation failures (e.g., actor not found, LLM API errors, timeout) are silently swallowed and the plan proceeds without cost/time estimates.

Code Location

# plan_lifecycle_service.py lines 318-386
def _run_estimation(self, plan: Plan) -> None:
    """Run the Estimation Actor if configured."""
    # ... setup code ...
    try:
        ...  # run estimation actor (elided in this excerpt)
    except Exception:
        self._logger.warning(
            "estimation_actor_failed",
            plan_id=plan.identity.plan_id,
            estimation_actor=actor_name,
            exc_info=True,
        )
        # <-- No re-raise! Plan continues without estimates

Expected Behavior (per CONTRIBUTING.md and spec)

Per CONTRIBUTING.md:

Only catch exceptions when you can meaningfully handle them (e.g., retry logic, resource cleanup, adding context). Otherwise, let them propagate.

The spec states that estimation is a best-effort operation, but failures should be:

  1. Logged at ERROR level (not WARNING) since this is a service-level failure
  2. Recorded in the plan's error details so users can see that estimation failed
  3. Visible in agents plan status output

Actual Behavior

When the estimation actor fails:

  1. A WARNING is logged (only visible with -vvv or higher verbosity)
  2. The plan proceeds without cost/time estimates
  3. The user has no visibility into the estimation failure
  4. agents plan status shows no estimation data and does not explain why
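
The behavior listed above can be reproduced in isolation with a small stand-in. This is only a sketch: `SwallowingService`, `FailingEstimationActor`, `ListHandler`, and `reproduce` are hypothetical names, and the standard library `logging` module is used in place of the service's own logger. It mirrors the catch-and-warn shape described in this issue, not the real `_run_estimation()` body.

```python
import logging


class FailingEstimationActor:
    """Hypothetical stand-in for an estimation actor that always fails."""

    def run(self, plan_id: str) -> dict:
        raise TimeoutError("LLM API timed out")


class SwallowingService:
    """Hypothetical stand-in mirroring the catch-and-warn shape in this issue."""

    def __init__(self, actor) -> None:
        self._actor = actor
        self._logger = logging.getLogger("estimation_demo")
        # Stand-in for the plan's error details store.
        self.error_details: dict = {}

    def run_estimation(self, plan_id: str) -> None:
        try:
            self._actor.run(plan_id)
        except Exception:
            # Only a WARNING is logged; nothing is recorded or re-raised.
            self._logger.warning(
                "estimation_actor_failed plan_id=%s", plan_id, exc_info=True
            )


class ListHandler(logging.Handler):
    """Collects log records so the emitted levels can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def reproduce():
    """Run the failing actor and return (logged level names, error details)."""
    handler = ListHandler()
    logger = logging.getLogger("estimation_demo")
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    service = SwallowingService(FailingEstimationActor())
    try:
        service.run_estimation("plan-1")  # returns normally despite the failure
    finally:
        logger.removeHandler(handler)
    return [r.levelname for r in handler.records], service.error_details
```

Calling `reproduce()` returns without raising. The only trace of the failure is a single WARNING record, and the error details stay empty, which matches points 1 to 3 above.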

Contrast with Invariant Reconciliation

The _run_invariant_reconciliation() method (lines 388-516) correctly:

  • Logs at ERROR level
  • Emits an INVARIANT_VIOLATED event
  • Re-raises as ReconciliationBlockedError

The estimation method should follow a similar pattern (though re-raising may not be appropriate if estimation is truly optional).

Fix Required

except Exception:
    self._logger.error(  # ERROR, not WARNING
        "estimation_actor_failed",
        plan_id=plan.identity.plan_id,
        estimation_actor=actor_name,
        exc_info=True,
    )
    # Record in plan error details for visibility
    try:
        self.update_error_details(
            plan.identity.plan_id,
            {"estimation_failed": True, "estimation_actor": actor_name}
        )
    except Exception:
        pass  # Best-effort
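
A self-contained sketch of the same idea, suitable for a unit test, might look like the following. `PlanService`, `failing_actor`, and the `fail_on_record` switch are hypothetical stand-ins, and standard library `logging` replaces the service's logger. Only `update_error_details` and the two detail keys come from the snippet above. As a variation (an assumption, not part of the proposed fix), the inner handler logs a WARNING instead of using a bare `pass`, so a failure to record the details is not itself swallowed silently.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("estimation_fix_demo")


def failing_actor(plan_id: str) -> None:
    """Hypothetical estimation actor that always fails."""
    raise TimeoutError("LLM API timed out")


class PlanService:
    """Hypothetical stand-in for the plan lifecycle service."""

    def __init__(self, actor: Callable[[str], None], fail_on_record: bool = False) -> None:
        self._actor = actor
        self._fail_on_record = fail_on_record  # simulates a broken details store
        self.error_details: dict = {}

    def update_error_details(self, plan_id: str, details: dict) -> None:
        if self._fail_on_record:
            raise RuntimeError("storage unavailable")
        self.error_details.setdefault(plan_id, {}).update(details)

    def run_estimation(self, plan_id: str, actor_name: str) -> None:
        try:
            self._actor(plan_id)
        except Exception:
            # ERROR, not WARNING: a service-level failure.
            logger.error(
                "estimation_actor_failed plan_id=%s estimation_actor=%s",
                plan_id, actor_name, exc_info=True,
            )
            # Record the failure so it can be shown to the user (best-effort).
            try:
                self.update_error_details(
                    plan_id,
                    {"estimation_failed": True, "estimation_actor": actor_name},
                )
            except Exception:
                # Assumption: log rather than `pass`, so this is not silent.
                logger.warning(
                    "estimation_error_details_not_recorded plan_id=%s",
                    plan_id, exc_info=True,
                )
        # No re-raise: estimation stays best-effort and the plan continues.
```

With `PlanService(failing_actor).run_estimation("plan-1", "estimator")`, the call returns normally and `error_details["plan-1"]` holds both keys. With `fail_on_record=True`, the call still returns normally and the details stay empty, but the recording failure is logged.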

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.5.0 milestone 2026-04-09 09:05:17 +00:00
Author
Owner

Label compliance fix applied:

  • Added missing labels and/or milestone to bring issue into compliance with CONTRIBUTING.md

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: backlog-groomer

Author
Owner

Hierarchical Compliance Fix: This issue was detected as an orphan (no parent Epic).

Solution: Linked to Epic #5502 (Actor Execution & Configuration — GRAPH Actor, Config Schema & CLI Compliance) based on scope alignment — estimation actor failure handling is part of actor execution.

Hierarchy: Issue #5739 → Epic #5502


Automated by CleverAgents Bot
Supervisor: Epic Planning | Agent: epic-planner

Reference
cleveragents/cleveragents-core#5739