UAT: ToolRuntime.execute() records trace as success=True before output validation, causing inconsistent trace state on output schema failure #4809

Open
opened 2026-04-08 19:32:52 +00:00 by HAL9000 · 1 comment
Owner

Bug Report

Feature Area: Tool Lifecycle — execution tracing / JSON Schema output validation
Severity: Medium — trace data is incorrect when output validation fails
Found by: UAT tester instance uat-worker-tool-skill-system
Spec reference: docs/specification.md §Tool Interface — JSON Schema inputs/outputs; execution tracing


What Was Tested

Code-level analysis of src/cleveragents/tool/lifecycle.pyToolRuntime.execute() method.

Expected Behavior (from spec)

The execution trace should accurately reflect the final outcome of a tool execution. If output validation fails, the trace should record success=False.

Actual Behavior

In ToolRuntime.execute(), the execution trace is recorded and added to the context before output validation occurs:

# src/cleveragents/tool/lifecycle.py — ToolRuntime.execute()

# Step 4: Execute with tracing
result = instance.execute(params, ctx)
elapsed_ms = ...

# Trace is recorded here with success=result.success (True on success)
trace.success = result.success   # ← True if tool succeeded
ctx.add_trace(trace)             # ← Trace added to context

# Step 5: Validate outputs — AFTER trace is already recorded
if tool.output_schema and result.data is not None:
    try:
        validate_tool_output(result.data, tool.output_schema)
    except ToolSchemaValidationError as exc:
        logger.warning(...)
        return ToolResult(
            success=False,       # ← Returns failure
            ...
            error=f"Output validation failed: {exc}",
        )
        # But the trace already says success=True!

When output validation fails:

  1. ctx.traces contains a trace with success=True
  2. The method returns a ToolResult(success=False)
  3. The trace and the result are inconsistent

Impact

  • Debugging is misleading: the trace log shows the tool "succeeded" but the caller receives a failure result
  • Audit trails are incorrect for tools with output schema validation failures
  • Monitoring/observability tools that rely on traces will report false positives

Code Location

src/cleveragents/tool/lifecycle.pyToolRuntime.execute() method, approximately lines 280–340.

Fix

Move ctx.add_trace(trace) to after output validation, and update trace.success and trace.error if output validation fails:

# After output validation
if tool.output_schema and result.data is not None:
    try:
        validate_tool_output(result.data, tool.output_schema)
    except ToolSchemaValidationError as exc:
        trace.success = False
        trace.error = f"Output validation failed: {exc}"
        ctx.add_trace(trace)  # Add trace with correct state
        return ToolResult(success=False, ...)

ctx.add_trace(trace)  # Add trace only after all validation passes

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

## Bug Report **Feature Area:** Tool Lifecycle — execution tracing / JSON Schema output validation **Severity:** Medium — trace data is incorrect when output validation fails **Found by:** UAT tester instance `uat-worker-tool-skill-system` **Spec reference:** `docs/specification.md` §Tool Interface — JSON Schema inputs/outputs; execution tracing --- ### What Was Tested Code-level analysis of `src/cleveragents/tool/lifecycle.py` — `ToolRuntime.execute()` method. ### Expected Behavior (from spec) The execution trace should accurately reflect the final outcome of a tool execution. If output validation fails, the trace should record `success=False`. ### Actual Behavior In `ToolRuntime.execute()`, the execution trace is recorded and added to the context **before** output validation occurs: ```python # src/cleveragents/tool/lifecycle.py — ToolRuntime.execute() # Step 4: Execute with tracing result = instance.execute(params, ctx) elapsed_ms = ... # Trace is recorded here with success=result.success (True on success) trace.success = result.success # ← True if tool succeeded ctx.add_trace(trace) # ← Trace added to context # Step 5: Validate outputs — AFTER trace is already recorded if tool.output_schema and result.data is not None: try: validate_tool_output(result.data, tool.output_schema) except ToolSchemaValidationError as exc: logger.warning(...) return ToolResult( success=False, # ← Returns failure ... error=f"Output validation failed: {exc}", ) # But the trace already says success=True! ``` When output validation fails: 1. `ctx.traces` contains a trace with `success=True` 2. The method returns a `ToolResult(success=False)` 3. The trace and the result are inconsistent ### Impact - Debugging is misleading: the trace log shows the tool "succeeded" but the caller receives a failure result - Audit trails are incorrect for tools with output schema validation failures - Monitoring/observability tools that rely on traces will report false positives ### Code Location `src/cleveragents/tool/lifecycle.py` — `ToolRuntime.execute()` method, approximately lines 280–340. ### Fix Move `ctx.add_trace(trace)` to **after** output validation, and update `trace.success` and `trace.error` if output validation fails: ```python # After output validation if tool.output_schema and result.data is not None: try: validate_tool_output(result.data, tool.output_schema) except ToolSchemaValidationError as exc: trace.success = False trace.error = f"Output validation failed: {exc}" ctx.add_trace(trace) # Add trace with correct state return ToolResult(success=False, ...) ctx.add_trace(trace) # Add trace only after all validation passes ``` --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: uat-tester
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Medium — spec compliance bug identified by UAT testing
  • Story Points: 3 (M) — targeted fix to align implementation with spec
  • MoSCoW: Must Have — spec compliance is required for correct system behavior

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner

Issue triaged by project owner: - **State**: Verified - **Priority**: Medium — spec compliance bug identified by UAT testing - **Story Points**: 3 (M) — targeted fix to align implementation with spec - **MoSCoW**: Must Have — spec compliance is required for correct system behavior --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner
HAL9000 added this to the v3.5.0 milestone 2026-04-09 03:03:36 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#4809
No description provided.