UAT: OOM exit code (137) not specifically detected in ContainerToolExecutor — OOM kills reported as generic "exit code 137" with no diagnostic #6157

Open
opened 2026-04-09 15:47:35 +00:00 by HAL9000 · 0 comments
Owner

Bug Report

Feature Area: Container Tool Execution — Error Handling
Severity: Non-critical (backlog)
Found by: UAT Testing (container-tool-execution worker)


What Was Tested

Code-level analysis of ContainerToolExecutor._run_command() and execute_tool() error handling against the spec's container error handling requirements.

Expected Behavior

When a container process is killed by the OOM (out-of-memory) killer, Docker/Podman reports exit code 137 (128 + 9, where 9 is the signal number of SIGKILL). The system should detect this exit code and produce a structured, actionable error message indicating that the container was likely killed for running out of memory, rather than a generic "exit code 137" message.
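The 128 + signal convention can be checked with Python's standard `signal` module. This is a minimal sketch for illustration: the helper name `signal_from_exit_code` is hypothetical and not part of `ContainerToolExecutor`, and it assumes a POSIX host where `SIGKILL` is defined. Note that 137 only shows the process received SIGKILL. An OOM kill is a common cause, not the only one.

```python
import signal
from typing import Optional


def signal_from_exit_code(exit_code: int) -> Optional[signal.Signals]:
    """Return the signal implied by a shell-style exit code (128 + N), if any.

    Hypothetical helper, not taken from the cleveragents code base.
    """
    if exit_code <= 128:
        # Ordinary exit status; no signal is encoded.
        return None
    try:
        return signal.Signals(exit_code - 128)
    except ValueError:
        # The remainder is not a known signal number on this platform.
        return None


if __name__ == "__main__":
    for code in (137, 139, 143, 1):
        print(code, signal_from_exit_code(code))
```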

Actual Behavior

execute_tool() handles non-zero exit codes generically:

# container_executor.py lines 261-274
if exec_result.exit_code != 0:
    logger.warning(
        "container_tool_exec_failed",
        tool_name=tool_name,
        exit_code=exec_result.exit_code,
    )
    partial_output = self._parse_output(exec_result.stdout)
    return ToolResult(
        success=False,
        output=partial_output,
        error=(
            f"Container execution failed (exit code "
            f"{exec_result.exit_code}): "
            f"{exec_result.stderr[:500]}"
        ),
        ...
    )

Exit code 137 (OOM kill) is treated identically to any other non-zero exit code. There is no special handling for:

  • Exit code 137 (OOM kill / SIGKILL)
  • Exit code 139 (segfault / SIGSEGV)
  • Exit code 143 (SIGTERM)

Users will see "Container execution failed (exit code 137)" with no indication that the container ran out of memory.

Code Location

  • src/cleveragents/tool/container_executor.py lines 261-274 — generic non-zero exit handling
  • src/cleveragents/tool/container_executor.py, ContainerMetadata model: no oom_killed field

Expected Fix

Add exit code classification to execute_tool():

  • Exit code 137: "Container was killed (OOM or SIGKILL). Consider increasing memory limits."
  • Exit code 139: "Container process crashed (segfault)."
  • Exit code 143: "Container process was terminated (SIGTERM)."

Add oom_killed: bool field to ContainerMetadata for structured reporting.
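One possible shape for the classification is sketched below. It is not the project's implementation. The `ContainerMetadata` dataclass is a stand-in that shows only the proposed field, and `build_error` and its `runtime_oom_flag` parameter are hypothetical names. The hint strings are the ones listed in this issue, and the error prefix matches the existing message format.

```python
from dataclasses import dataclass
from typing import Tuple

# Hint messages copied from the "Expected Fix" list in this issue.
EXIT_CODE_HINTS = {
    137: "Container was killed (OOM or SIGKILL). Consider increasing memory limits.",
    139: "Container process crashed (segfault).",
    143: "Container process was terminated (SIGTERM).",
}


@dataclass
class ContainerMetadata:
    """Hypothetical stand-in for the real model; only the new field matters here."""
    exit_code: int
    oom_killed: bool = False


def build_error(
    exit_code: int, stderr: str, runtime_oom_flag: bool = False
) -> Tuple[str, ContainerMetadata]:
    """Build an error string and metadata for a non-zero container exit.

    runtime_oom_flag is an assumption: a boolean the caller obtains from the
    container runtime. Exit code 137 alone does not prove an OOM kill, so the
    flag is never inferred from the exit code.
    """
    hint = EXIT_CODE_HINTS.get(exit_code, "")
    detail = stderr[:500].strip()
    # Keep the existing "(exit code N): <stderr>" shape, with the hint first.
    suffix = " ".join(part for part in (hint, detail) if part)
    error = f"Container execution failed (exit code {exit_code}): {suffix}"
    return error, ContainerMetadata(exit_code=exit_code, oom_killed=runtime_oom_flag)


if __name__ == "__main__":
    message, meta = build_error(137, "Killed\n", runtime_oom_flag=True)
    print(message)
    print(meta)
```

Keeping `oom_killed` separate from the exit code matters because any SIGKILL yields 137. With Docker, the `State.OOMKilled` boolean in `docker inspect` output could supply the flag. Whether `ContainerToolExecutor` has access to that data is not stated in this issue.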


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.6.0 milestone 2026-04-09 21:17:33 +00:00
Reference
cleveragents/cleveragents-core#6157