BUG-HUNT: [boundary] InlineToolExecutor._run_with_timeout passes tool.code as a command-line argument — execution fails silently for large inline scripts exceeding OS ARG_MAX #6597

Open
opened 2026-04-09 21:57:23 +00:00 by HAL9000 · 0 comments
Owner

Bug Report: Boundary — Tool Code Passed via argv Exceeds OS ARG_MAX Limit

Severity Assessment

  • Impact: Any inline tool whose code field exceeds the OS argument-size limits will fail to spawn the subprocess with OSError: [Errno 7] Argument list too long. The total limit (ARG_MAX) is typically ~2 MB on Linux and ~256 KB on macOS, but Linux additionally caps each individual argument at 128 KiB (MAX_ARG_STRLEN, 32 pages of 4 KiB), so on Linux the failure starts at roughly 128 KiB of code. The executor catches this as a generic OSError and returns a confusing "Failed to start subprocess" message with no indication that code length is the cause.
  • Likelihood: Medium — large inline tools (e.g., embedding data tables, long switch-case logic, or template strings) can exceed the limit. The limit varies by OS and is not enforced with any code-size check before the Popen call.
  • Priority: Medium

Location

  • File: src/cleveragents/skills/inline_executor.py
  • Function: InlineToolExecutor._run_with_timeout
  • Lines: ~323–338 (subprocess.Popen call)

Description

The inline executor passes both input_data (as JSON) and tool.code as command-line arguments (sys.argv[1] and sys.argv[2]) to the Python subprocess:

# inline_executor.py  lines 323–338
proc = subprocess.Popen(
    [
        sys.executable,
        "-I",
        "-c",
        wrapper,
        json.dumps(input_data),   # sys.argv[1]: JSON input
        tool.code or "",          # sys.argv[2]: ENTIRE tool code string
    ],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    env=child_env,
)

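On Linux, the maximum combined size of the arguments and environment passed to execve() is ARG_MAX, which is typically 2,097,152 bytes (2 MB); it is derived from the stack size limit and can be lower on some systems. Linux also limits each individual argument string to MAX_ARG_STRLEN, which is 32 pages, or 131,072 bytes (128 KiB) with 4 KiB pages. On macOS ARG_MAX is 262,144 bytes (256 KB).

If tool.code alone exceeds the per-argument limit, or tool.code plus json.dumps(input_data) and the environment exceed ARG_MAX, subprocess.Popen raises: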

OSError: [Errno 7] Argument list too long
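
The failure can be reproduced outside the executor. The following is a minimal sketch, assuming a POSIX system; spawn_errno is a hypothetical helper written for this illustration and is not part of inline_executor.py. It starts a trivial child interpreter with a single argument of a chosen size and reports the errno if the spawn fails.

```python
import errno
import subprocess
import sys


def spawn_errno(arg_size: int):
    """Spawn a trivial child with one argument of arg_size bytes.

    Returns None if the child started, otherwise the errno of the OSError.
    """
    try:
        proc = subprocess.Popen(
            [sys.executable, "-I", "-c", "pass", "x" * arg_size],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError as exc:
        return exc.errno
    proc.wait()
    return None


if __name__ == "__main__":
    # A small argument should start normally; a 4 MB one should be rejected.
    for size in (1024, 4 * 1024 * 1024):
        err = spawn_errno(size)
        label = "ok" if err is None else errno.errorcode.get(err, str(err))
        print(f"{size:>8} bytes -> {label}")
```

Varying arg_size around 128 KiB on Linux should show where the per-argument cap takes effect on a given machine.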

This is caught by the generic OSError handler:

except OSError as exc:
    ...
    return InlineToolResult(
        success=False,
        error_message=f"Failed to start subprocess: {exc}",  # misleading
        ...
    )
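
One way the handler could report the real cause is to branch on the errno. This is only a sketch: describe_spawn_error is a hypothetical helper, and the wording of the messages is an assumption rather than anything taken from the codebase.

```python
import errno


def describe_spawn_error(exc: OSError, code_bytes: int) -> str:
    """Build an error message that names code size when argv was too large."""
    if exc.errno == errno.E2BIG:
        # E2BIG is errno 7, "Argument list too long".
        return (
            f"Inline tool code is too large to pass as a command-line "
            f"argument ({code_bytes} bytes): {exc}"
        )
    # Any other spawn failure keeps the existing generic message.
    return f"Failed to start subprocess: {exc}"
```

The except OSError branch could then pass the result as error_message, with code_bytes computed from the UTF-8 encoding of tool.code.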

There is no pre-check on code size, and the validate_tool() method only checks that tool.code is non-empty:

def validate_tool(self, tool: SkillInlineTool) -> list[str]:
    ...
    if not tool.code:
        errors.append("Inline tool must have non-empty code")
    # ← no max size check

Expected Behavior

Either reject inline tools exceeding a documented code-size limit with a clear error message, or use a mechanism that is not subject to ARG_MAX (e.g., write code to a temp file and pass the file path as argv, or pipe the code via stdin).
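
The stdin approach mentioned above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the WRAPPER source, the JSON payload shape, the run_via_stdin name, and the convention that tool code reads input_data and binds result. The real wrapper in inline_executor.py is not shown in this report and may work differently.

```python
import json
import subprocess
import sys

# Hypothetical wrapper: reads one JSON document from stdin, executes the code
# with `input_data` in scope, and prints whatever the code bound to `result`.
WRAPPER = """
import json, sys
payload = json.load(sys.stdin)
scope = {"input_data": payload["input"]}
exec(payload["code"], scope)
print(json.dumps(scope.get("result")))
"""


def run_via_stdin(code: str, input_data: dict, timeout: float = 30.0):
    """Run `code` in a child interpreter, sending code and input over stdin."""
    payload = json.dumps({"code": code, "input": input_data})
    proc = subprocess.Popen(
        [sys.executable, "-I", "-c", WRAPPER],  # only the short wrapper is in argv
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, err = proc.communicate(payload, timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the killed child
        raise
    if proc.returncode != 0:
        raise RuntimeError(f"child failed: {err.strip()}")
    return json.loads(out)


if __name__ == "__main__":
    # About 3 MB of source, far more than a single argv entry can carry.
    big_code = "result = input_data['x'] * 2\n" + "# padding\n" * 300_000
    print(run_via_stdin(big_code, {"x": 21}))
```

A trade-off of this design is that the tool code can no longer read its own data from stdin, because the wrapper consumes the stream before the code runs.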

Actual Behavior

Large tool code causes OSError: Argument list too long which is surfaced as "Failed to start subprocess: [Errno 7] Argument list too long" — a confusing error with no hint that code length is the problem.

Suggested Fix

Option A — Add a pre-flight size check in validate_tool():

_MAX_INLINE_CODE_BYTES = 64 * 1024  # 64 KiB: below the 128 KiB per-argument cap on Linux and the 256 KB macOS ARG_MAX, with headroom for the input JSON and environment

def validate_tool(self, tool: SkillInlineTool) -> list[str]:
    errors = []
    if not tool.code:
        errors.append("Inline tool must have non-empty code")
    code_bytes = len((tool.code or "").encode("utf-8"))
    if code_bytes > _MAX_INLINE_CODE_BYTES:
        errors.append(
            f"Inline tool code is too large ({code_bytes} bytes); "
            f"maximum is {_MAX_INLINE_CODE_BYTES} bytes"
        )
    ...
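
A fixed constant cannot account for the size of input_data or the environment, which share the same budget. As a complement, the executor could estimate whether a specific argv will fit before calling Popen. The sketch below assumes a POSIX system; argv_fits is a hypothetical helper, the 8-byte pointer size is an assumption, and the kernel's own accounting may differ slightly, so treat the result as a conservative estimate.

```python
import os
import sys
from typing import List, Mapping, Optional


def argv_fits(args: List[str], env: Optional[Mapping[str, str]] = None) -> bool:
    """Conservatively estimate whether args and env fit the exec limits."""
    env = os.environ if env is None else env
    arg_sizes = [len(os.fsencode(a)) + 1 for a in args]  # +1 for the NUL
    env_sizes = [len(os.fsencode(f"{k}={v}")) + 1 for k, v in env.items()]

    if sys.platform.startswith("linux"):
        # Linux caps every single argument or environment string at 32 pages.
        per_string = 32 * os.sysconf("SC_PAGE_SIZE")
        if max(arg_sizes + env_sizes, default=0) > per_string:
            return False

    # The argv/envp pointer arrays also count; 8 bytes per pointer is assumed.
    pointers = 8 * (len(arg_sizes) + len(env_sizes) + 2)
    total = sum(arg_sizes) + sum(env_sizes) + pointers
    return total <= os.sysconf("SC_ARG_MAX")
```

Calling argv_fits with the same list and child_env that would be passed to Popen allows a clear error to be returned before the spawn is attempted.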

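Option B — Write the code to a temp file and run it as a script instead of passing it via argv:

import os
import subprocess
import sys
import tempfile

# Write the tool code to disk so only the short file path travels through argv.
with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False, encoding="utf-8") as f:
    f.write(tool.code or "")
    code_path = f.name
try:
    proc = subprocess.Popen([sys.executable, "-I", code_path, ...], ...)
    # Wait for the child here (e.g. proc.communicate(timeout=...)) before cleanup.
finally:
    # Unlinking right after Popen() returns can race with the child opening the script.
    os.unlink(code_path)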

Category

boundary

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added this to the v3.2.0 milestone 2026-04-09 22:13:19 +00:00
Reference
cleveragents/cleveragents-core#6597