UAT: ToolRunner.execute() does not enforce require_sandbox safety profile field — write tools run unsandboxed without error #4882

Open
opened 2026-04-08 20:13:36 +00:00 by HAL9000 · 0 comments
Owner

Bug Report

Feature Area: Sandbox and Checkpoint — require_sandbox safety profile enforcement
Severity: Medium — safety profile constraint is enforced in one code path but silently bypassed in the primary execution path


What Was Tested

Code-level analysis of require_sandbox enforcement across both tool execution paths:

  • src/cleveragents/tool/runner.py: ToolRunner.execute() (primary path used by PlanExecutor)
  • src/cleveragents/tool/lifecycle.py: ToolRuntime._enforce_capabilities() (secondary path)

Expected Behavior (from spec)

The require_sandbox field on SafetyProfile is documented as:

Whether a sandbox is required for execution. When True, tools with writes=True must not execute unless a sandbox is active for the plan.
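
The documented rule can be read as a single predicate over three inputs. The following is a minimal sketch of that reading, not code from the repository: the SafetyProfile class here is a hypothetical stand-in carrying only the one field under discussion, and violates_sandbox_requirement is an illustrative name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SafetyProfile:
    """Hypothetical stand-in; the real model has more fields."""

    require_sandbox: bool = False


def violates_sandbox_requirement(
    profile: Optional[SafetyProfile],
    tool_writes: bool,
    sandbox_active: bool,
) -> bool:
    """Return True when the documented rule says the tool must not run.

    The call is refused only when all of these hold: a profile is present,
    it sets require_sandbox, the tool writes, and no sandbox is active.
    """
    return bool(
        profile is not None
        and profile.require_sandbox
        and tool_writes
        and not sandbox_active
    )


if __name__ == "__main__":
    strict = SafetyProfile(require_sandbox=True)
    # Enumerate the write/sandbox combinations under a strict profile.
    for writes in (False, True):
        for active in (False, True):
            refused = violates_sandbox_requirement(strict, writes, active)
            print(f"writes={writes} sandbox_active={active} refused={refused}")
```

Under this reading, read-only tools and calls made without any profile are never refused; only the combination "strict profile, write tool, no sandbox" is.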

The spec and the safety_profile.py docstring both state:

ToolRuntime._enforce_capabilities enforces the resolved profile's constraints at tool activation and execution time.

Actual Behavior (from code)

ToolRuntime._enforce_capabilities() (lifecycle.py:770) correctly enforces require_sandbox:

# 5. Sandbox requirement via safety profile.
if (
    ctx.safety_profile is not None
    and ctx.safety_profile.require_sandbox
    and cap.writes
    and ctx.sandbox_id is None
):
    raise ToolSandboxRequiredError(...)

ToolRunner.execute() (runner.py:220) — the primary path used by PlanExecutor — does not check require_sandbox at all. It only creates checkpoints for write tools:

# Determine if this tool writes to the sandbox.
_tool_writes = spec.capabilities.writes

# before_tool_execute checkpoint: fire before any write tool runs.
if _tool_writes and self._is_trigger_active("before_tool_execute"):
    self._try_create_tool_checkpoint(...)

There is no require_sandbox check in ToolRunner.execute(). The ToolRunner does not accept a safety_profile parameter at all.

Impact

When PlanExecutor uses ToolRunner (the standard path for both stub and runtime execution), the require_sandbox=True constraint from the safety profile is silently ignored. Write tools execute against real resources even when the safety profile mandates sandbox isolation.

The ToolRuntime path (which does enforce it) is only used when tools are invoked through the ToolRuntime class directly, which is not the standard plan execution path.

Code Location

  • ToolRunner.execute(): src/cleveragents/tool/runner.py:220 (missing enforcement)
  • ToolRuntime._enforce_capabilities(): src/cleveragents/tool/lifecycle.py:770 (correct enforcement)
  • SafetyProfile.require_sandbox: src/cleveragents/domain/models/core/safety_profile.py:103

Steps to Reproduce

  1. Create a safety profile with require_sandbox=True
  2. Create a plan using that safety profile, with no sandbox configured
  3. Execute the plan using PlanExecutor with a write tool
  4. Observe: the write tool executes without error, despite require_sandbox=True
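
The steps above could be captured as a regression check. The sketch below shows only the shape of such a check: it does not import the real PlanExecutor or ToolRunner, and the two execute functions are hypothetical stand-ins (one modelling the unguarded behavior described in this report, one modelling the guarded behavior).

```python
from typing import Callable, Optional


class ToolSandboxRequiredError(RuntimeError):
    """Stand-in for the project's error type; its real signature is not shown here."""


# Hypothetical call shape: (require_sandbox, tool_writes, sandbox_id) -> result
ExecuteFn = Callable[[bool, bool, Optional[str]], str]


def write_tool_is_blocked(execute: ExecuteFn) -> bool:
    """Run the reproduction: strict profile, write tool, no sandbox.

    Returns True if the call was refused, False if the tool ran.
    """
    try:
        execute(True, True, None)
    except ToolSandboxRequiredError:
        return True
    return False


def unguarded_execute(
    require_sandbox: bool, tool_writes: bool, sandbox_id: Optional[str]
) -> str:
    """Models the reported behavior: no require_sandbox check before running."""
    return "executed"


def guarded_execute(
    require_sandbox: bool, tool_writes: bool, sandbox_id: Optional[str]
) -> str:
    """Models the expected behavior: refuse write tools without a sandbox."""
    if require_sandbox and tool_writes and sandbox_id is None:
        raise ToolSandboxRequiredError("write tool requires an active sandbox")
    return "executed"


if __name__ == "__main__":
    print("unguarded blocked:", write_tool_is_blocked(unguarded_execute))
    print("guarded blocked:", write_tool_is_blocked(guarded_execute))
```

A real regression test would replace the stand-ins with a PlanExecutor run and assert that the write tool is refused.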

Fix Direction

ToolRunner.__init__() should accept an optional safety_profile: SafetyProfile | None parameter. In ToolRunner.execute(), before running the handler, add:

if (
    self._safety_profile is not None
    and self._safety_profile.require_sandbox
    and _tool_writes
    and not sandbox_ref  # no sandbox active
):
    raise ToolSandboxRequiredError(...)
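
To make the proposal concrete, here is a self-contained sketch of a runner that carries the profile and applies the check before the handler runs. Every class in it is a simplified, hypothetical stand-in: the real ToolRunner, ToolSpec, and ToolSandboxRequiredError have constructors and fields not shown in this report, and the sandbox_ref parameter on execute() is an assumption about how the active sandbox would be passed in.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ToolSandboxRequiredError(RuntimeError):
    """Stand-in for the project's error type."""


@dataclass(frozen=True)
class SafetyProfile:  # hypothetical stand-in
    require_sandbox: bool = False


@dataclass(frozen=True)
class Capabilities:  # hypothetical stand-in
    writes: bool = False


@dataclass(frozen=True)
class ToolSpec:  # hypothetical stand-in
    name: str
    capabilities: Capabilities
    handler: Callable[[], str]


class ToolRunner:
    """Simplified model of the runner with the proposed check added."""

    def __init__(self, safety_profile: Optional[SafetyProfile] = None) -> None:
        # Optional, so callers that pass no profile keep today's behavior.
        self._safety_profile = safety_profile

    def execute(self, spec: ToolSpec, sandbox_ref: Optional[str] = None) -> str:
        _tool_writes = spec.capabilities.writes

        # Proposed check: refuse write tools when the profile requires a
        # sandbox and none is active. Runs before the handler is invoked.
        if (
            self._safety_profile is not None
            and self._safety_profile.require_sandbox
            and _tool_writes
            and not sandbox_ref
        ):
            raise ToolSandboxRequiredError(
                f"tool {spec.name!r} writes but no sandbox is active"
            )

        return spec.handler()


if __name__ == "__main__":
    write_tool = ToolSpec("write_file", Capabilities(writes=True), lambda: "written")
    runner = ToolRunner(SafetyProfile(require_sandbox=True))
    try:
        runner.execute(write_tool)
    except ToolSandboxRequiredError as exc:
        print("refused:", exc)
    print(runner.execute(write_tool, sandbox_ref="sbx-1"))
```

Keeping safety_profile optional with a default of None means existing call sites that construct the runner without a profile are unaffected.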

Alternatively, PlanExecutor could check require_sandbox before calling ToolRunner.execute() and raise a PlanError if the plan uses a write tool and no sandbox is configured.
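
A rough sketch of this pre-flight variant follows. It is a sketch under assumptions: PlanError, SafetyProfile, and PlannedTool are hypothetical stand-ins, and check_sandbox_preflight is an illustrative helper name rather than an existing PlanExecutor method.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


class PlanError(Exception):
    """Stand-in for the project's plan-level error type."""


@dataclass(frozen=True)
class SafetyProfile:  # hypothetical stand-in
    require_sandbox: bool = False


@dataclass(frozen=True)
class PlannedTool:  # hypothetical stand-in for a tool referenced by a plan
    name: str
    writes: bool


def check_sandbox_preflight(
    profile: Optional[SafetyProfile],
    tools: Sequence[PlannedTool],
    sandbox_id: Optional[str],
) -> None:
    """Raise PlanError before execution if any write tool would run unsandboxed."""
    if profile is None or not profile.require_sandbox or sandbox_id is not None:
        return  # No constraint applies, or a sandbox is already configured.

    offending = [tool.name for tool in tools if tool.writes]
    if offending:
        raise PlanError(
            "require_sandbox is set but no sandbox is configured; "
            f"write tools in plan: {', '.join(offending)}"
        )


if __name__ == "__main__":
    plan_tools = [PlannedTool("read_file", False), PlannedTool("write_file", True)]
    try:
        check_sandbox_preflight(SafetyProfile(require_sandbox=True), plan_tools, None)
    except PlanError as exc:
        print("plan rejected:", exc)
```

The trade-off between the two options: a pre-flight check fails the whole plan before any tool runs, while a check inside ToolRunner.execute() also covers callers that use the runner without going through PlanExecutor.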


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.5.0 milestone 2026-04-08 20:19:02 +00:00
Reference
cleveragents/cleveragents-core#4882