feat(sandbox): implement sandbox boundary algebra and domain computation #548

Closed
opened 2026-03-04 01:02:24 +00:00 by freemo · 1 comment
Owner

Metadata

| Field | Value |
|-------|-------|
| **Type** | Feature |
| **Priority** | High |
| **MoSCoW** | Should Have |
| **Points** | 8 |
| **Milestone** | v3.5.0 |
| **Assignee** | freemo |
| **Parent Epic** | #369 (Epic: Large Project Autonomy & Context) |
| **Depends On** | None (builds on existing sandbox infrastructure) |

Background

The specification (§ Core Concepts > Resource DAG > Sandbox Boundaries, lines 24657-24672) defines a formal algebra for determining sandbox boundaries:

  • sandbox_boundary(r): Walk up containment edges in the resource DAG to the nearest sandboxable ancestor of resource r.
  • Sandbox domains: All resources sharing the same sandbox_boundary share one sandbox instance.
  • SandboxManager keyed by (plan_id, sandbox_boundary_id): ensures one sandbox per plan per boundary.
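
The upward walk described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Resource` class, its `sandboxable` flag, and the single `parent` containment edge are hypothetical stand-ins, and the sketch assumes a resource that is itself sandboxable acts as its own boundary.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a resource DAG node (not the project's real class)."""

    id: str
    sandboxable: bool = False
    parent: Optional["Resource"] = None  # one containment edge, assumed for simplicity


def sandbox_boundary(r: Resource) -> Resource:
    """Walk up containment edges from r to the nearest sandboxable resource."""
    node: Optional[Resource] = r
    while node is not None:
        if node.sandboxable:
            return node
        node = node.parent
    # Assumption: a resource with no sandboxable ancestor is an error.
    raise LookupError(f"no sandboxable ancestor for resource {r.id!r}")


repo = Resource("repo", sandboxable=True)
src = Resource("src", parent=repo)
main_py = Resource("main.py", parent=src)

# Both nested resources resolve to the same boundary, so they share one domain.
assert sandbox_boundary(main_py) is repo
assert sandbox_boundary(src) is repo
```

Raising when no sandboxable ancestor exists is one possible choice here; it matches the "error case" listed under Testing.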

Current state: The sandbox infrastructure exists (infrastructure/sandbox/) with SandboxManager, SandboxFactory, GitWorktreeSandbox, CopyOnWriteSandbox, etc. However, there is no implementation of sandbox_boundary(r) or automatic domain computation. The current code requires explicit sandbox configuration per resource rather than automatic boundary inference from the resource DAG.

A grep for sandbox_boundary across the Python files returns no hits.

Acceptance Criteria

  1. sandbox_boundary(resource) -> Resource function walks up containment edges to the nearest sandboxable ancestor.
  2. Resources are grouped into sandbox domains: all resources with the same boundary share one sandbox instance.
  3. SandboxManager keying updated from (plan_id, resource_id) to (plan_id, sandbox_boundary_id).
  4. When multiple resources in the same domain need sandboxing, they reuse the existing sandbox instance.
  5. New resources added to a plan automatically inherit their boundary's sandbox.
  6. Boundary computation is cached per plan execution (resource DAG doesn't change mid-execution).
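
Criteria 2 to 4 could look roughly like the sketch below. `compute_domains` and `KeyedSandboxPool` are hypothetical names chosen for illustration (the real `SandboxManager` API is not shown in this issue), and the boundary lookup is passed in as a plain callable so the example stays self-contained.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def compute_domains(
    resource_ids: Iterable[str], boundary_of: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group resource ids by boundary id: one entry per sandbox domain."""
    domains: Dict[str, List[str]] = defaultdict(list)
    for rid in resource_ids:
        domains[boundary_of(rid)].append(rid)
    return dict(domains)


class KeyedSandboxPool:
    """Illustrative pool keyed by (plan_id, boundary_id); not the real SandboxManager."""

    def __init__(
        self,
        boundary_of: Callable[[str], str],
        create: Callable[[Tuple[str, str]], object],
    ) -> None:
        self._boundary_of = boundary_of
        self._create = create
        self._sandboxes: Dict[Tuple[str, str], object] = {}

    def get(self, plan_id: str, resource_id: str) -> object:
        # Key by the boundary, not the resource, so a whole domain shares one sandbox.
        key = (plan_id, self._boundary_of(resource_id))
        if key not in self._sandboxes:
            self._sandboxes[key] = self._create(key)
        return self._sandboxes[key]


# Hypothetical precomputed boundaries: resource id -> boundary id.
boundaries = {"a.py": "repo-1", "b.py": "repo-1", "c.md": "repo-2"}
pool = KeyedSandboxPool(boundaries.__getitem__, create=lambda key: {"key": key})

assert pool.get("plan-1", "a.py") is pool.get("plan-1", "b.py")      # same domain: reused
assert pool.get("plan-1", "a.py") is not pool.get("plan-1", "c.md")  # different boundary
assert pool.get("plan-2", "a.py") is not pool.get("plan-1", "a.py")  # different plan
```

Because the key is derived on every lookup, a resource added to the plan later resolves to its boundary's existing sandbox without extra configuration, which is the behaviour criterion 5 asks for.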

Subtasks

1. Design

  • Define "sandboxable" resource type attribute
  • Design boundary walk algorithm (upward containment traversal)
  • Design domain grouping and sandbox reuse logic

2. Implementation

  • Implement sandbox_boundary() function over resource DAG
  • Implement domain grouping (resources → boundary → shared sandbox)
  • Update SandboxManager keying to use boundary IDs
  • Add caching for boundary computation

3. Testing

  • Unit test: single resource with direct sandboxable parent
  • Unit test: nested resources sharing distant boundary
  • Unit test: multiple boundaries in one plan (separate domains)
  • Unit test: resource with no sandboxable ancestor (error case)
  • Integration test: multi-resource plan with boundary algebra
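
The four unit cases above can be pinned down on a small fixture DAG. The sketch below uses plain assert-style test functions and a dictionary-based `boundary` helper; the fixture ids, the helper, and the choice of `LookupError` for the error case are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set

# Hypothetical fixture: child id -> containing parent id, plus the sandboxable ids.
PARENT: Dict[str, Optional[str]] = {
    "repo-a": None,
    "repo-b": None,
    "scratch": None,
    "a/src": "repo-a",
    "a/src/core": "a/src",
    "a/src/core/x.py": "a/src/core",
    "b/docs": "repo-b",
    "scratch/tmp.txt": "scratch",
}
SANDBOXABLE: Set[str] = {"repo-a", "repo-b"}


def boundary(rid: str) -> str:
    """Illustrative boundary lookup over the fixture above."""
    node: Optional[str] = rid
    while node is not None:
        if node in SANDBOXABLE:
            return node
        node = PARENT[node]
    raise LookupError(f"no sandboxable ancestor for {rid!r}")


def test_direct_sandboxable_parent() -> None:
    assert boundary("a/src") == "repo-a"


def test_nested_resources_share_distant_boundary() -> None:
    assert boundary("a/src/core/x.py") == "repo-a"
    assert boundary("a/src/core") == boundary("a/src/core/x.py")


def test_multiple_boundaries_are_separate_domains() -> None:
    assert boundary("b/docs") == "repo-b"
    assert boundary("b/docs") != boundary("a/src")


def test_no_sandboxable_ancestor_is_an_error() -> None:
    try:
        boundary("scratch/tmp.txt")
    except LookupError:
        return
    raise AssertionError("expected LookupError")


for test in (
    test_direct_sandboxable_parent,
    test_nested_resources_share_distant_boundary,
    test_multiple_boundaries_are_separate_domains,
    test_no_sandboxable_ancestor_is_an_error,
):
    test()
```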

4. Documentation

  • Docstrings with boundary algebra specification
  • Configuration guide for marking resource types as sandboxable

5. Integration

  • Wire into existing SandboxManager and SandboxFactory
  • Verify compatibility with GitWorktree and CopyOnWrite strategies

6. Observability

  • Log boundary computation results
  • Log sandbox reuse events
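
One way to cover both logging items is to emit a record when a boundary is resolved and a second one that distinguishes creation from reuse. The sketch uses only the standard `logging` module; the logger name, the `acquire_sandbox` helper, and the message formats are hypothetical.

```python
import logging
from typing import Callable, Dict, Tuple

logger = logging.getLogger("sandbox.boundary")  # hypothetical logger name


def acquire_sandbox(
    pool: Dict[Tuple[str, str], object],
    plan_id: str,
    resource_id: str,
    boundary_id: str,
    create: Callable[[], object],
) -> object:
    """Return the sandbox for (plan_id, boundary_id), logging creation vs. reuse."""
    logger.debug("boundary resolved: resource=%s boundary=%s", resource_id, boundary_id)
    key = (plan_id, boundary_id)
    if key in pool:
        logger.info(
            "sandbox reused: plan=%s boundary=%s resource=%s",
            plan_id, boundary_id, resource_id,
        )
        return pool[key]
    pool[key] = create()
    logger.info(
        "sandbox created: plan=%s boundary=%s resource=%s",
        plan_id, boundary_id, resource_id,
    )
    return pool[key]
```

Keeping the resolution record at debug level and the create/reuse record at info level is a judgment call; the levels would follow whatever the existing sandbox code already uses.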

7. Security

  • Boundary computation must not allow sandbox escape (resource cannot bypass its boundary)

Definition of Done

  • All acceptance criteria met
  • All subtask checkboxes checked
  • Tests pass in CI
  • Code reviewed and approved
freemo added this to the v3.5.0 milestone 2026-03-04 01:03:30 +00:00
freemo self-assigned this 2026-03-04 01:41:13 +00:00
Author
Owner

Implementation Complete — PR #557

All acceptance criteria have been met:

  1. sandbox_boundary(resource) walks up containment edges to the nearest sandboxable ancestor (resource with capabilities.sandboxable == True and sandbox_strategy != "none")
  2. compute_sandbox_domains() groups resources by shared boundary into sandbox domains
  3. SandboxManager updated with resolve_sandbox_key() that keys by (plan_id, sandbox_boundary_id) instead of (plan_id, resource_id)
  4. BoundaryCache provides per-plan-execution caching of boundary computation results
  5. Error handling for cycles, orphaned resources (no sandboxable ancestor), and invalid inputs
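
For readers without access to the PR, the following is a rough sketch of how a per-plan cache with cycle and orphan detection might fit together. It is not the code in `boundary.py`: `Node`, `is_boundary`, and `PlanBoundaryCache` are hypothetical names, the flat `sandboxable` field stands in for `capabilities.sandboxable`, and the `"git_worktree"` strategy string is an invented example value.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Optional, Set


@dataclass(frozen=True)
class Node:
    """Hypothetical resource record; field names are assumptions."""

    id: str
    parent_id: Optional[str] = None
    sandboxable: bool = False  # stands in for capabilities.sandboxable
    sandbox_strategy: str = "none"


def is_boundary(node: Node) -> bool:
    # Mirrors the stated condition: sandboxable and strategy other than "none".
    return node.sandboxable and node.sandbox_strategy != "none"


class PlanBoundaryCache:
    """Memoises boundary lookups for one plan execution (illustrative only)."""

    def __init__(self, nodes: Mapping[str, Node]) -> None:
        self._nodes = nodes
        self._cache: Dict[str, str] = {}

    def boundary_id(self, resource_id: str) -> str:
        path: List[str] = []
        seen: Set[str] = set()
        current: Optional[str] = resource_id
        while current is not None:
            if current in self._cache:  # an ancestor was already resolved
                found = self._cache[current]
                break
            if current in seen:
                raise ValueError(f"containment cycle detected at {current!r}")
            seen.add(current)
            node = self._nodes.get(current)
            if node is None:
                raise KeyError(f"unknown resource {current!r}")
            path.append(current)
            if is_boundary(node):
                found = current
                break
            current = node.parent_id
        else:
            # Ran off the top of the containment chain without finding a boundary.
            raise LookupError(f"no sandboxable ancestor for {resource_id!r}")
        for rid in path:  # cache every resource visited on the way up
            self._cache[rid] = found
        return found


nodes = {
    "repo": Node("repo", None, True, "git_worktree"),
    "pkg": Node("pkg", "repo"),
    "mod": Node("mod", "pkg"),
}
cache = PlanBoundaryCache(nodes)
assert cache.boundary_id("mod") == "repo"
assert cache.boundary_id("pkg") == "repo"  # served from the cache
```

Creating one cache instance per plan execution keeps the memoised results valid only for as long as the resource DAG is assumed not to change.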

New files:

  • src/cleveragents/infrastructure/sandbox/boundary.py — Core boundary algebra implementation
  • features/sandbox_boundary_algebra.feature — 26 BDD scenarios
  • robot/sandbox_boundary_algebra.robot — 5 integration test cases
  • benchmarks/sandbox_boundary_bench.py — ASV performance benchmarks

Quality gates:

  • Lint: passing
  • Typecheck: 0 errors (Pyright strict)
  • Unit tests: 8160 scenarios passed
  • Integration tests: boundary tests all pass
  • Coverage: 97.0% (meets threshold)

PR: #557 (https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/557)
Reference
cleveragents/cleveragents-core#548