feat(sandbox): implement overlay filesystem sandbox strategy #994
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!994
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feature/m5-overlay-sandbox"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Implement the overlay filesystem sandbox strategy with OverlayFS support and userspace fallback.
Implementation
OverlaySandbox(infrastructure/sandbox/overlay.py, 497 lines):/proc/filesystems+os.geteuid() == 0checkshutil.copytreethe original into merged dir, tracks changes viafilecmpdiff on commitcreate(): sets up directory structure, mounts if availablecommit(): copies changed/added files from overlay to original, removes deleted filesrollback(): unmounts (or removes) merged, recreates from scratchcleanup(): unmounts, removes all temp dirs, idempotentDomain Model Updates
OVERLAY = "overlay"toSandboxStrategyenum in bothresource_type.pyandresource.pySTRATEGY_OVERLAYtoSandboxFactory, registered forfs-mount,fs-directory,fs-fileresourcesTests
Quality Gates
nox -s lintnox -s typechecknox -s unit_testsnox -s coverage_reportCloses #880
1a466e2fd5551e280180551e280180db8767382ePM Status — Day 37 — Rebase Required
This PR has merge conflicts and cannot be merged in its current state. 42% of all open PRs (21 of 50) have conflicts — this is a project-wide issue that must be resolved.
@brent.edwards — Please rebase this PR onto
masterby Day 39 EOD (2026-03-19). If you cannot rebase by then, please post a comment explaining the blocker.PM rebase request — Day 37
db8767382eb98568bd1aCode Review Report -- PR #994: feat(sandbox): implement overlay filesystem sandbox strategy
Branch:
feature/m5-overlay-sandbox| Issue: #880 | Reviewer scope: All code changes in this branch + close connections to surrounding code | Spec reference:docs/specification.mdThis review was conducted over two full cycles scanning all problem categories (bugs, regressions, security, performance, test coverage, spec compliance) in each cycle. Findings are organized by severity then by category.
CRITICAL
Regression Bugs (Bundled
plan.py/plan_lifecycle_service.pyChanges)The commit bundles several simplifications to
plan.pyandplan_lifecycle_service.pythat revert fixes previously merged into master by other contributors. These are not related to the overlay sandbox feature itself.[REG-1]
list_plans()database fallback removed --plan_lifecycle_service.py:8002258075("fix(service): add database fallback to PlanLifecycleService.list_plans()").list_plans()now only reads from the in-memory_plansdict. Since the DI container creates PlanLifecycleService viaproviders.Factory(new instance per CLI invocation), every new process starts with an empty dict.plan usein one CLI invocation are invisible toplan list,plan execute(auto-select), andplan apply(auto-select) in subsequent invocations. This breaks the fundamental cross-process plan discovery workflow.[REG-2] Plan override persistence removed from
use_action()--plan.py:~1614f07d347("fix(cli): persist plan overrides and run strategize inline in plan execute").service._commit_plan(plan)call after applying automation-profile, actor, and execution-environment overrides.plan use --automation-profile Xdo not survive toplan execute(separate process). The second process loads the plan from the database without the overrides.[REG-3] Shared
lifecycle_serviceparameter removed from_get_plan_executor()--plan.py:~1218ff2d824("fix(cli): share PlanLifecycleService instance between CLI handler and PlanExecutor").HIGH
Regression Bugs
[REG-4]
PreflightRejectionexception handler removed --plan.pyexecute_plan()f0bdc3c("fix(cli): load persisted actions in start_strategize").PreflightRejection(which extends bareException, notCleverAgentsError) is no longer caught. Users see opaque "Error [500] INTERNAL" tracebacks instead of actionable pre-flight failure messages.[REG-5] Generic exception catch-all removed --
plan.pyexecute_plan()except Exceptionhandler that produced clean "Unexpected error:" messages was removed. Any unexpected exception now produces a raw traceback.[REG-6]
lifecycle_apply_planno longer handles auto-progressed plans --plan.py:~1829complete_execute()'sauto_progress()call will never be found byplan applyauto-select, producing "No plans ready for apply."[REG-7]
execute_planno longer completes in a single invocation for strategize-queued plans --plan.py:~1728plan executecompleting the full strategize-through-execute pipeline in one call. Users must now invokeplan executetwice.Overlay Sandbox Bugs
[BUG-1] Sandbox unusable after rollback -- violates protocol contract --
overlay.py:325-367rollback()transitions toSandboxStatus.ROLLED_BACK. The state machine definesROLLED_BACK: [ACTIVE, CLEANED_UP], and the protocol docstring states: "After rollback the sandbox may be re-used."get_path()(the only method that transitions to ACTIVE) only accepts CREATED or ACTIVE states. There is no code path to re-enter ACTIVE from ROLLED_BACK.CopyOnWriteSandbox(its docstring explicitly promises re-activation viaget_pathbut the code prevents it). Consistency with the existing implementation is acknowledged, but both implementations violate the protocol contract.Error Handling
[ERR-1]
suppress(Exception)is dangerously broad --plan_lifecycle_service.py:853suppress(NotFoundError)tosuppress(Exception).DatabaseError,ConnectionError,TypeError,AttributeError, and other programming/infrastructure errors.MEDIUM
Overlay Sandbox Bugs
[BUG-2]
commit()messageparameter silently ignored --overlay.py:251CommitResult. Callers passing a message get no indication it was discarded.[BUG-3] Empty directories not preserved during commit --
overlay.py:_compute_diff()os.walk()only yields file entries. If a new empty directory is created in the sandbox, it won't appear inadded_filesand won't be created in the original during commit. Similarly, directories that become empty after file deletion won't be removed from the original.[BUG-4] Orphaned parent directories after file deletion during commit --
overlay.py:292-296os.remove(dst)is called. Parent directories that become empty as a result are left behind in the original tree.Security
[SEC-1] Symlinks can escape sandbox boundary --
overlay.py:174shutil.copytree(symlinks=True), which preserves symbolic links. If the original directory contains symlinks pointing outside its tree (e.g.,/etc/shadow,../../../sensitive), those symlinks are copied into the merged directory, giving sandbox operations access to files outside the intended resource boundary.symlinks=False) or validate that symlink targets remain within the original path.[SEC-2] No boundary validation for
original_path--overlay.py:110original_pathis within a project boundary or registered resource. In combination with real OverlayFS mounts (which require root), this could mount overlays on arbitrary filesystem locations.Performance
[PERF-1]
_is_overlayfs_available()probe not cached --overlay.py:33-77OverlaySandbox.__init__()invocation runs a full mount probe:tempfile.mkdtemp(),os.makedirs()(4 dirs), reads/proc/filesystems, and if root, performsmount+umount+shutil.rmtree.functools.lru_cacheor sentinel variable would eliminate this overhead for all but the first instantiation.[PERF-2] Full directory tree walk during commit (real OverlayFS mode) --
overlay.py:_compute_diff()Test Coverage
[TEST-1] Real OverlayFS code paths have zero test coverage --
overlay_sandbox.feature,helper_overlay_sandbox.py_use_real_overlay is False(the userspace fallback). The_mount_overlay(),_unmount_overlay(), real-overlayrollback()(unmount/recreate/remount), and real-overlaycleanup()(unmount before rmtree) paths are never executed._is_overlayfs_availableto return True and verifies thatsubprocess.runis called with the correct mount/umount arguments.[TEST-2] No temp directory cleanup in BDD scenarios --
overlay_sandbox_steps.pycontext.ovl_tmpdiris created ingiven_ovl_test_directorybut never cleaned up. There is no@after_scenariohook to remove temp directories./tmp/ovl-test-*.[TEST-3] Tests depend on private implementation attributes --
overlay_sandbox_steps.py,helper_overlay_sandbox.py_merged_dir,_upper_dir,_work_dir,_base_dir,_use_real_overlay. These are implementation details, not protocol-defined properties.Spec Compliance
[SPEC-1]
overlaynot in resource type YAML JSON Schema enum -- spec line 34404sandbox_strategydefines:["git_worktree", "copy_on_write", "transaction_rollback", "snapshot", "none"]."overlay"is absent. Resource type YAML files specifyingsandbox_strategy: overlaywould fail schema validation against the spec's canonical schema.[SPEC-2] Missing
checkpoint()/restore_checkpoint()methods -- overlay.pycheckpoint()andrestore_checkpoint().LOW
Overlay Sandbox Bugs
[BUG-5] Cleanup may attempt unmount on never-mounted path --
overlay.py:cleanup()create()failed during_mount_overlay(),_use_real_overlayis still True. Cleanup tries_unmount_overlay()on a path that was never mounted, producing a confusing warning log. Not harmful but noisy.[BUG-6] Race condition in OverlayFS availability probe --
overlay.py:_is_overlayfs_available()mountandumountin the probe, the temporary overlay mount is leaked. Thefinallyblock doesshutil.rmtreewhich cannot remove a mountpoint. Low probability but worth noting.Performance
[PERF-3] Full directory copy for userspace fallback on creation --
overlay.py:174shutil.copytreecopies the entire original directory into the merged layer on sandbox creation. For large project directories (thousands of files), this is expensive both in I/O and disk space.Test Coverage
[TEST-4] No test for nested subdirectory operations
new_file.txt,existing.txt). No scenarios test paths likesrc/deep/nested/file.py.[TEST-5] No test for symlink handling
[TEST-6] No test for binary file handling
[TEST-7] No test for double
create()callsandbox.create()twice raisesSandboxStateError(CREATED -> CREATED is not a valid transition).[TEST-8]
commit(message=...)never tested[TEST-9] Robot helper tests don't verify stderr --
overlay_sandbox.robotresult.stdoutfor success markers. If a Python exception occurs, it writes to stderr, which is never asserted.Spec Compliance
[SPEC-3] Sandbox strength ordering doesn't include overlay -- spec line 24767
git_worktree > snapshot > copy_on_write > transaction_rollback. Overlay is not placed in this ordering, which may affect virtual resource resolution.[SPEC-4]
fs-directory/fs-filenot explicitly listed with overlay in spec strategy tables -- spec line 24932fs-mount. The implementation adds overlay support tofs-directoryandfs-fileas well. While the security properties table (line 45925) describes overlay generically for "Filesystems supporting overlay mounts," the resource-specific tables are inconsistent.Summary
Recommendation
The overlay sandbox implementation (
overlay.py) is solid in its core design -- the directory structure, lifecycle management, and userspace fallback are well-engineered. However:plan.py/plan_lifecycle_service.pychanges (REG-1 through REG-7) revert multiple previously-merged bug fixes and should be separated from this PR or reverted. These affect core plan lifecycle functionality unrelated to sandbox overlay support.suppress(Exception)broadening (ERR-1) should be narrowed back to specific exception types._is_overlayfs_available()caching (PERF-1) and the re-activation after rollback gap (BUG-1) are the highest-priority fixes within the overlay code itself.b98568bd1a7dedf9a0937dedf9a09321fd9b1c9c21fd9b1c9c5b84c86dcd5b84c86dcdb98568bd1ab98568bd1aa669a99185a669a99185e1c75bd60eChanges Applied (REQUEST_CHANGES resolved)
Step 1: Reverted unrelated regressions
Restored the following 9 files to their
masterversions — these were accidentally included in the squashed overlay commit:CHANGELOG.mdbenchmarks/bench_uko_layer3.pydocs/timeline.mdfeatures/consolidated_resource.featurefeatures/consolidated_sandbox.featurefeatures/steps/plan_lifecycle_commands_coverage_steps.pyfeatures/steps/sandbox_factory_coverage_steps.pysrc/cleveragents/application/services/plan_lifecycle_service.pysrc/cleveragents/cli/commands/plan.pyThe commit now touches only the 10 overlay-sandbox files.
Step 2: Overlay-specific fixes
get_path()from ROLLED_BACKoverlay.py:233-237SandboxStatus.ROLLED_BACKto allowed states so sandboxes can be re-used after rollback_is_overlayfs_available()overlay.py:33@functools.lru_cache(maxsize=1)— avoids repeated/proc/filesystemsreads and mount probesoverlay.py:181,362symlinks=True→symlinks=Falsein bothcreate()androllback()userspace fallback pathsoverlay_sandbox_steps.py:31context.add_cleanup(shutil.rmtree, context.ovl_tmpdir)to prevent temp dir leaksplan_lifecycle_service.pysuppress(Exception)issue was in an unrelated fileVerification
nox -s lint— passed (0 errors)nox -s typecheck— passed (0 errors, 1 pre-existing warning instrategy_registry.py)overlay.py— 499 lines (under 500 limit)# type: ignoredirectivese1c75bd60e88785aef9988785aef997013e1f060Review Fixes Applied — Commit
7013e1f0Rebase onto master (resolves REG-1 through REG-7, ERR-1)
The branch has been rebased onto current
origin/master(ad98d41d). This automatically resolved all critical and high regression findings (REG-1 through REG-7, ERR-1) because the regressions were caused by stale versions ofplan.pyandplan_lifecycle_service.pybundled into the commit. After rebase, these files match master exactly and are no longer modified by this PR.The branch now modifies only overlay-sandbox-specific files:
src/cleveragents/infrastructure/sandbox/overlay.py(new)src/cleveragents/infrastructure/sandbox/factory.py(registers overlay)src/cleveragents/infrastructure/sandbox/__init__.py(exports)src/cleveragents/infrastructure/sandbox/protocol.py(metadata field on CommitResult)src/cleveragents/domain/models/core/resource.py/resource_type.py(overlay strategy)features/overlay_sandbox.feature+features/steps/overlay_sandbox_steps.pyrobot/overlay_sandbox.robot+robot/helper_overlay_sandbox.pyvulture_whitelist.py,CHANGELOG.mdOverlay-specific fixes
plan.pyandplan_lifecycle_service.pyno longer modifiedsuppress(Exception)was in plan_lifecycle_service.pyROLLED_BACKwas already inget_path()allowed statessymlinks=Falsealready in bothcreate()androllback()@lru_cache(maxsize=1)already applied to_is_overlayfs_available()context.add_cleanup(shutil.rmtree, ...)already presentmetadatafield toCommitResult;commit()now stores message## UnreleasedQuality Gates
nox -s lint— PASSnox -s typecheck— PASS (0 errors)overlay.py— 498 lines (under 500 limit)ad98d41dCode Review — PR #994
feat(sandbox): implement overlay filesystem sandbox strategyWell-implemented feature with a clean architecture: OverlayFS detection → real mount or userspace fallback → commit/rollback lifecycle. The 22 Behave scenarios + 6 Robot integration tests provide thorough coverage of the full lifecycle including edge cases (path traversal, empty commit, idempotent cleanup, state transitions).
Approved with minor notes:
filecmp.dircmpshallow comparison — In_compute_diff(),filecmp.dircmpdefaults toshallow=True, which compares only file metadata (size, mtime) rather than content. This means files with the same size and modification time but different content would not be detected as changed. Consider usingshallow=Falsefor content-based comparison, or document this as a known limitation.Robot helper temp dir cleanup — Functions like
_overlay_lifecycle(),_overlay_rollback(), etc. createtempfile.mkdtemp()but don't clean up infinallyblocks. On test failure, these temp directories will leak. The Behave steps correctly usecontext.add_cleanup(shutil.rmtree, ...)— consider applying the same pattern in the Robot helpers.Dual
SandboxStrategyenums —OVERLAY = "overlay"is added to bothresource.py:SandboxStrategyandresource_type.py:SandboxStrategy. This is a pre-existing design issue (not introduced by this PR), but worth noting as tech debt — two parallel enum definitions will eventually drift.7013e1f060b7cc214b94New commits pushed, approval review dismissed automatically according to repository settings
Rebased onto
origin/master(79b0a2c5). CHANGELOG conflict resolved (kept master, re-added PR entry).resource_type.pyauto-merged cleanly.nox -s lintPASS,nox -s typecheckPASS (0 errors). Commitb7cc214b.Fixed 6 failing unit tests, boosted coverage, and merged latest
master.Unit test failures (all related to the new OVERLAY strategy not being reflected in existing assertions):
consolidated_resource.feature:386— "SandboxStrategy has exactly five values" → updated to six, addedoverlayto the values listconsolidated_resource.feature:403— "SandboxStrategy rejects overlay" → changed to "accepts overlay"consolidated_sandbox.feature:761— "overlay strategy is now unknown and rejected" → changed to "overlay strategy is supported" with new step definitionconsolidated_sandbox.feature:813/819/825— fs-mount/fs-directory/fs-file "support copy-on-write and none" → updated to "copy-on-write, overlay, and none" with new step definitionCoverage boost (96.9% → targeting 97%):
Added 5 new Behave scenarios to
overlay_sandbox.featurecovering:get_pathwhenmerged_diris None (line 248)SandboxCommitError(lines 299-303)SandboxRollbackError(lines 370-374)SandboxCreationError(lines 187-192)Also: Merged latest
origin/master(resolved CHANGELOG.md conflict).Benchmark log: Server died at 25% — no code fix needed.