feat(tool): add move_file, create_directory, and get_file_info built-in tools #1070

Closed
CoreRasurae wants to merge 1 commit from feature/m4-missing-builtin-tools into master
Member

Summary

  • Add builtin/file-move tool for moving/renaming files within sandbox boundaries using shutil.move with path traversal protection
  • Add builtin/dir-create tool for creating directories with optional parent creation using Path.mkdir with sandbox validation
  • Add builtin/file-info tool for retrieving comprehensive file metadata (size, permissions, timestamps, type) with ISO 8601 formatted dates

All three tools follow the existing ToolSpec pattern with proper JSON Schema definitions, capability metadata, and sandbox root validation. Added to ALL_FILE_TOOLS list and registration helpers. Updated docstring from "6 built-in file tools" to "9 built-in file tools".

Files Changed

  • src/cleveragents/tool/builtins/file_tools.py — Added 3 handler functions, 3 ToolSpec constants, updated ALL_FILE_TOOLS
  • src/cleveragents/tool/builtins/__init__.py — Added imports and __all__ entries for FILE_MOVE_SPEC, DIR_CREATE_SPEC, FILE_INFO_SPEC
  • features/builtin_file_tools_new.feature — 7 Behave BDD scenarios
  • features/steps/builtin_file_tools_new_steps.py — Step definitions
  • features/tool_builtins.feature — Updated registration test from 6 to 9 tools
  • robot/builtin_file_tools_new.robot — Robot integration tests
  • robot/helper_builtin_file_tools_new.py — Robot helper script
  • benchmarks/bench_builtin_file_tools_new.py — ASV benchmarks

Quality Gates

  • Lint: PASSED
  • Typecheck: PASSED (0 errors)
  • Unit Tests: PASSED (11462 scenarios, 0 failures)
  • Integration Tests: PASSED (1583/1584 — 1 pre-existing unrelated failure in "File Watching")
  • Coverage: 97% overall, 99% on file_tools.py

ISSUES CLOSED: #883

## Summary - Add `builtin/file-move` tool for moving/renaming files within sandbox boundaries using `shutil.move` with path traversal protection - Add `builtin/dir-create` tool for creating directories with optional parent creation using `Path.mkdir` with sandbox validation - Add `builtin/file-info` tool for retrieving comprehensive file metadata (size, permissions, timestamps, type) with ISO 8601 formatted dates All three tools follow the existing `ToolSpec` pattern with proper JSON Schema definitions, capability metadata, and sandbox root validation. Added to `ALL_FILE_TOOLS` list and registration helpers. Updated docstring from "6 built-in file tools" to "9 built-in file tools". ## Files Changed - `src/cleveragents/tool/builtins/file_tools.py` — Added 3 handler functions, 3 `ToolSpec` constants, updated `ALL_FILE_TOOLS` - `src/cleveragents/tool/builtins/__init__.py` — Added imports and `__all__` entries for `FILE_MOVE_SPEC`, `DIR_CREATE_SPEC`, `FILE_INFO_SPEC` - `features/builtin_file_tools_new.feature` — 7 Behave BDD scenarios - `features/steps/builtin_file_tools_new_steps.py` — Step definitions - `features/tool_builtins.feature` — Updated registration test from 6 to 9 tools - `robot/builtin_file_tools_new.robot` — Robot integration tests - `robot/helper_builtin_file_tools_new.py` — Robot helper script - `benchmarks/bench_builtin_file_tools_new.py` — ASV benchmarks ## Quality Gates - **Lint**: PASSED - **Typecheck**: PASSED (0 errors) - **Unit Tests**: PASSED (11462 scenarios, 0 failures) - **Integration Tests**: PASSED (1583/1584 — 1 pre-existing unrelated failure in "File Watching") - **Coverage**: 97% overall, 99% on `file_tools.py` ISSUES CLOSED: #883
feat(tool): add move_file, create_directory, and get_file_info built-in tools
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 21s
CI / lint (pull_request) Successful in 26s
CI / quality (pull_request) Successful in 30s
CI / typecheck (pull_request) Successful in 51s
CI / security (pull_request) Successful in 54s
CI / unit_tests (pull_request) Successful in 3m28s
CI / integration_tests (pull_request) Successful in 3m38s
CI / docker (pull_request) Successful in 56s
CI / e2e_tests (pull_request) Successful in 5m14s
CI / coverage (pull_request) Successful in 7m13s
CI / benchmark-regression (pull_request) Successful in 38m1s
e65c3866f8
Implemented three missing built-in resource tools from the specification:

- builtin/file-move: Move/rename files within sandbox boundaries using
  shutil.move with path traversal protection.
- builtin/dir-create: Create directories with optional parent creation
  using Path.mkdir with sandbox validation.
- builtin/file-info: Retrieve comprehensive file metadata (size,
  permissions, timestamps, type) with ISO 8601 formatted dates.

All tools follow the existing ToolSpec pattern with proper JSON Schema
definitions, capability metadata, and sandbox root validation. Added
to ALL_FILE_TOOLS and registration helpers.

Tests: 7+ Behave BDD scenarios, Robot integration tests, ASV benchmarks.

ISSUES CLOSED: #883
CoreRasurae added this to the v3.4.0 milestone 2026-03-19 14:55:42 +00:00
Author
Member

Code Review: PR #1070feat(tool): add move_file, create_directory, and get_file_info built-in tools

Reviewer: Automated review (on behalf of review request)
Scope: Code changes in branch feature/m4-missing-builtin-tools (commit e65c386) plus close connections to surrounding code.
References: Issue #883, docs/specification.md
Review cycles performed: 8 (bug detection, security, test coverage/flaws, performance, spec compliance, code quality, second global sweep, final global sweep)


Summary

The PR implements three missing built-in resource tools (builtin/file-move, builtin/dir-create, builtin/file-info) with 7 Behave scenarios, 4 Robot Framework smoke tests, and 3 ASV benchmark suites. The tool handler logic is straightforward, but the review identified 21 findings across 6 categories with issues ranging from a broken ChangeSet integration to missing test coverage for security-critical edge cases.

Finding counts: 4 High, 7 Medium, 10 Low


HIGH Severity

H1 — ChangeSet Integration Broken for builtin/file-move [Bug / Spec Compliance]

Location: src/cleveragents/tool/builtins/changeset.py (wrap_tool, line ~211) interacting with file_tools.py:246-258

ChangeSetCapture.wrap_tool() extracts the file path via inputs.get("path", ""). However, builtin/file-move uses source_path and destination_path — not path. When file-move is wrapped via register_file_tools_with_changeset():

  • path_str evaluates to "" (empty string) — no before/after hashes are computed.
  • _detect_operation("builtin/file-move", None, None, output) falls through all checks and returns "modify" (wrong — should be "move").
  • _normalize_path("", sandbox) returns "".
  • The resulting ChangeSetEntry has operation="modify", path="", and None hashes — completely wrong and useless.

The specification states: "Each built-in tool automatically: Records changes to the ChangeSet" (spec line ~19443). Additionally, _detect_operation has no branch for "move" even though ChangeSetEntry.operation documents "move" as a valid value and _map_operation knows how to map it to ChangeOperation.RENAME.

Suggested fix: wrap_tool needs a special case for tools with source_path/destination_path inputs, and _detect_operation needs a "move" in tool_name branch returning "move".


H2 — _handle_dir_create Always Returns created: True [Bug]

Location: file_tools.py:261-271

The handler unconditionally returns {"created": True} regardless of whether the directory already existed. Since exist_ok=True is always passed to mkdir(), the call succeeds silently for pre-existing directories. This is semantically incorrect — callers relying on this field to determine if a directory was actually created will get misleading information.

Suggested fix:

existed = path.exists()
path.mkdir(parents=parents, exist_ok=True)
return {"path": str(path), "created": not existed}

H3 — _handle_file_move Silently Overwrites Existing Destination [Bug]

Location: file_tools.py:246-258

shutil.move() overwrites the destination file without warning if it already exists. During plan execution, an agent moving a file to a path that already has content would silently destroy the original — a data loss risk with no safety net. No overwrite parameter exists to control this behavior.

Suggested fix: Either add an overwrite: bool = False parameter that rejects the move when destination exists and overwrite is False, or at minimum check destination.exists() before moving and raise a clear error.


H4 — Missing checkpointable and checkpoint_scope on Write Tools [Spec Compliance]

Location: file_tools.py:506-537 (FILE_MOVE_SPEC), file_tools.py:539-570 (DIR_CREATE_SPEC)

The specification's skill tool listing (spec line ~7250) shows all write-capable file tools have checkpoint_scope: file. The new tools only set writes=True but don't set checkpointable=True or checkpoint_scope="file". Without this, the sandbox/checkpoint system cannot properly track and rollback operations from these tools.


MEDIUM Severity

M1 — Missing encoding Field in file-info Output [Spec Compliance]

Location: file_tools.py:274-288

Issue #883 explicitly specifies that file-info "Returns: size, permissions, timestamps (created, modified, accessed), file type, encoding." The implementation does not return an encoding field. The output schema also omits it.


M2 — validate_path String Prefix Check is Bypassable [Security — Pre-existing, Extended by New Tools]

Location: file_tools.py:91

The check str(target).startswith(str(root)) is vulnerable to prefix collisions. If sandbox_root is /tmp/sandbox, then a path resolving to /tmp/sandbox_evil/file passes the check because the string /tmp/sandbox_evil/file starts with /tmp/sandbox.

While this is pre-existing code (not introduced by this PR), the new tools significantly extend the attack surface: file-move can now move files to/from escaped paths and dir-create can create directories outside the sandbox — both write operations.

Suggested fix: Use target.is_relative_to(root) (Python 3.9+) or compare with trailing separator: str(target).startswith(str(root) + os.sep) or target == root.


M3 — No Path Traversal Tests for dir-create and file-info [Test Coverage]

dir-create is a write operation and file-info can leak information about files outside the sandbox. Both need path traversal rejection tests. Currently only file-move destination traversal is tested.


M4 — No Source Path Traversal Test for file-move [Test Coverage]

Only the destination path traversal is tested (../../escaped.txt). A malicious source path (e.g., ../../etc/passwd) is not tested. This is particularly important given M2.


M5 — No Test for Overwriting Existing Destination in file-move [Test Coverage]

The silent overwrite behavior described in H3 has no test coverage. A scenario creating a file at the destination path before the move would expose this behavior.


M6 — No Test for dir-create on Already Existing Directory [Test Coverage]

The created: True bug (H2) has no test to catch it. A scenario creating a directory that already exists and asserting created is false would be a regression guard.


M7 — sandbox_root Inconsistently Included in input_schema [Code Quality]

Location: file_tools.py:506-604 (new specs) vs. file_tools.py:296-503 (existing specs)

The 6 existing tools omit sandbox_root from their input_schema properties (even though their handlers accept it). The 3 new tools explicitly declare it. This inconsistency means schema-validating callers would treat sandbox_root differently for old vs. new tools. The module docstring (line 23) says "All file tools accept an optional sandbox_root input parameter" — either all schemas should declare it or none should.


LOW Severity

L1 — st_ctime is Not Creation Time on Linux [Bug]

Location: file_tools.py:285

On Linux, st_ctime is the inode change time (last metadata change), not the file creation time. The field name created is misleading. Consider documenting this limitation or using st_birthtime where available (Python 3.12+ on some platforms).


L2 — _handle_file_move Does Not Validate Source is a File [Bug]

Location: file_tools.py:246-252

The description says "Move/rename a file" but shutil.move works on directories too. Nothing prevents moving a directory, making the behavior inconsistent with the stated description.


L3 — ISO Timestamp Validation in Tests is Too Weak [Test Flaw]

Location: features/steps/builtin_file_tools_new_steps.py:77

The assertion "T" in value would pass for any string containing the letter "T" (e.g., "True", "Test"). Should use datetime.fromisoformat(value) for proper validation.


L4 — Robot Tests Missing on_timeout=kill [Test Flaw]

Location: robot/builtin_file_tools_new.robot (all Run Process calls)

The established project pattern (visible in recent PRs #812, #1050, #1052) is to use on_timeout=kill on Run Process calls. Without it, a hung test could block CI indefinitely.


L5 — Robot Tests Do Not Verify Clean stderr [Test Flaw]

Tests only check stdout content but don't assert that stderr is clean. Unexpected warnings or errors on stderr would go undetected.


Symlink behavior (following symlinks to report target metadata vs. reporting the symlink itself) is untested.


L7 — No Test for Moving a Directory [Test Coverage]

If file-move should reject directories (per L2), a test should verify this. If it should accept them, the tool description should be updated.


L8 — Test Steps Do Not Support ChangeSet Capture [Test Coverage]

Location: features/steps/builtin_file_tools_new_steps.py:22-28

The _run_tool() helper always creates a fresh ToolRegistry without changeset capture, unlike the existing tool_builtins_steps.py which checks for context.changeset_capture. The changeset wrapping path is never tested for the new tools, which means H1 was not caught by tests.


L9 — Unused sandbox Parameter in Robot Helper [Code Quality]

Location: robot/helper_builtin_file_tools_new.py:19

_make_runner(sandbox) accepts a sandbox parameter but never uses it.


L10 — No CHANGELOG Entry [Process]

All other recent merged PRs in this repository include a CHANGELOG.md entry under ## Unreleased. This PR does not.


Findings Summary Table

ID Severity Category Description
H1 High Bug / Spec ChangeSet integration broken for file-move — empty path, wrong operation
H2 High Bug dir-create always returns created: True even when directory pre-existed
H3 High Bug file-move silently overwrites existing destination (data loss risk)
H4 High Spec Missing checkpointable=True and checkpoint_scope on write tools
M1 Medium Spec Missing encoding field in file-info output
M2 Medium Security validate_path prefix check bypassable (pre-existing, extended by new tools)
M3 Medium Test Coverage No path traversal tests for dir-create and file-info
M4 Medium Test Coverage No source path traversal test for file-move
M5 Medium Test Coverage No test for overwriting existing destination
M6 Medium Test Coverage No test for dir-create on existing directory
M7 Medium Code Quality sandbox_root inconsistently in input_schema (new tools have it, old tools don't)
L1 Low Bug st_ctime not creation time on Linux
L2 Low Bug file-move doesn't validate source is a file
L3 Low Test Flaw ISO timestamp validation too weak ("T" in value)
L4 Low Test Flaw Robot tests missing on_timeout=kill
L5 Low Test Flaw Robot tests don't verify clean stderr
L6 Low Test Coverage No test for file-info on symlinks
L7 Low Test Coverage No test for moving a directory
L8 Low Test Coverage Test steps don't support changeset capture
L9 Low Code Quality Unused sandbox parameter in _make_runner
L10 Low Process No CHANGELOG entry

Recommendation: H1 through H4 should be addressed before merge. M1-M7 are strongly recommended. Low-severity items can be deferred but are worth addressing in this PR where feasible.

# Code Review: PR #1070 — `feat(tool): add move_file, create_directory, and get_file_info built-in tools` **Reviewer:** Automated review (on behalf of review request) **Scope:** Code changes in branch `feature/m4-missing-builtin-tools` (commit `e65c386`) plus close connections to surrounding code. **References:** Issue #883, `docs/specification.md` **Review cycles performed:** 8 (bug detection, security, test coverage/flaws, performance, spec compliance, code quality, second global sweep, final global sweep) --- ## Summary The PR implements three missing built-in resource tools (`builtin/file-move`, `builtin/dir-create`, `builtin/file-info`) with 7 Behave scenarios, 4 Robot Framework smoke tests, and 3 ASV benchmark suites. The tool handler logic is straightforward, but the review identified **21 findings** across 6 categories with issues ranging from a broken ChangeSet integration to missing test coverage for security-critical edge cases. **Finding counts:** 4 High, 7 Medium, 10 Low --- ## HIGH Severity ### H1 — ChangeSet Integration Broken for `builtin/file-move` [Bug / Spec Compliance] **Location:** `src/cleveragents/tool/builtins/changeset.py` (`wrap_tool`, line ~211) interacting with `file_tools.py:246-258` `ChangeSetCapture.wrap_tool()` extracts the file path via `inputs.get("path", "")`. However, `builtin/file-move` uses `source_path` and `destination_path` — not `path`. When file-move is wrapped via `register_file_tools_with_changeset()`: - `path_str` evaluates to `""` (empty string) — no before/after hashes are computed. - `_detect_operation("builtin/file-move", None, None, output)` falls through all checks and returns `"modify"` (wrong — should be `"move"`). - `_normalize_path("", sandbox)` returns `""`. - The resulting `ChangeSetEntry` has `operation="modify"`, `path=""`, and `None` hashes — **completely wrong and useless**. The specification states: *"Each built-in tool automatically: Records changes to the ChangeSet"* (spec line ~19443). Additionally, `_detect_operation` has no branch for `"move"` even though `ChangeSetEntry.operation` documents `"move"` as a valid value and `_map_operation` knows how to map it to `ChangeOperation.RENAME`. **Suggested fix:** `wrap_tool` needs a special case for tools with `source_path`/`destination_path` inputs, and `_detect_operation` needs a `"move" in tool_name` branch returning `"move"`. --- ### H2 — `_handle_dir_create` Always Returns `created: True` [Bug] **Location:** `file_tools.py:261-271` The handler unconditionally returns `{"created": True}` regardless of whether the directory already existed. Since `exist_ok=True` is always passed to `mkdir()`, the call succeeds silently for pre-existing directories. This is semantically incorrect — callers relying on this field to determine if a directory was actually created will get misleading information. **Suggested fix:** ```python existed = path.exists() path.mkdir(parents=parents, exist_ok=True) return {"path": str(path), "created": not existed} ``` --- ### H3 — `_handle_file_move` Silently Overwrites Existing Destination [Bug] **Location:** `file_tools.py:246-258` `shutil.move()` overwrites the destination file without warning if it already exists. During plan execution, an agent moving a file to a path that already has content would silently destroy the original — a data loss risk with no safety net. No `overwrite` parameter exists to control this behavior. **Suggested fix:** Either add an `overwrite: bool = False` parameter that rejects the move when destination exists and `overwrite` is False, or at minimum check `destination.exists()` before moving and raise a clear error. --- ### H4 — Missing `checkpointable` and `checkpoint_scope` on Write Tools [Spec Compliance] **Location:** `file_tools.py:506-537` (`FILE_MOVE_SPEC`), `file_tools.py:539-570` (`DIR_CREATE_SPEC`) The specification's skill tool listing (spec line ~7250) shows all write-capable file tools have `checkpoint_scope: file`. The new tools only set `writes=True` but don't set `checkpointable=True` or `checkpoint_scope="file"`. Without this, the sandbox/checkpoint system cannot properly track and rollback operations from these tools. --- ## MEDIUM Severity ### M1 — Missing `encoding` Field in `file-info` Output [Spec Compliance] **Location:** `file_tools.py:274-288` Issue #883 explicitly specifies that `file-info` *"Returns: size, permissions, timestamps (created, modified, accessed), file type, encoding."* The implementation does not return an `encoding` field. The output schema also omits it. --- ### M2 — `validate_path` String Prefix Check is Bypassable [Security — Pre-existing, Extended by New Tools] **Location:** `file_tools.py:91` The check `str(target).startswith(str(root))` is vulnerable to prefix collisions. If `sandbox_root` is `/tmp/sandbox`, then a path resolving to `/tmp/sandbox_evil/file` passes the check because the string `/tmp/sandbox_evil/file` starts with `/tmp/sandbox`. While this is pre-existing code (not introduced by this PR), the new tools **significantly extend the attack surface**: `file-move` can now move files to/from escaped paths and `dir-create` can create directories outside the sandbox — both write operations. **Suggested fix:** Use `target.is_relative_to(root)` (Python 3.9+) or compare with trailing separator: `str(target).startswith(str(root) + os.sep) or target == root`. --- ### M3 — No Path Traversal Tests for `dir-create` and `file-info` [Test Coverage] `dir-create` is a write operation and `file-info` can leak information about files outside the sandbox. Both need path traversal rejection tests. Currently only `file-move` destination traversal is tested. --- ### M4 — No Source Path Traversal Test for `file-move` [Test Coverage] Only the destination path traversal is tested (`../../escaped.txt`). A malicious source path (e.g., `../../etc/passwd`) is not tested. This is particularly important given M2. --- ### M5 — No Test for Overwriting Existing Destination in `file-move` [Test Coverage] The silent overwrite behavior described in H3 has no test coverage. A scenario creating a file at the destination path before the move would expose this behavior. --- ### M6 — No Test for `dir-create` on Already Existing Directory [Test Coverage] The `created: True` bug (H2) has no test to catch it. A scenario creating a directory that already exists and asserting `created` is `false` would be a regression guard. --- ### M7 — `sandbox_root` Inconsistently Included in `input_schema` [Code Quality] **Location:** `file_tools.py:506-604` (new specs) vs. `file_tools.py:296-503` (existing specs) The 6 existing tools omit `sandbox_root` from their `input_schema` properties (even though their handlers accept it). The 3 new tools explicitly declare it. This inconsistency means schema-validating callers would treat `sandbox_root` differently for old vs. new tools. The module docstring (line 23) says *"All file tools accept an optional `sandbox_root` input parameter"* — either all schemas should declare it or none should. --- ## LOW Severity ### L1 — `st_ctime` is Not Creation Time on Linux [Bug] **Location:** `file_tools.py:285` On Linux, `st_ctime` is the inode change time (last metadata change), not the file creation time. The field name `created` is misleading. Consider documenting this limitation or using `st_birthtime` where available (Python 3.12+ on some platforms). --- ### L2 — `_handle_file_move` Does Not Validate Source is a File [Bug] **Location:** `file_tools.py:246-252` The description says *"Move/rename a file"* but `shutil.move` works on directories too. Nothing prevents moving a directory, making the behavior inconsistent with the stated description. --- ### L3 — ISO Timestamp Validation in Tests is Too Weak [Test Flaw] **Location:** `features/steps/builtin_file_tools_new_steps.py:77` The assertion `"T" in value` would pass for any string containing the letter "T" (e.g., `"True"`, `"Test"`). Should use `datetime.fromisoformat(value)` for proper validation. --- ### L4 — Robot Tests Missing `on_timeout=kill` [Test Flaw] **Location:** `robot/builtin_file_tools_new.robot` (all `Run Process` calls) The established project pattern (visible in recent PRs #812, #1050, #1052) is to use `on_timeout=kill` on `Run Process` calls. Without it, a hung test could block CI indefinitely. --- ### L5 — Robot Tests Do Not Verify Clean stderr [Test Flaw] Tests only check stdout content but don't assert that stderr is clean. Unexpected warnings or errors on stderr would go undetected. --- ### L6 — No Test for `file-info` on Symlinks [Test Coverage] Symlink behavior (following symlinks to report target metadata vs. reporting the symlink itself) is untested. --- ### L7 — No Test for Moving a Directory [Test Coverage] If `file-move` should reject directories (per L2), a test should verify this. If it should accept them, the tool description should be updated. --- ### L8 — Test Steps Do Not Support ChangeSet Capture [Test Coverage] **Location:** `features/steps/builtin_file_tools_new_steps.py:22-28` The `_run_tool()` helper always creates a fresh `ToolRegistry` without changeset capture, unlike the existing `tool_builtins_steps.py` which checks for `context.changeset_capture`. The changeset wrapping path is never tested for the new tools, which means H1 was not caught by tests. --- ### L9 — Unused `sandbox` Parameter in Robot Helper [Code Quality] **Location:** `robot/helper_builtin_file_tools_new.py:19` `_make_runner(sandbox)` accepts a `sandbox` parameter but never uses it. --- ### L10 — No CHANGELOG Entry [Process] All other recent merged PRs in this repository include a `CHANGELOG.md` entry under `## Unreleased`. This PR does not. --- ## Findings Summary Table | ID | Severity | Category | Description | |----|----------|----------|-------------| | H1 | **High** | Bug / Spec | ChangeSet integration broken for `file-move` — empty path, wrong operation | | H2 | **High** | Bug | `dir-create` always returns `created: True` even when directory pre-existed | | H3 | **High** | Bug | `file-move` silently overwrites existing destination (data loss risk) | | H4 | **High** | Spec | Missing `checkpointable=True` and `checkpoint_scope` on write tools | | M1 | Medium | Spec | Missing `encoding` field in `file-info` output | | M2 | Medium | Security | `validate_path` prefix check bypassable (pre-existing, extended by new tools) | | M3 | Medium | Test Coverage | No path traversal tests for `dir-create` and `file-info` | | M4 | Medium | Test Coverage | No source path traversal test for `file-move` | | M5 | Medium | Test Coverage | No test for overwriting existing destination | | M6 | Medium | Test Coverage | No test for `dir-create` on existing directory | | M7 | Medium | Code Quality | `sandbox_root` inconsistently in `input_schema` (new tools have it, old tools don't) | | L1 | Low | Bug | `st_ctime` not creation time on Linux | | L2 | Low | Bug | `file-move` doesn't validate source is a file | | L3 | Low | Test Flaw | ISO timestamp validation too weak (`"T" in value`) | | L4 | Low | Test Flaw | Robot tests missing `on_timeout=kill` | | L5 | Low | Test Flaw | Robot tests don't verify clean stderr | | L6 | Low | Test Coverage | No test for `file-info` on symlinks | | L7 | Low | Test Coverage | No test for moving a directory | | L8 | Low | Test Coverage | Test steps don't support changeset capture | | L9 | Low | Code Quality | Unused `sandbox` parameter in `_make_runner` | | L10 | Low | Process | No CHANGELOG entry | --- **Recommendation:** H1 through H4 should be addressed before merge. M1-M7 are strongly recommended. Low-severity items can be deferred but are worth addressing in this PR where feasible.
CoreRasurae force-pushed feature/m4-missing-builtin-tools from e65c3866f8
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 21s
CI / lint (pull_request) Successful in 26s
CI / quality (pull_request) Successful in 30s
CI / typecheck (pull_request) Successful in 51s
CI / security (pull_request) Successful in 54s
CI / unit_tests (pull_request) Successful in 3m28s
CI / integration_tests (pull_request) Successful in 3m38s
CI / docker (pull_request) Successful in 56s
CI / e2e_tests (pull_request) Successful in 5m14s
CI / coverage (pull_request) Successful in 7m13s
CI / benchmark-regression (pull_request) Successful in 38m1s
to ff8dcbddfb
Some checks failed
CI / typecheck (pull_request) Successful in 1m5s
CI / quality (pull_request) Successful in 49s
CI / benchmark-publish (pull_request) Has been skipped
CI / security (pull_request) Successful in 54s
CI / build (pull_request) Successful in 18s
CI / unit_tests (pull_request) Successful in 3m5s
CI / integration_tests (pull_request) Successful in 4m21s
CI / e2e_tests (pull_request) Successful in 5m1s
CI / lint (pull_request) Successful in 14s
CI / coverage (pull_request) Successful in 7m0s
CI / docker (pull_request) Successful in 1m11s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-19 22:36:12 +00:00
Compare
CoreRasurae left a comment

Code Review Report — PR #1070 (Issue #883)

feat(tool): add move_file, create_directory, and get_file_info built-in tools

Review performed over 3 global analysis cycles covering: Bugs, Security, Test Coverage, Test Flaws, Performance, and Specification Compliance. All findings experimentally verified where applicable.

Scope: Code changes on feature/m4-missing-builtin-tools plus close connections to surrounding code (changeset.py, file_tools.py, ToolRunner, ChangeEntry validators, validate_path).


Summary

Severity Count
High 2
Medium 9
Low 9
Total 20

1. BUGS

P1-1 — [HIGH] _file_hash crashes with IsADirectoryError when dir-create is used with changeset wrapping

File: changeset.py:118-124

_file_hash() calls p.read_bytes() without checking if the path is a directory. When builtin/dir-create is wrapped by ChangeSetCapture (the production code path during plan execution), the wrapped handler creates the directory successfully, then attempts _file_hash() on the directory path for the after hash — which raises IsADirectoryError.

Impact: The ToolRunner catches the exception and returns success=False, but the directory was already created on disk. This causes a silent partial success: the directory exists (side effect persisted), the tool reports failure, and no changeset entry is recorded. The plan's changeset becomes incomplete.

Reproduction path: ChangeSetCapture.wrap_tool(DIR_CREATE_SPEC) → execute → _wrapped_handler_file_hash(dir_path, sandbox)Path.read_bytes()IsADirectoryError.

Verified experimentally: Path(tempdir).read_bytes() raises IsADirectoryError: [Errno 21] Is a directory.

Suggested fix: Guard _file_hash against directories:

def _file_hash(path_str, sandbox_root=None):
    root = Path(sandbox_root) if sandbox_root else Path.cwd()
    p = (root / path_str).resolve()
    if not p.exists() or p.is_dir():
        return None
    return hashlib.sha256(p.read_bytes()).hexdigest()

P1-2 — [HIGH] No test for dir-create with changeset capture wrapping

File: features/builtin_file_tools_new.feature, features/tool_builtins.feature

All BDD tests for dir-create use register_file_tools() (no changeset wrapping). The changeset registration test (Scenario: Register file tools with changeset) only checks tool count, not execution. The production code path (register_file_tools_with_changeset) is never exercised for any of the 3 new tools, leaving P1-1 undetected.

Suggested fix: Add a BDD scenario that executes dir-create through a changeset-wrapped registry and verifies the changeset entry has operation="create", correct path, and None hashes.


P2-1 — [MEDIUM] file-move returns incorrect destination path when destination is an existing directory

File: file_tools.py:246-261

shutil.move(src, dst) when dst is an existing directory moves the source file inside that directory (e.g., dst/source.txt). The handler returns {"destination": str(destination)} which is the directory path, not the actual final file location.

Verified experimentally: shutil.move("/sandbox/a.txt", "/sandbox/existing_dir/") → file ends up at /sandbox/existing_dir/a.txt, but handler returns destination: "/sandbox/existing_dir".

Suggested fix: Check if destination is an existing directory and either reject it (consistent with the "not a file" check on source) or capture shutil.move()'s return value to report the correct final path.


P2-2 — [MEDIUM] Move detection in changeset _wrapped_handler relies on input field names instead of tool name

File: changeset.py:211-214

source_path = inputs.get("source_path", "")
dest_path = inputs.get("destination_path", "")
is_move = bool(source_path and dest_path)

This detects move operations by checking for source_path/destination_path in the input dict. If any future non-move tool has parameters with these names, it would be incorrectly treated as a move — computing hashes from wrong paths. The _detect_operation function already uses "move" in tool_name, so the detection approach is inconsistent within the same file.

Suggested fix: Use is_move = "move" in tool_spec.name for consistency with _detect_operation.


P2-3 — [MEDIUM] file-info reports st_ctime as "created" timestamp — incorrect on Linux

File: file_tools.py:289

"created": datetime.fromtimestamp(stat.st_ctime, tz=UTC).isoformat(),

On Linux, st_ctime is the inode change time (metadata change time), not file creation time. File creation time (st_birthtime) is only available on macOS/BSDs and Python 3.12+ on some Linux kernels. The "created" field reports misleading data on Linux.

Verified experimentally: hasattr(os.stat_result, 'st_birthtime') returns False on this Linux platform.

Suggested fix: Either rename the field to "ctime" with a description clarifying its meaning, or use conditional logic: st_birthtime when available, fall back to st_ctime with a note in the output schema description.


P2-4 — [MEDIUM] _detect_operation returns "modify" for dir-create on already-existing directory

File: changeset.py:306-324

When dir-create is called on an already-existing directory (returns created: False), _detect_operation falls through all checks to return "modify". This is semantically incorrect — re-creating an existing directory is a no-op, not a modification. Combined with P1-1, if the hash bug is fixed (both hashes None), the entry would record operation="modify" with before_hash=None, after_hash=None.

Suggested fix: Add a case for when both hashes are None and created is False to return a no-op marker, or skip entry creation entirely.


P2-5 — [MEDIUM] permissions field in file-info includes file type bits

File: file_tools.py:288

"permissions": oct(stat.st_mode),

This returns 0o100644 (for a regular file with 644 permissions) instead of just 0o644. The 0o100 prefix encodes the file type (regular file), which is not what "permissions" conventionally means.

Verified experimentally: oct(os.stat('/etc/hostname').st_mode)'0o100644'.

Suggested fix: Use oct(stat.st_mode & 0o777) or oct(stat.S_IMODE(stat.st_mode)) to extract only the permission bits.


P2-6 — [MEDIUM] _detect_encoding reports "utf-8" for all non-binary text files regardless of actual encoding

File: file_tools.py:296-307

The function only checks for null bytes to detect binary files. Any text file without null bytes is reported as "utf-8", even if the actual encoding is Latin-1, Shift-JIS, etc.

Verified experimentally: A file containing café résumé in Latin-1 (b'caf\xe9 r\xe9sum\xe9') — which is NOT valid UTF-8 — is reported as "utf-8".

Suggested fix: Attempt to decode the chunk as UTF-8 and return "utf-8" only on success; return None or "unknown" on UnicodeDecodeError.


P3-1 — [LOW] dir-create raises unhandled FileExistsError when path is an existing regular file

File: file_tools.py:264-275

path.mkdir(parents=parents, exist_ok=True) raises FileExistsError when the target is an existing regular file (not a directory). The exist_ok=True flag only suppresses the error for existing directories. The error is caught by ToolRunner and normalized, but the message is a raw Python exception.

Verified experimentally: Path('/tmp/file.txt').mkdir(parents=True, exist_ok=True)FileExistsError.


P3-2 — [LOW] file-move error message for non-existent source is misleading

File: file_tools.py:251

if not source.is_file():
    raise ValueError(f"Source is not a file: {source}")

When the source doesn't exist at all, is_file() returns False and the error says "not a file" — but the real issue is the file doesn't exist. A more accurate message would distinguish "does not exist" from "is a directory/special file".


2. SECURITY

P2-7 — [MEDIUM] validate_path uses str.startswith() for sandbox boundary check — vulnerable to directory name prefix collision (pre-existing, amplified by 3 new tools)

File: file_tools.py:91

if not str(target).startswith(str(root)):

If root is /tmp/sandbox123, a path resolving to /tmp/sandbox1234evil/file passes the check.

Verified experimentally: Created /tmp/tmpXXevil/ alongside /tmp/tmpXX/. str(target).startswith(str(root))True (incorrect). target.is_relative_to(root)False (correct).

Note: This is a pre-existing vulnerability affecting all 9 file tools, not introduced by this PR. However, the 3 new tools increase the attack surface. Path.is_relative_to() (Python 3.9+) is the correct check.


3. TEST COVERAGE GAPS

P2-8 — [MEDIUM] No test for file-move with changeset capture wrapping

File: features/builtin_file_tools_new.feature

The changeset wrapper was modified to handle move operations (source/destination path detection, metadata recording, _detect_operation update), but no test verifies that the changeset entry has correct operation="move", correct path (destination), correct metadata["source_path"], or correct before_hash/after_hash values.


P2-9 — [MEDIUM] No test for file-move when source doesn't exist

File: features/builtin_file_tools_new.feature

The feature tests source path traversal and source-is-directory, but not source-file-not-found. This is a common failure path that agents will encounter.


P2-10 — [MEDIUM] No test for file-move when destination is an existing directory

File: features/builtin_file_tools_new.feature

Would expose P2-1 (incorrect destination path in return value).


P2-11 — [MEDIUM] No test for dir-create when path is an existing regular file

File: features/builtin_file_tools_new.feature

Would expose P3-1 (unhandled FileExistsError from mkdir). Verifying the error message content would ensure agents receive actionable feedback.


P3-3 — [LOW] No test for binary file encoding detection in file-info

No scenario verifies that _detect_encoding returns None for binary files containing null bytes.


P3-4 — [LOW] No test for file-move creating destination parent directories

The handler calls destination.parent.mkdir(parents=True, exist_ok=True) but no test exercises the case where the destination path has non-existent parent directories.


P3-5 — [LOW] Robot tests cover only 4 of 14 BDD scenarios

The Robot suite covers move, dir_create, file_info, and move_traversal. Missing: dir-create path traversal, file-info path traversal, existing directory re-creation, overwrite behavior, source-is-directory rejection, etc.


4. TEST FLAWS

P3-6 — [LOW] ISO timestamp validation assertion is too permissive

File: builtin_file_tools_new_steps.py:79

assert parsed.tzinfo is not None or "T" in value

The or "T" in value condition makes this always pass for any valid ISO timestamp containing T. Since the production code uses tz=UTC, the check should verify timezone awareness specifically:

assert parsed.tzinfo is not None, f"Expected timezone-aware timestamp for {key}"

P3-7 — [LOW] Overwrite test uses substring matching instead of exact content match

File: builtin_file_tools_new_steps.py:89

assert expected in content

The "Move a file overwrites existing destination" scenario uses assert "new content" in content which would pass even if old content was also present. An exact match (content == expected) would be more precise.


P3-8 — [LOW] dir-create without-parents failure test doesn't verify error message

File: builtin_file_tools_new.feature:60-62

The scenario only asserts tool result should not be successful without verifying the error content. Adding the tool result error should mention "FileNotFoundError" or similar would make the test more specific.


5. SPECIFICATION COMPLIANCE

P2-12 — [MEDIUM] checkpointable capability not set on write tools

File: file_tools.py:550, 579

The specification (skill tools table, JSON format) shows checkpoint: "file" for move_file and create_directory. The implementation sets ToolCapability(writes=True) without checkpointable=True. This follows the pre-existing pattern of the original 6 tools, but the issue #883 acceptance criteria mention "checkpointable" in the capability metadata expectation.


P3-9 — [LOW] Input parameter names differ from specification

Spec (line 21969): move_file(source: str, destination: str). Implementation: source_path, destination_path. This is a naming inconsistency with the specification's simplified signatures, though it's arguably more descriptive.


P3-10 — [LOW] file-info "created" field semantics don't match specification

Covered by P2-3. The specification says "timestamps (created, modified, accessed)" but "created" is actually st_ctime (inode change time on Linux), not file creation time.


6. PERFORMANCE

P3-11 — [LOW] Benchmark time_file_move includes file creation I/O in timing

File: bench_builtin_file_tools_new.py:44

(Path(self.sandbox) / src).write_text("benchmark data") is inside time_file_move(), adding file creation overhead to the benchmark measurement. Ideally, file creation should be in setup or a separate pre-benchmark step.


Prioritized Fix Recommendations

Priority IDs Action
Must fix P1-1, P1-2 Fix _file_hash to skip directories; add changeset integration test for dir-create
Should fix P2-1, P2-2, P2-3, P2-5 Fix shutil.move directory destination handling; use tool name for move detection; fix st_ctime/permissions output
Should add P2-8 through P2-11 Add missing test scenarios for changeset+move, non-existent source, directory-as-destination, file-as-dir-create target
Nice to fix P2-6, P2-7, P2-12, P3-* Improve encoding detection, note validate_path prefix issue, set checkpointable, improve test assertions

Review performed over 3 global analysis cycles. All bugs marked as "verified experimentally" were confirmed via isolated Python scripts on the target platform (Linux).

# Code Review Report — PR #1070 (Issue #883) **feat(tool): add move_file, create_directory, and get_file_info built-in tools** Review performed over 3 global analysis cycles covering: **Bugs, Security, Test Coverage, Test Flaws, Performance, and Specification Compliance**. All findings experimentally verified where applicable. **Scope**: Code changes on `feature/m4-missing-builtin-tools` plus close connections to surrounding code (changeset.py, file_tools.py, ToolRunner, ChangeEntry validators, validate_path). --- ## Summary | Severity | Count | |----------|-------| | High | 2 | | Medium | 9 | | Low | 9 | | **Total** | **20** | --- ## 1. BUGS ### P1-1 — [HIGH] `_file_hash` crashes with `IsADirectoryError` when `dir-create` is used with changeset wrapping **File**: `changeset.py:118-124` `_file_hash()` calls `p.read_bytes()` without checking if the path is a directory. When `builtin/dir-create` is wrapped by `ChangeSetCapture` (the production code path during plan execution), the wrapped handler creates the directory successfully, then attempts `_file_hash()` on the directory path for the `after` hash — which raises `IsADirectoryError`. **Impact**: The `ToolRunner` catches the exception and returns `success=False`, but the directory **was already created on disk**. This causes a **silent partial success**: the directory exists (side effect persisted), the tool reports failure, and no changeset entry is recorded. The plan's changeset becomes incomplete. **Reproduction path**: `ChangeSetCapture.wrap_tool(DIR_CREATE_SPEC)` → execute → `_wrapped_handler` → `_file_hash(dir_path, sandbox)` → `Path.read_bytes()` → `IsADirectoryError`. **Verified experimentally**: `Path(tempdir).read_bytes()` raises `IsADirectoryError: [Errno 21] Is a directory`. **Suggested fix**: Guard `_file_hash` against directories: ```python def _file_hash(path_str, sandbox_root=None): root = Path(sandbox_root) if sandbox_root else Path.cwd() p = (root / path_str).resolve() if not p.exists() or p.is_dir(): return None return hashlib.sha256(p.read_bytes()).hexdigest() ``` --- ### P1-2 — [HIGH] No test for `dir-create` with changeset capture wrapping **File**: `features/builtin_file_tools_new.feature`, `features/tool_builtins.feature` All BDD tests for `dir-create` use `register_file_tools()` (no changeset wrapping). The changeset registration test (`Scenario: Register file tools with changeset`) only checks tool count, not execution. The production code path (`register_file_tools_with_changeset`) is **never exercised** for any of the 3 new tools, leaving P1-1 undetected. **Suggested fix**: Add a BDD scenario that executes `dir-create` through a changeset-wrapped registry and verifies the changeset entry has `operation="create"`, correct path, and `None` hashes. --- ### P2-1 — [MEDIUM] `file-move` returns incorrect destination path when destination is an existing directory **File**: `file_tools.py:246-261` `shutil.move(src, dst)` when `dst` is an existing directory moves the source file **inside** that directory (e.g., `dst/source.txt`). The handler returns `{"destination": str(destination)}` which is the directory path, not the actual final file location. **Verified experimentally**: `shutil.move("/sandbox/a.txt", "/sandbox/existing_dir/")` → file ends up at `/sandbox/existing_dir/a.txt`, but handler returns `destination: "/sandbox/existing_dir"`. **Suggested fix**: Check if destination is an existing directory and either reject it (consistent with the "not a file" check on source) or capture `shutil.move()`'s return value to report the correct final path. --- ### P2-2 — [MEDIUM] Move detection in changeset `_wrapped_handler` relies on input field names instead of tool name **File**: `changeset.py:211-214` ```python source_path = inputs.get("source_path", "") dest_path = inputs.get("destination_path", "") is_move = bool(source_path and dest_path) ``` This detects move operations by checking for `source_path`/`destination_path` in the input dict. If any future non-move tool has parameters with these names, it would be incorrectly treated as a move — computing hashes from wrong paths. The `_detect_operation` function already uses `"move" in tool_name`, so the detection approach is inconsistent within the same file. **Suggested fix**: Use `is_move = "move" in tool_spec.name` for consistency with `_detect_operation`. --- ### P2-3 — [MEDIUM] `file-info` reports `st_ctime` as "created" timestamp — incorrect on Linux **File**: `file_tools.py:289` ```python "created": datetime.fromtimestamp(stat.st_ctime, tz=UTC).isoformat(), ``` On Linux, `st_ctime` is the **inode change time** (metadata change time), **not** file creation time. File creation time (`st_birthtime`) is only available on macOS/BSDs and Python 3.12+ on some Linux kernels. The "created" field reports misleading data on Linux. **Verified experimentally**: `hasattr(os.stat_result, 'st_birthtime')` returns `False` on this Linux platform. **Suggested fix**: Either rename the field to `"ctime"` with a description clarifying its meaning, or use conditional logic: `st_birthtime` when available, fall back to `st_ctime` with a note in the output schema description. --- ### P2-4 — [MEDIUM] `_detect_operation` returns `"modify"` for `dir-create` on already-existing directory **File**: `changeset.py:306-324` When `dir-create` is called on an already-existing directory (returns `created: False`), `_detect_operation` falls through all checks to `return "modify"`. This is semantically incorrect — re-creating an existing directory is a no-op, not a modification. Combined with P1-1, if the hash bug is fixed (both hashes `None`), the entry would record `operation="modify"` with `before_hash=None, after_hash=None`. **Suggested fix**: Add a case for when both hashes are `None` and `created` is `False` to return a no-op marker, or skip entry creation entirely. --- ### P2-5 — [MEDIUM] `permissions` field in `file-info` includes file type bits **File**: `file_tools.py:288` ```python "permissions": oct(stat.st_mode), ``` This returns `0o100644` (for a regular file with 644 permissions) instead of just `0o644`. The `0o100` prefix encodes the file type (regular file), which is not what "permissions" conventionally means. **Verified experimentally**: `oct(os.stat('/etc/hostname').st_mode)` → `'0o100644'`. **Suggested fix**: Use `oct(stat.st_mode & 0o777)` or `oct(stat.S_IMODE(stat.st_mode))` to extract only the permission bits. --- ### P2-6 — [MEDIUM] `_detect_encoding` reports `"utf-8"` for all non-binary text files regardless of actual encoding **File**: `file_tools.py:296-307` The function only checks for null bytes to detect binary files. Any text file without null bytes is reported as `"utf-8"`, even if the actual encoding is Latin-1, Shift-JIS, etc. **Verified experimentally**: A file containing `café résumé` in Latin-1 (`b'caf\xe9 r\xe9sum\xe9'`) — which is NOT valid UTF-8 — is reported as `"utf-8"`. **Suggested fix**: Attempt to decode the chunk as UTF-8 and return `"utf-8"` only on success; return `None` or `"unknown"` on `UnicodeDecodeError`. --- ### P3-1 — [LOW] `dir-create` raises unhandled `FileExistsError` when path is an existing regular file **File**: `file_tools.py:264-275` `path.mkdir(parents=parents, exist_ok=True)` raises `FileExistsError` when the target is an existing regular file (not a directory). The `exist_ok=True` flag only suppresses the error for existing directories. The error is caught by `ToolRunner` and normalized, but the message is a raw Python exception. **Verified experimentally**: `Path('/tmp/file.txt').mkdir(parents=True, exist_ok=True)` → `FileExistsError`. --- ### P3-2 — [LOW] `file-move` error message for non-existent source is misleading **File**: `file_tools.py:251` ```python if not source.is_file(): raise ValueError(f"Source is not a file: {source}") ``` When the source doesn't exist at all, `is_file()` returns `False` and the error says "not a file" — but the real issue is the file doesn't exist. A more accurate message would distinguish "does not exist" from "is a directory/special file". --- ## 2. SECURITY ### P2-7 — [MEDIUM] `validate_path` uses `str.startswith()` for sandbox boundary check — vulnerable to directory name prefix collision (pre-existing, amplified by 3 new tools) **File**: `file_tools.py:91` ```python if not str(target).startswith(str(root)): ``` If `root` is `/tmp/sandbox123`, a path resolving to `/tmp/sandbox1234evil/file` passes the check. **Verified experimentally**: Created `/tmp/tmpXXevil/` alongside `/tmp/tmpXX/`. `str(target).startswith(str(root))` → `True` (incorrect). `target.is_relative_to(root)` → `False` (correct). **Note**: This is a pre-existing vulnerability affecting all 9 file tools, not introduced by this PR. However, the 3 new tools increase the attack surface. `Path.is_relative_to()` (Python 3.9+) is the correct check. --- ## 3. TEST COVERAGE GAPS ### P2-8 — [MEDIUM] No test for `file-move` with changeset capture wrapping **File**: `features/builtin_file_tools_new.feature` The changeset wrapper was modified to handle move operations (source/destination path detection, metadata recording, `_detect_operation` update), but no test verifies that the changeset entry has correct `operation="move"`, correct `path` (destination), correct `metadata["source_path"]`, or correct `before_hash`/`after_hash` values. --- ### P2-9 — [MEDIUM] No test for `file-move` when source doesn't exist **File**: `features/builtin_file_tools_new.feature` The feature tests source path traversal and source-is-directory, but not source-file-not-found. This is a common failure path that agents will encounter. --- ### P2-10 — [MEDIUM] No test for `file-move` when destination is an existing directory **File**: `features/builtin_file_tools_new.feature` Would expose P2-1 (incorrect destination path in return value). --- ### P2-11 — [MEDIUM] No test for `dir-create` when path is an existing regular file **File**: `features/builtin_file_tools_new.feature` Would expose P3-1 (unhandled `FileExistsError` from `mkdir`). Verifying the error message content would ensure agents receive actionable feedback. --- ### P3-3 — [LOW] No test for binary file encoding detection in `file-info` No scenario verifies that `_detect_encoding` returns `None` for binary files containing null bytes. --- ### P3-4 — [LOW] No test for `file-move` creating destination parent directories The handler calls `destination.parent.mkdir(parents=True, exist_ok=True)` but no test exercises the case where the destination path has non-existent parent directories. --- ### P3-5 — [LOW] Robot tests cover only 4 of 14 BDD scenarios The Robot suite covers move, dir_create, file_info, and move_traversal. Missing: dir-create path traversal, file-info path traversal, existing directory re-creation, overwrite behavior, source-is-directory rejection, etc. --- ## 4. TEST FLAWS ### P3-6 — [LOW] ISO timestamp validation assertion is too permissive **File**: `builtin_file_tools_new_steps.py:79` ```python assert parsed.tzinfo is not None or "T" in value ``` The `or "T" in value` condition makes this always pass for any valid ISO timestamp containing `T`. Since the production code uses `tz=UTC`, the check should verify timezone awareness specifically: ```python assert parsed.tzinfo is not None, f"Expected timezone-aware timestamp for {key}" ``` --- ### P3-7 — [LOW] Overwrite test uses substring matching instead of exact content match **File**: `builtin_file_tools_new_steps.py:89` ```python assert expected in content ``` The "Move a file overwrites existing destination" scenario uses `assert "new content" in content` which would pass even if old content was also present. An exact match (`content == expected`) would be more precise. --- ### P3-8 — [LOW] `dir-create` without-parents failure test doesn't verify error message **File**: `builtin_file_tools_new.feature:60-62` The scenario only asserts `tool result should not be successful` without verifying the error content. Adding `the tool result error should mention "FileNotFoundError"` or similar would make the test more specific. --- ## 5. SPECIFICATION COMPLIANCE ### P2-12 — [MEDIUM] `checkpointable` capability not set on write tools **File**: `file_tools.py:550, 579` The specification (skill tools table, JSON format) shows `checkpoint: "file"` for `move_file` and `create_directory`. The implementation sets `ToolCapability(writes=True)` without `checkpointable=True`. This follows the pre-existing pattern of the original 6 tools, but the issue #883 acceptance criteria mention "`checkpointable`" in the capability metadata expectation. --- ### P3-9 — [LOW] Input parameter names differ from specification Spec (line 21969): `move_file(source: str, destination: str)`. Implementation: `source_path`, `destination_path`. This is a naming inconsistency with the specification's simplified signatures, though it's arguably more descriptive. --- ### P3-10 — [LOW] `file-info` "created" field semantics don't match specification Covered by P2-3. The specification says "timestamps (created, modified, accessed)" but "created" is actually `st_ctime` (inode change time on Linux), not file creation time. --- ## 6. PERFORMANCE ### P3-11 — [LOW] Benchmark `time_file_move` includes file creation I/O in timing **File**: `bench_builtin_file_tools_new.py:44` `(Path(self.sandbox) / src).write_text("benchmark data")` is inside `time_file_move()`, adding file creation overhead to the benchmark measurement. Ideally, file creation should be in setup or a separate pre-benchmark step. --- ## Prioritized Fix Recommendations | Priority | IDs | Action | |----------|-----|--------| | **Must fix** | P1-1, P1-2 | Fix `_file_hash` to skip directories; add changeset integration test for `dir-create` | | **Should fix** | P2-1, P2-2, P2-3, P2-5 | Fix `shutil.move` directory destination handling; use tool name for move detection; fix `st_ctime`/permissions output | | **Should add** | P2-8 through P2-11 | Add missing test scenarios for changeset+move, non-existent source, directory-as-destination, file-as-dir-create target | | **Nice to fix** | P2-6, P2-7, P2-12, P3-* | Improve encoding detection, note `validate_path` prefix issue, set `checkpointable`, improve test assertions | --- *Review performed over 3 global analysis cycles. All bugs marked as "verified experimentally" were confirmed via isolated Python scripts on the target platform (Linux).*
CoreRasurae force-pushed feature/m4-missing-builtin-tools from ff8dcbddfb
Some checks failed
CI / typecheck (pull_request) Successful in 1m5s
CI / quality (pull_request) Successful in 49s
CI / benchmark-publish (pull_request) Has been skipped
CI / security (pull_request) Successful in 54s
CI / build (pull_request) Successful in 18s
CI / unit_tests (pull_request) Successful in 3m5s
CI / integration_tests (pull_request) Successful in 4m21s
CI / e2e_tests (pull_request) Successful in 5m1s
CI / lint (pull_request) Successful in 14s
CI / coverage (pull_request) Successful in 7m0s
CI / docker (pull_request) Successful in 1m11s
CI / benchmark-regression (pull_request) Has been cancelled
to dc4294cd0c
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 43s
CI / security (pull_request) Successful in 38s
CI / quality (pull_request) Successful in 32s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 36s
CI / unit_tests (pull_request) Successful in 3m26s
CI / integration_tests (pull_request) Successful in 3m12s
CI / docker (pull_request) Successful in 1m7s
CI / coverage (pull_request) Failing after 17m26s
CI / e2e_tests (pull_request) Failing after 17m28s
CI / benchmark-regression (pull_request) Successful in 38m17s
2026-03-20 17:12:49 +00:00
Compare
CoreRasurae force-pushed feature/m4-missing-builtin-tools from dc4294cd0c
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 43s
CI / security (pull_request) Successful in 38s
CI / quality (pull_request) Successful in 32s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 36s
CI / unit_tests (pull_request) Successful in 3m26s
CI / integration_tests (pull_request) Successful in 3m12s
CI / docker (pull_request) Successful in 1m7s
CI / coverage (pull_request) Failing after 17m26s
CI / e2e_tests (pull_request) Failing after 17m28s
CI / benchmark-regression (pull_request) Successful in 38m17s
to 8fec3877ab
Some checks failed
CI / lint (pull_request) Successful in 25s
CI / typecheck (pull_request) Successful in 41s
CI / quality (pull_request) Successful in 42s
CI / benchmark-publish (pull_request) Has been skipped
CI / security (pull_request) Successful in 48s
CI / build (pull_request) Successful in 18s
CI / benchmark-regression (pull_request) Failing after 12m39s
CI / coverage (pull_request) Failing after 12m39s
CI / e2e_tests (pull_request) Failing after 12m57s
CI / integration_tests (pull_request) Failing after 13m22s
CI / unit_tests (pull_request) Failing after 13m22s
CI / docker (pull_request) Has been cancelled
2026-03-20 18:13:57 +00:00
Compare
Owner

Planning review (Day 42):

  • Merge conflict: This PR is currently not mergeable. Please rebase or resolve conflicts against the target branch.
  • Reviewers assigned: @freemo + @brent.edwards.

Metadata otherwise looks good: milestone set (v3.4.0), Type/Feature label present, closes #883.

**Planning review (Day 42):** - **Merge conflict**: This PR is currently not mergeable. Please rebase or resolve conflicts against the target branch. - **Reviewers assigned**: @freemo + @brent.edwards. Metadata otherwise looks good: milestone set (v3.4.0), Type/Feature label present, closes #883.
Owner

Day 43 Review — Rebase Required

This PR has merge conflicts with master. @freemo: Please rebase onto current master at your earliest convenience.

**Day 43 Review — Rebase Required** This PR has merge conflicts with `master`. @freemo: Please rebase onto current `master` at your earliest convenience.
CoreRasurae force-pushed feature/m4-missing-builtin-tools from 8fec3877ab
Some checks failed
CI / lint (pull_request) Successful in 25s
CI / typecheck (pull_request) Successful in 41s
CI / quality (pull_request) Successful in 42s
CI / benchmark-publish (pull_request) Has been skipped
CI / security (pull_request) Successful in 48s
CI / build (pull_request) Successful in 18s
CI / benchmark-regression (pull_request) Failing after 12m39s
CI / coverage (pull_request) Failing after 12m39s
CI / e2e_tests (pull_request) Failing after 12m57s
CI / integration_tests (pull_request) Failing after 13m22s
CI / unit_tests (pull_request) Failing after 13m22s
CI / docker (pull_request) Has been cancelled
to 1d4649d0c3
Some checks failed
CI / build (pull_request) Successful in 17s
CI / integration_tests (pull_request) Failing after 2m40s
CI / lint (pull_request) Successful in 3m17s
CI / quality (pull_request) Successful in 3m41s
CI / typecheck (pull_request) Successful in 3m53s
CI / security (pull_request) Successful in 3m56s
CI / unit_tests (pull_request) Successful in 6m57s
CI / docker (pull_request) Successful in 40s
CI / e2e_tests (pull_request) Successful in 8m25s
CI / coverage (pull_request) Successful in 11m19s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has started running
2026-03-23 23:00:39 +00:00
Compare
freemo approved these changes 2026-03-24 15:29:24 +00:00
Dismissed
freemo left a comment

Review: APPROVED

Feature adding move_file, create_directory, and get_file_info built-in tools. Clean tool implementations following existing patterns.

## Review: APPROVED Feature adding `move_file`, `create_directory`, and `get_file_info` built-in tools. Clean tool implementations following existing patterns.
freemo approved these changes 2026-03-27 17:11:06 +00:00
Dismissed
freemo left a comment

Review: feat(tool): add move_file, create_directory, and get_file_info built-in tools

Approved with comments. Well-implemented tools with thorough test coverage.

Issues to Address

1. _handle_file_info missing FileNotFoundError handling (Medium)
path.stat() will raise a raw FileNotFoundError if the path doesn't exist. The test scenario expects a failure, but the error message won't be user-friendly. Consider an explicit existence check like _handle_file_move does.

2. Unrelated fix(test) commit (Low)
The fix(test): replace shell=True with shell=False commit in cli_coverage_steps.py is unrelated to the feature. Per CONTRIBUTING.md, this should be in a separate commit/PR.

3. Permissions output format (Low)
oct(stat_module.S_IMODE(st.st_mode)) produces "0o644" not "0644". This is valid Python octal but unusual for file permissions display. Consider f"{stat_module.S_IMODE(st.st_mode):04o}".

What's Good

  • Solid path traversal prevention via existing validate_path.
  • _handle_file_move correctly rejects directories as sources, creates parent directories for destination.
  • _detect_encoding reads 8KiB and checks for null bytes — simple and effective.
  • _file_hash fix to skip directories prevents IsADirectoryError for dir-create changesets.
  • 26 BDD scenarios covering happy paths, path traversal, overwrites, and changeset integration.
  • Robot Framework smoke tests and ASV benchmarks included.
## Review: feat(tool): add move_file, create_directory, and get_file_info built-in tools **Approved with comments.** Well-implemented tools with thorough test coverage. ### Issues to Address **1. `_handle_file_info` missing FileNotFoundError handling (Medium)** `path.stat()` will raise a raw `FileNotFoundError` if the path doesn't exist. The test scenario expects a failure, but the error message won't be user-friendly. Consider an explicit existence check like `_handle_file_move` does. **2. Unrelated `fix(test)` commit (Low)** The `fix(test): replace shell=True with shell=False` commit in `cli_coverage_steps.py` is unrelated to the feature. Per CONTRIBUTING.md, this should be in a separate commit/PR. **3. Permissions output format (Low)** `oct(stat_module.S_IMODE(st.st_mode))` produces `"0o644"` not `"0644"`. This is valid Python octal but unusual for file permissions display. Consider `f"{stat_module.S_IMODE(st.st_mode):04o}"`. ### What's Good - Solid path traversal prevention via existing `validate_path`. - `_handle_file_move` correctly rejects directories as sources, creates parent directories for destination. - `_detect_encoding` reads 8KiB and checks for null bytes — simple and effective. - `_file_hash` fix to skip directories prevents `IsADirectoryError` for `dir-create` changesets. - 26 BDD scenarios covering happy paths, path traversal, overwrites, and changeset integration. - Robot Framework smoke tests and ASV benchmarks included.
freemo self-assigned this 2026-04-02 06:15:20 +00:00
freemo force-pushed feature/m4-missing-builtin-tools from 1d4649d0c3
Some checks failed
CI / build (pull_request) Successful in 17s
CI / integration_tests (pull_request) Failing after 2m40s
CI / lint (pull_request) Successful in 3m17s
CI / quality (pull_request) Successful in 3m41s
CI / typecheck (pull_request) Successful in 3m53s
CI / security (pull_request) Successful in 3m56s
CI / unit_tests (pull_request) Successful in 6m57s
CI / docker (pull_request) Successful in 40s
CI / e2e_tests (pull_request) Successful in 8m25s
CI / coverage (pull_request) Successful in 11m19s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has started running
to 07d104f831
Some checks failed
CI / lint (pull_request) Failing after 2s
CI / typecheck (pull_request) Failing after 1s
CI / coverage (pull_request) Has been skipped
CI / security (pull_request) Failing after 2s
CI / quality (pull_request) Failing after 2s
CI / unit_tests (pull_request) Failing after 1s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 2s
CI / e2e_tests (pull_request) Failing after 2s
CI / build (pull_request) Failing after 2s
CI / helm (pull_request) Failing after 1s
CI / status-check (pull_request) Failing after 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has been skipped
2026-04-02 08:36:39 +00:00
Compare
freemo dismissed freemo's review 2026-04-02 08:36:39 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

Owner

🤖 Backlog Groomer (groomer-1): Closing as duplicate of #883.

Issue #883 (feat(tool): add move_file, create_directory, and get_file_info built-in tools) is the canonical version with full labels (MoSCoW/Must have, Priority/High, State/In Review, Type/Feature) and milestone v3.4.0. This issue is an exact title duplicate.

🤖 **Backlog Groomer (groomer-1):** Closing as duplicate of #883. Issue #883 (`feat(tool): add move_file, create_directory, and get_file_info built-in tools`) is the canonical version with full labels (`MoSCoW/Must have`, `Priority/High`, `State/In Review`, `Type/Feature`) and milestone `v3.4.0`. This issue is an exact title duplicate.
freemo closed this pull request 2026-04-02 17:31:01 +00:00
Some checks failed
CI / lint (pull_request) Failing after 2s
Required
Details
CI / typecheck (pull_request) Failing after 1s
Required
Details
CI / coverage (pull_request) Has been skipped
Required
Details
CI / security (pull_request) Failing after 2s
Required
Details
CI / quality (pull_request) Failing after 2s
Required
Details
CI / unit_tests (pull_request) Failing after 1s
Required
Details
CI / docker (pull_request) Has been skipped
Required
Details
CI / integration_tests (pull_request) Failing after 2s
Required
Details
CI / e2e_tests (pull_request) Failing after 2s
CI / build (pull_request) Failing after 2s
Required
Details
CI / helm (pull_request) Failing after 1s
CI / status-check (pull_request) Failing after 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has been skipped

Pull request closed

Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!1070
No description provided.