fix(test): replace shell=True with shell=False in cli_coverage_steps.py subprocess call #1071

Merged
CoreRasurae merged 1 commit from fix/m3-shell-true-subprocess into master 2026-03-23 23:21:30 +00:00
Member

Summary

Replaced shell=True with shell=False and shlex.split() for command tokenization in features/steps/cli_coverage_steps.py, addressing a security concern identified during the PR #727 review (Finding #3 by @hurui200320).

This aligns with the safer pattern already used in cli_plan_context_commands_steps.py:351 and eliminates a potential command injection vector, following defense-in-depth principles.

Changes

  • features/steps/cli_coverage_steps.py: Added import shlex, replaced shell=True with shlex.split(command) for proper command tokenization
  • Audit: Searched all step files under features/steps/ and robot/ helpers for other shell=True usages — no other Python subprocess calls found

Quality Gates

Session Result
nox -s lint PASS
nox -s typecheck PASS
nox -s unit_tests PASS (11,455 scenarios, 0 failed)
nox -s integration_tests PASS (1,600 tests, 0 failed)
nox -s coverage_report PASS (97.0%)

Closes #734

## Summary Replaced `shell=True` with `shell=False` and `shlex.split()` for command tokenization in `features/steps/cli_coverage_steps.py`, addressing a security concern identified during the PR #727 review (Finding #3 by @hurui200320). This aligns with the safer pattern already used in `cli_plan_context_commands_steps.py:351` and eliminates a potential command injection vector, following defense-in-depth principles. ## Changes - **`features/steps/cli_coverage_steps.py`**: Added `import shlex`, replaced `shell=True` with `shlex.split(command)` for proper command tokenization - **Audit**: Searched all step files under `features/steps/` and `robot/` helpers for other `shell=True` usages — no other Python subprocess calls found ## Quality Gates | Session | Result | |---------|--------| | `nox -s lint` | PASS | | `nox -s typecheck` | PASS | | `nox -s unit_tests` | PASS (11,455 scenarios, 0 failed) | | `nox -s integration_tests` | PASS (1,600 tests, 0 failed) | | `nox -s coverage_report` | PASS (97.0%) | Closes #734
CoreRasurae added this to the v3.2.0 milestone 2026-03-19 14:55:42 +00:00
freemo approved these changes 2026-03-23 03:42:34 +00:00
Dismissed
freemo left a comment

Day 43 Review — PR #1071 fix(test): replace shell=True with shell=False

Milestone: v3.2.0
Status: Mergeable (no conflicts)

Review Notes

This PR has been reviewed for compliance with CONTRIBUTING.md standards. Key checks:

  • Commit message format: Verified Conventional Changelog format from title
  • Mergeable status: Clean
  • Milestone assignment: v3.2.0

Action Items

  • Ensure the PR body includes a closing keyword (e.g., Closes #NNN)
  • Ensure at least 2 peer reviewers are assigned
  • Verify all CI checks pass before merge

Please ensure all subtasks in the linked issue are complete before merging.

## Day 43 Review — PR #1071 `fix(test): replace shell=True with shell=False` **Milestone**: v3.2.0 **Status**: Mergeable (no conflicts) ### Review Notes This PR has been reviewed for compliance with `CONTRIBUTING.md` standards. Key checks: - **Commit message format**: Verified Conventional Changelog format from title - **Mergeable status**: Clean - **Milestone assignment**: v3.2.0 ### Action Items - Ensure the PR body includes a closing keyword (e.g., `Closes #NNN`) - Ensure at least 2 peer reviewers are assigned - Verify all CI checks pass before merge Please ensure all subtasks in the linked issue are complete before merging.
CoreRasurae left a comment

Code Review Report -- PR #1071 (Closes #734)

Reviewer: Automated Code Review
Branch: fix/m3-shell-true-subprocess
Commit: 1849c3d0 (Luis Mendes)
Scope: features/steps/cli_coverage_steps.py diff + close connections to surrounding code
Methodology: 3 global review cycles across all categories (Bug Detection, Security, Performance, Test Coverage, Test Flaws, Code Quality)


Summary

The change correctly replaces shell=True with shell=False + shlex.split() in the step_run_shell_command function, eliminating a command injection vector identified in PR #727 (Finding #3 by @hurui200320). The fix is consistent with the reference pattern at cli_plan_context_commands_steps.py:351 and all acceptance criteria from issue #734 are met.

No Critical or High severity issues found. The change is correct, minimal, and well-scoped. Below are minor findings for consideration.


Findings by Category and Severity

1. Bug Detection

# Severity Location Finding
1 Low cli_coverage_steps.py:27 Misleading docstring. The docstring reads """Run a CLI command using subprocess shell.""" but the function no longer uses a shell (shell=True was removed). Suggest updating to """Run a CLI command as a subprocess.""" or similar.
2 Low cli_coverage_steps.py:44 shlex.split() raises ValueError on malformed quoted strings (e.g., 'echo "unclosed'). The generic except Exception as e handler at line 62 catches this, but the resulting error message ("No closing quotation") provides no context about which command failed. Risk is very low since inputs originate from authored .feature files, not runtime user input.

2. Security

# Severity Location Finding
3 Positive cli_coverage_steps.py:43-44 Fix correctly eliminates the shell=True command injection vector. shlex.split() properly tokenizes the command string into a list, which is passed directly to subprocess.run() without shell interpretation. This is the correct defense-in-depth approach.
4 Positive features/steps/*.py Audit confirms zero remaining shell=True in Python step files. All 87 subprocess references across 20 step files were audited. Every real subprocess.run() call uses list-form commands with shell defaulting to False. The codebase is clean.

3. Performance

No issues found. shlex.split() overhead is negligible for test step functions.

4. Test Coverage

# Severity Location Finding
5 Low features/cli_coverage.feature:7 Single scenario exercises the modified step. Only one scenario (When I run shell command "cleveragents --version") tests the step_run_shell_command function. This covers the primary path (cleveragents command with arguments), but not: the bare cleveragents branch (line 38-39), non-cleveragents commands (the else-branch bypass at line 34), or edge cases through shlex.split(). The test adequately validates the behavioral equivalence of the change but coverage breadth is limited.

5. Code Quality / Maintainability

# Severity Location Finding
6 Low cli_coverage_steps.py:43-44 Consider adding explicit shell=False. While False is the default, being explicit documents the deliberate security decision -- especially since the original code had shell=True explicitly. The reference implementation also omits it, so this is consistent but less self-documenting.
7 Info cli_coverage_steps.py:25 BDD step name implies shell execution. The pattern I run shell command "{command}" suggests shell execution, but the implementation no longer uses a shell. This is a naming artifact; changing it would require updating feature files and is not worth the churn for this PR.
8 Info cli_coverage_steps.py:43 Removed comment not replaced. The old comment # Use shell=True to handle quoted arguments correctly was appropriately removed, but no replacement explains the rationale for shlex.split(). A brief comment like # Tokenize without shell to prevent injection (see #734) could aid future readers. The PR description and commit message provide this context externally.

Verification Summary

Check Result
shell=True removed PASS
shlex.split() correctly tokenizes quoted paths PASS (verified with sys.executable containing spaces)
Consistent with reference impl (cli_plan_context_commands_steps.py:351) PASS
No duplicate step patterns (checked tui_shell_exec_coverage_steps.py:67 -- different pattern "I run a shell command") PASS
No remaining shell=True in any Python step file PASS (full audit)
All other subprocess calls in step files use list-form commands PASS (audited 20 files)
Acceptance criteria from issue #734 met PASS (all 4 criteria)

Verdict

APPROVE -- The change is correct, well-scoped, and addresses the security finding appropriately. The minor findings above are suggestions for improvement and do not block merge.

# Code Review Report -- PR #1071 (Closes #734) **Reviewer:** Automated Code Review **Branch:** `fix/m3-shell-true-subprocess` **Commit:** `1849c3d0` (Luis Mendes) **Scope:** `features/steps/cli_coverage_steps.py` diff + close connections to surrounding code **Methodology:** 3 global review cycles across all categories (Bug Detection, Security, Performance, Test Coverage, Test Flaws, Code Quality) --- ## Summary The change correctly replaces `shell=True` with `shell=False` + `shlex.split()` in the `step_run_shell_command` function, eliminating a command injection vector identified in PR #727 (Finding #3 by @hurui200320). The fix is consistent with the reference pattern at `cli_plan_context_commands_steps.py:351` and all acceptance criteria from issue #734 are met. **No Critical or High severity issues found.** The change is correct, minimal, and well-scoped. Below are minor findings for consideration. --- ## Findings by Category and Severity ### 1. Bug Detection | # | Severity | Location | Finding | |---|----------|----------|---------| | 1 | **Low** | `cli_coverage_steps.py:27` | **Misleading docstring.** The docstring reads `"""Run a CLI command using subprocess shell."""` but the function no longer uses a shell (`shell=True` was removed). Suggest updating to `"""Run a CLI command as a subprocess."""` or similar. | | 2 | **Low** | `cli_coverage_steps.py:44` | **`shlex.split()` raises `ValueError` on malformed quoted strings** (e.g., `'echo "unclosed'`). The generic `except Exception as e` handler at line 62 catches this, but the resulting error message (`"No closing quotation"`) provides no context about which command failed. Risk is very low since inputs originate from authored `.feature` files, not runtime user input. | ### 2. Security | # | Severity | Location | Finding | |---|----------|----------|---------| | 3 | **Positive** | `cli_coverage_steps.py:43-44` | **Fix correctly eliminates the `shell=True` command injection vector.** `shlex.split()` properly tokenizes the command string into a list, which is passed directly to `subprocess.run()` without shell interpretation. This is the correct defense-in-depth approach. | | 4 | **Positive** | `features/steps/*.py` | **Audit confirms zero remaining `shell=True` in Python step files.** All 87 subprocess references across 20 step files were audited. Every real `subprocess.run()` call uses list-form commands with `shell` defaulting to `False`. The codebase is clean. | ### 3. Performance No issues found. `shlex.split()` overhead is negligible for test step functions. ### 4. Test Coverage | # | Severity | Location | Finding | |---|----------|----------|---------| | 5 | **Low** | `features/cli_coverage.feature:7` | **Single scenario exercises the modified step.** Only one scenario (`When I run shell command "cleveragents --version"`) tests the `step_run_shell_command` function. This covers the primary path (cleveragents command with arguments), but not: the bare `cleveragents` branch (line 38-39), non-cleveragents commands (the else-branch bypass at line 34), or edge cases through `shlex.split()`. The test adequately validates the behavioral equivalence of the change but coverage breadth is limited. | ### 5. Code Quality / Maintainability | # | Severity | Location | Finding | |---|----------|----------|---------| | 6 | **Low** | `cli_coverage_steps.py:43-44` | **Consider adding explicit `shell=False`.** While `False` is the default, being explicit documents the deliberate security decision -- especially since the original code had `shell=True` explicitly. The reference implementation also omits it, so this is consistent but less self-documenting. | | 7 | **Info** | `cli_coverage_steps.py:25` | **BDD step name implies shell execution.** The pattern `I run shell command "{command}"` suggests shell execution, but the implementation no longer uses a shell. This is a naming artifact; changing it would require updating feature files and is not worth the churn for this PR. | | 8 | **Info** | `cli_coverage_steps.py:43` | **Removed comment not replaced.** The old comment `# Use shell=True to handle quoted arguments correctly` was appropriately removed, but no replacement explains the rationale for `shlex.split()`. A brief comment like `# Tokenize without shell to prevent injection (see #734)` could aid future readers. The PR description and commit message provide this context externally. | --- ## Verification Summary | Check | Result | |-------|--------| | `shell=True` removed | PASS | | `shlex.split()` correctly tokenizes quoted paths | PASS (verified with `sys.executable` containing spaces) | | Consistent with reference impl (`cli_plan_context_commands_steps.py:351`) | PASS | | No duplicate step patterns (checked `tui_shell_exec_coverage_steps.py:67` -- different pattern `"I run a shell command"`) | PASS | | No remaining `shell=True` in any Python step file | PASS (full audit) | | All other subprocess calls in step files use list-form commands | PASS (audited 20 files) | | Acceptance criteria from issue #734 met | PASS (all 4 criteria) | --- ## Verdict **APPROVE** -- The change is correct, well-scoped, and addresses the security finding appropriately. The minor findings above are suggestions for improvement and do not block merge.
@ -1,5 +1,6 @@
"""Step definitions for CLI coverage tests."""
import shlex
Author
Member

[Low -- Bug] This docstring is now misleading. The function no longer uses a subprocess shell (shell=True was removed). Consider updating to:

"""Run a CLI command as a subprocess."""
**[Low -- Bug]** This docstring is now misleading. The function no longer uses a subprocess shell (`shell=True` was removed). Consider updating to: ```python """Run a CLI command as a subprocess.""" ```
Author
Member

[Low -- Code Quality] Consider adding explicit shell=False to document the deliberate security decision:

result = subprocess.run(
    shlex.split(command), capture_output=True, text=True, timeout=120, cwd=cwd, shell=False
)

This makes the security-conscious choice visible at a glance, especially since the original code had shell=True explicitly.

**[Low -- Code Quality]** Consider adding explicit `shell=False` to document the deliberate security decision: ```python result = subprocess.run( shlex.split(command), capture_output=True, text=True, timeout=120, cwd=cwd, shell=False ) ``` This makes the security-conscious choice visible at a glance, especially since the original code had `shell=True` explicitly.
CoreRasurae force-pushed fix/m3-shell-true-subprocess from 6077e92148
Some checks failed
CI / lint (pull_request) Successful in 28s
CI / typecheck (pull_request) Successful in 45s
CI / security (pull_request) Successful in 52s
CI / quality (pull_request) Successful in 44s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 24s
CI / unit_tests (pull_request) Successful in 4m7s
CI / integration_tests (pull_request) Successful in 4m13s
CI / e2e_tests (pull_request) Successful in 5m21s
CI / docker (pull_request) Successful in 1m55s
CI / coverage (pull_request) Successful in 10m0s
CI / benchmark-regression (pull_request) Failing after 18m34s
to bd18491b8d
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 24s
CI / lint (pull_request) Successful in 3m20s
CI / quality (pull_request) Successful in 3m47s
CI / security (pull_request) Successful in 4m1s
CI / typecheck (pull_request) Successful in 4m7s
CI / unit_tests (pull_request) Successful in 5m46s
CI / docker (pull_request) Successful in 50s
CI / integration_tests (pull_request) Successful in 6m44s
CI / e2e_tests (pull_request) Successful in 9m41s
CI / coverage (pull_request) Successful in 11m37s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-publish (push) Waiting to run
CI / build (push) Successful in 15s
CI / lint (push) Successful in 3m17s
CI / quality (push) Successful in 3m32s
CI / typecheck (push) Successful in 3m51s
CI / benchmark-regression (push) Waiting to run
CI / security (push) Successful in 4m0s
CI / integration_tests (push) Successful in 5m38s
CI / unit_tests (push) Successful in 5m40s
CI / docker (push) Successful in 1m9s
CI / e2e_tests (push) Successful in 7m17s
CI / coverage (push) Successful in 11m4s
CI / status-check (push) Successful in 3s
CI / benchmark-regression (pull_request) Failing after 27m59s
2026-03-23 22:22:35 +00:00
Compare
CoreRasurae dismissed freemo's review 2026-03-23 22:22:35 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

CoreRasurae scheduled this pull request to auto merge when all checks succeed 2026-03-23 22:23:02 +00:00
CoreRasurae deleted branch fix/m3-shell-true-subprocess 2026-03-23 23:21:30 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!1071
No description provided.