fix(test): replace shell=True with shell=False in cli_coverage_steps.py subprocess call #1071

CoreRasurae · 2026-03-19T14:55:18Z

CoreRasurae commented

2026-03-19 14:55:18 +00:00

Summary

Replaced shell=True with shell=False and shlex.split() for command tokenization in features/steps/cli_coverage_steps.py, addressing a security concern identified during the PR #727 review (Finding #3 by @hurui200320).

This aligns with the safer pattern already used in cli_plan_context_commands_steps.py:351 and eliminates a potential command injection vector, following defense-in-depth principles.

Changes

features/steps/cli_coverage_steps.py: Added import shlex, replaced shell=True with shlex.split(command) for proper command tokenization
Audit: Searched all step files under features/steps/ and robot/ helpers for other shell=True usages — no other Python subprocess calls found

Quality Gates

Session	Result
`nox -s lint`	PASS
`nox -s typecheck`	PASS
`nox -s unit_tests`	PASS (11,455 scenarios, 0 failed)
`nox -s integration_tests`	PASS (1,600 tests, 0 failed)
`nox -s coverage_report`	PASS (97.0%)

Closes #734

## Summary Replaced `shell=True` with `shell=False` and `shlex.split()` for command tokenization in `features/steps/cli_coverage_steps.py`, addressing a security concern identified during the PR #727 review (Finding #3 by @hurui200320). This aligns with the safer pattern already used in `cli_plan_context_commands_steps.py:351` and eliminates a potential command injection vector, following defense-in-depth principles. ## Changes - **`features/steps/cli_coverage_steps.py`**: Added `import shlex`, replaced `shell=True` with `shlex.split(command)` for proper command tokenization - **Audit**: Searched all step files under `features/steps/` and `robot/` helpers for other `shell=True` usages — no other Python subprocess calls found ## Quality Gates | Session | Result | |---------|--------| | `nox -s lint` | PASS | | `nox -s typecheck` | PASS | | `nox -s unit_tests` | PASS (11,455 scenarios, 0 failed) | | `nox -s integration_tests` | PASS (1,600 tests, 0 failed) | | `nox -s coverage_report` | PASS (97.0%) | Closes #734

CoreRasurae added this to the v3.2.0 milestone

2026-03-19 14:55:42 +00:00

CoreRasurae added the

Type

Task

label

2026-03-19 14:55:44 +00:00

CoreRasurae referenced this pull request

2026-03-19 14:57:04 +00:00

fix(test): replace shell=True with shell=False in cli_coverage_steps.py subprocess call #734

freemo

2026-03-22 16:35:35 +00:00

added the
Priority
Medium

State
In Review
labels
requested reviews from freemo, brent.edwards

freemo approved these changes

2026-03-23 03:42:34 +00:00

Dismissed

freemo left a comment

Day 43 Review — PR #1071 `fix(test): replace shell=True with shell=False`

Milestone: v3.2.0
Status: Mergeable (no conflicts)

Review Notes

This PR has been reviewed for compliance with CONTRIBUTING.md standards. Key checks:

Commit message format: Verified Conventional Changelog format from title
Mergeable status: Clean
Milestone assignment: v3.2.0

Action Items

Ensure the PR body includes a closing keyword (e.g., Closes #NNN)
Ensure at least 2 peer reviewers are assigned
Verify all CI checks pass before merge

Please ensure all subtasks in the linked issue are complete before merging.

## Day 43 Review — PR #1071 `fix(test): replace shell=True with shell=False` **Milestone**: v3.2.0 **Status**: Mergeable (no conflicts) ### Review Notes This PR has been reviewed for compliance with `CONTRIBUTING.md` standards. Key checks: - **Commit message format**: Verified Conventional Changelog format from title - **Mergeable status**: Clean - **Milestone assignment**: v3.2.0 ### Action Items - Ensure the PR body includes a closing keyword (e.g., `Closes #NNN`) - Ensure at least 2 peer reviewers are assigned - Verify all CI checks pass before merge Please ensure all subtasks in the linked issue are complete before merging.

CoreRasurae reviewed

2026-03-23 22:14:17 +00:00

CoreRasurae left a comment

Code Review Report -- PR #1071 (Closes #734)

Reviewer: Automated Code Review
Branch: fix/m3-shell-true-subprocess
Commit: 1849c3d0 (Luis Mendes)
Scope: features/steps/cli_coverage_steps.py diff + close connections to surrounding code
Methodology: 3 global review cycles across all categories (Bug Detection, Security, Performance, Test Coverage, Test Flaws, Code Quality)

Summary

The change correctly replaces shell=True with shell=False + shlex.split() in the step_run_shell_command function, eliminating a command injection vector identified in PR #727 (Finding #3 by @hurui200320). The fix is consistent with the reference pattern at cli_plan_context_commands_steps.py:351 and all acceptance criteria from issue #734 are met.

No Critical or High severity issues found. The change is correct, minimal, and well-scoped. Below are minor findings for consideration.

Findings by Category and Severity

1. Bug Detection

#	Severity	Location	Finding
1	Low	`cli_coverage_steps.py:27`	Misleading docstring. The docstring reads `"""Run a CLI command using subprocess shell."""` but the function no longer uses a shell (`shell=True` was removed). Suggest updating to `"""Run a CLI command as a subprocess."""` or similar.
2	Low	`cli_coverage_steps.py:44`	`shlex.split()` raises `ValueError` on malformed quoted strings (e.g., `'echo "unclosed'`). The generic `except Exception as e` handler at line 62 catches this, but the resulting error message (`"No closing quotation"`) provides no context about which command failed. Risk is very low since inputs originate from authored `.feature` files, not runtime user input.

2. Security

#	Severity	Location	Finding
3	Positive	`cli_coverage_steps.py:43-44`	Fix correctly eliminates the `shell=True` command injection vector. `shlex.split()` properly tokenizes the command string into a list, which is passed directly to `subprocess.run()` without shell interpretation. This is the correct defense-in-depth approach.
4	Positive	`features/steps/*.py`	Audit confirms zero remaining `shell=True` in Python step files. All 87 subprocess references across 20 step files were audited. Every real `subprocess.run()` call uses list-form commands with `shell` defaulting to `False`. The codebase is clean.

3. Performance

No issues found. shlex.split() overhead is negligible for test step functions.

4. Test Coverage

#	Severity	Location	Finding
5	Low	`features/cli_coverage.feature:7`	Single scenario exercises the modified step. Only one scenario (`When I run shell command "cleveragents --version"`) tests the `step_run_shell_command` function. This covers the primary path (cleveragents command with arguments), but not: the bare `cleveragents` branch (line 38-39), non-cleveragents commands (the else-branch bypass at line 34), or edge cases through `shlex.split()`. The test adequately validates the behavioral equivalence of the change but coverage breadth is limited.

5. Code Quality / Maintainability

#	Severity	Location	Finding
6	Low	`cli_coverage_steps.py:43-44`	Consider adding explicit `shell=False`. While `False` is the default, being explicit documents the deliberate security decision -- especially since the original code had `shell=True` explicitly. The reference implementation also omits it, so this is consistent but less self-documenting.
7	Info	`cli_coverage_steps.py:25`	BDD step name implies shell execution. The pattern `I run shell command "{command}"` suggests shell execution, but the implementation no longer uses a shell. This is a naming artifact; changing it would require updating feature files and is not worth the churn for this PR.
8	Info	`cli_coverage_steps.py:43`	Removed comment not replaced. The old comment `# Use shell=True to handle quoted arguments correctly` was appropriately removed, but no replacement explains the rationale for `shlex.split()`. A brief comment like `# Tokenize without shell to prevent injection (see #734)` could aid future readers. The PR description and commit message provide this context externally.

Verification Summary

Check	Result
`shell=True` removed	PASS
`shlex.split()` correctly tokenizes quoted paths	PASS (verified with `sys.executable` containing spaces)
Consistent with reference impl (`cli_plan_context_commands_steps.py:351`)	PASS
No duplicate step patterns (checked `tui_shell_exec_coverage_steps.py:67` -- different pattern `"I run a shell command"`)	PASS
No remaining `shell=True` in any Python step file	PASS (full audit)
All other subprocess calls in step files use list-form commands	PASS (audited 20 files)
Acceptance criteria from issue #734 met	PASS (all 4 criteria)

Verdict

APPROVE -- The change is correct, well-scoped, and addresses the security finding appropriately. The minor findings above are suggestions for improvement and do not block merge.

# Code Review Report -- PR #1071 (Closes #734) **Reviewer:** Automated Code Review **Branch:** `fix/m3-shell-true-subprocess` **Commit:** `1849c3d0` (Luis Mendes) **Scope:** `features/steps/cli_coverage_steps.py` diff + close connections to surrounding code **Methodology:** 3 global review cycles across all categories (Bug Detection, Security, Performance, Test Coverage, Test Flaws, Code Quality) --- ## Summary The change correctly replaces `shell=True` with `shell=False` + `shlex.split()` in the `step_run_shell_command` function, eliminating a command injection vector identified in PR #727 (Finding #3 by @hurui200320). The fix is consistent with the reference pattern at `cli_plan_context_commands_steps.py:351` and all acceptance criteria from issue #734 are met. **No Critical or High severity issues found.** The change is correct, minimal, and well-scoped. Below are minor findings for consideration. --- ## Findings by Category and Severity ### 1. Bug Detection | # | Severity | Location | Finding | |---|----------|----------|---------| | 1 | **Low** | `cli_coverage_steps.py:27` | **Misleading docstring.** The docstring reads `"""Run a CLI command using subprocess shell."""` but the function no longer uses a shell (`shell=True` was removed). Suggest updating to `"""Run a CLI command as a subprocess."""` or similar. | | 2 | **Low** | `cli_coverage_steps.py:44` | **`shlex.split()` raises `ValueError` on malformed quoted strings** (e.g., `'echo "unclosed'`). The generic `except Exception as e` handler at line 62 catches this, but the resulting error message (`"No closing quotation"`) provides no context about which command failed. Risk is very low since inputs originate from authored `.feature` files, not runtime user input. | ### 2. Security | # | Severity | Location | Finding | |---|----------|----------|---------| | 3 | **Positive** | `cli_coverage_steps.py:43-44` | **Fix correctly eliminates the `shell=True` command injection vector.** `shlex.split()` properly tokenizes the command string into a list, which is passed directly to `subprocess.run()` without shell interpretation. This is the correct defense-in-depth approach. | | 4 | **Positive** | `features/steps/*.py` | **Audit confirms zero remaining `shell=True` in Python step files.** All 87 subprocess references across 20 step files were audited. Every real `subprocess.run()` call uses list-form commands with `shell` defaulting to `False`. The codebase is clean. | ### 3. Performance No issues found. `shlex.split()` overhead is negligible for test step functions. ### 4. Test Coverage | # | Severity | Location | Finding | |---|----------|----------|---------| | 5 | **Low** | `features/cli_coverage.feature:7` | **Single scenario exercises the modified step.** Only one scenario (`When I run shell command "cleveragents --version"`) tests the `step_run_shell_command` function. This covers the primary path (cleveragents command with arguments), but not: the bare `cleveragents` branch (line 38-39), non-cleveragents commands (the else-branch bypass at line 34), or edge cases through `shlex.split()`. The test adequately validates the behavioral equivalence of the change but coverage breadth is limited. | ### 5. Code Quality / Maintainability | # | Severity | Location | Finding | |---|----------|----------|---------| | 6 | **Low** | `cli_coverage_steps.py:43-44` | **Consider adding explicit `shell=False`.** While `False` is the default, being explicit documents the deliberate security decision -- especially since the original code had `shell=True` explicitly. The reference implementation also omits it, so this is consistent but less self-documenting. | | 7 | **Info** | `cli_coverage_steps.py:25` | **BDD step name implies shell execution.** The pattern `I run shell command "{command}"` suggests shell execution, but the implementation no longer uses a shell. This is a naming artifact; changing it would require updating feature files and is not worth the churn for this PR. | | 8 | **Info** | `cli_coverage_steps.py:43` | **Removed comment not replaced.** The old comment `# Use shell=True to handle quoted arguments correctly` was appropriately removed, but no replacement explains the rationale for `shlex.split()`. A brief comment like `# Tokenize without shell to prevent injection (see #734)` could aid future readers. The PR description and commit message provide this context externally. | --- ## Verification Summary | Check | Result | |-------|--------| | `shell=True` removed | PASS | | `shlex.split()` correctly tokenizes quoted paths | PASS (verified with `sys.executable` containing spaces) | | Consistent with reference impl (`cli_plan_context_commands_steps.py:351`) | PASS | | No duplicate step patterns (checked `tui_shell_exec_coverage_steps.py:67` -- different pattern `"I run a shell command"`) | PASS | | No remaining `shell=True` in any Python step file | PASS (full audit) | | All other subprocess calls in step files use list-form commands | PASS (audited 20 files) | | Acceptance criteria from issue #734 met | PASS (all 4 criteria) | --- ## Verdict **APPROVE** -- The change is correct, well-scoped, and addresses the security finding appropriately. The minor findings above are suggestions for improvement and do not block merge.

features/steps/cli_coverage_steps.py Outdated

					
				@ -1,5 +1,6 @@

				"""Step definitions for CLI coverage tests."""

				import shlex

[Low -- Bug] This docstring is now misleading. The function no longer uses a subprocess shell (shell=True was removed). Consider updating to:

"""Run a CLI command as a subprocess."""

**[Low -- Bug]** This docstring is now misleading. The function no longer uses a subprocess shell (`shell=True` was removed). Consider updating to: ```python """Run a CLI command as a subprocess.""" ```

features/steps/cli_coverage_steps.py

[Low -- Code Quality] Consider adding explicit shell=False to document the deliberate security decision:

result = subprocess.run(
    shlex.split(command), capture_output=True, text=True, timeout=120, cwd=cwd, shell=False
)

This makes the security-conscious choice visible at a glance, especially since the original code had shell=True explicitly.

**[Low -- Code Quality]** Consider adding explicit `shell=False` to document the deliberate security decision: ```python result = subprocess.run( shlex.split(command), capture_output=True, text=True, timeout=120, cwd=cwd, shell=False ) ``` This makes the security-conscious choice visible at a glance, especially since the original code had `shell=True` explicitly.

CoreRasurae force-pushed fix/m3-shell-true-subprocess from 6077e92148

CI / lint (pull_request) Successful in 28s

Details

CI / typecheck (pull_request) Successful in 45s

Details

CI / security (pull_request) Successful in 52s

Details

CI / quality (pull_request) Successful in 44s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 24s

Details

CI / unit_tests (pull_request) Successful in 4m7s

Details

CI / integration_tests (pull_request) Successful in 4m13s

Details

CI / e2e_tests (pull_request) Successful in 5m21s

Details

CI / docker (pull_request) Successful in 1m55s

Details

CI / coverage (pull_request) Successful in 10m0s

Details

CI / benchmark-regression (pull_request) Failing after 18m34s

Details

to bd18491b8d

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 24s

Details

CI / lint (pull_request) Successful in 3m20s

Details

CI / quality (pull_request) Successful in 3m47s

Details

CI / security (pull_request) Successful in 4m1s

Details

CI / typecheck (pull_request) Successful in 4m7s

Details

CI / unit_tests (pull_request) Successful in 5m46s

Details

CI / docker (pull_request) Successful in 50s

Details

CI / integration_tests (pull_request) Successful in 6m44s

Details

CI / e2e_tests (pull_request) Successful in 9m41s

Details

CI / coverage (pull_request) Successful in 11m37s

Details

CI / status-check (pull_request) Successful in 1s

Details

CI / benchmark-publish (push) Waiting to run

Details

CI / build (push) Successful in 15s

Details

CI / lint (push) Successful in 3m17s

Details

CI / quality (push) Successful in 3m32s

Details

CI / typecheck (push) Successful in 3m51s

Details

CI / benchmark-regression (push) Waiting to run

Details

CI / security (push) Successful in 4m0s

Details

CI / integration_tests (push) Successful in 5m38s

Details

CI / unit_tests (push) Successful in 5m40s

Details

CI / docker (push) Successful in 1m9s

Details

CI / e2e_tests (push) Successful in 7m17s

Details

CI / coverage (push) Successful in 11m4s

Details

CI / status-check (push) Successful in 3s

Details

CI / benchmark-regression (pull_request) Failing after 27m59s

Details

2026-03-23 22:22:35 +00:00

Compare

CoreRasurae dismissed freemo's review

2026-03-23 22:22:35 +00:00

Reason:

New commits pushed, approval review dismissed automatically according to repository settings

CoreRasurae scheduled this pull request to auto merge when all checks succeed

2026-03-23 22:23:02 +00:00

CoreRasurae merged commit bd18491b8d into master

2026-03-23 23:21:30 +00:00

CoreRasurae deleted branch fix/m3-shell-true-subprocess

2026-03-23 23:21:30 +00:00

Sign in to join this conversation.

No reviewers

No labels

auto/needs-reevaluation

auto/claimed-implementer

auto/claimed-merge

auto/claimed-reviewer

auto/driver-down

auto/invariant-violation

auto/last-attempt-tier-0

auto/last-attempt-tier-1

auto/last-attempt-tier-2

auto/last-attempt-tier-min

Automation Tracking

auto/needs-conflict-resolution

auto/needs-implementer

auto/postmortem

auto/ready-to-merge

auto/restart-throttled

auto/revert

auto/sentinel

auto/stale-inactivity

Signed-off: Scrum Master

Signed-off: Tech Lead

No milestone

No project

No assignees

2 participants

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

cleveragents/cleveragents-core!1071

Rows
Columns

fix(test): replace shell=True with shell=False in cli_coverage_steps.py subprocess call #1071

Summary

Changes

Quality Gates

Day 43 Review — PR #1071 fix(test): replace shell=True with shell=False

Review Notes

Action Items

Code Review Report -- PR #1071 (Closes #734)

Summary

Findings by Category and Severity

1. Bug Detection

2. Security

3. Performance

4. Test Coverage

5. Code Quality / Maintainability

Verification Summary

Verdict

Day 43 Review — PR #1071 `fix(test): replace shell=True with shell=False`